I recently made an interesting observation while monitoring database performance on an Oracle Enterprise Edition system. While looking at some ASH data (for which you must be licensed appropriately!) I came across direct path reads attributed to a select statement performing a primary key lookup. At first, this didn’t make much sense to me, but it’s actually intended behaviour and not a bug.
In this post I’m reproducing what I observed. I am using Oracle 18.4.0 for this experiment running on my Linux lab environment. The hardware uses 1s8c16t with 64 GB of memory.
Direct Path Reads are multi-block reads you often find in decision support systems when a query is going over large amounts of data. They are un-buffered, as in they use the reading session’s private memory, not the buffer cache. Not exactly what you would expect with a primary key lookup if the index was used. It should be more like this tkprof’d SQL trace:
SQL ID: b5dxjj3wm4yz8 Plan Hash: 4043159647
select *
from
orders where order_id = 100
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 378 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 4 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.01 0.01 0 382 0 1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 82
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
1 1 1 TABLE ACCESS BY INDEX ROWID ORDERS (cr=4 pr=0 pw=0 time=32 us starts=1 cost=3 size=95 card=1)
1 1 1 INDEX UNIQUE SCAN ORDER_PK (cr=3 pr=0 pw=0 time=22 us starts=1 cost=2 size=0 card=1)(object id 81853)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
SQL Net message to client 2 0.00 0.00
SQL Net message from client 2 11.40 11.40
No physical I/O anywhere to be seen.
During my original investigation I noticed that direct path reads were attributed to a table containing a CLOB column. The query was something along the lines of “select * from table where primaryKeyColumn = :bindVariable”. The primaryKeyColumn was defined as a number. This should be easy to recreate.
The test case
Based on the Swingbench “ORDERS” table I created a new table using the following DDL:
CREATE TABLE martin.orders_clob (
order_id NUMBER(12,0)
NOT NULL ENABLE,
order_clob CLOB
NOT NULL ENABLE,
order_date TIMESTAMP(6) WITH LOCAL TIME ZONE
NOT NULL ENABLE,
order_mode VARCHAR2(8 BYTE),
customer_id NUMBER(12,0)
NOT NULL ENABLE,
order_status NUMBER(2,0),
order_total NUMBER(8,2),
sales_rep_id NUMBER(6,0),
promotion_id NUMBER(6,0),
warehouse_id NUMBER(6,0),
delivery_type VARCHAR2(15 BYTE),
cost_of_delivery NUMBER(6,0),
wait_till_all_available VARCHAR2(15 BYTE),
delivery_address_id NUMBER(12,0),
customer_class VARCHAR2(30 BYTE),
card_id NUMBER(12,0),
invoice_address_id NUMBER(12,0),
constraint pk_orders_clob primary key ( order_id )
) tablespace bigfile_tbs
lob ( order_clob ) store as securefile (enable storage in row);
Please note that the clob is stored in row.
Next I inserted a large number of rows into the table, based again on soe.orders:
INSERT/*+ enable_parallel_dml append parallel(4) */ INTO martin.orders_clob
SELECT /*+ parallel(4) */
order_id,
case
when mod(rownum,10) = 0 then rpad('X',5000,'Y')
else rpad('Y', 50, 'Z')
end,
order_date,
order_mode,
customer_id,
order_status,
order_total,
sales_rep_id,
promotion_id,
warehouse_id,
delivery_type,
cost_of_delivery,
wait_till_all_available,
delivery_address_id,
customer_class,
card_id,
invoice_address_id
FROM
soe.orders
WHERE
ROWNUM <= 1E6;
The case statement ensures that every 10th row has a lob exceeding the size limit of an inline LOB. Apparently that’s 4000 bytes minus a little bit of overhead. As a result the LOB data should be stored outside the table.
Load!
I am now generating some load against the system, being careful not to overload my lab system. I limited myself to 8 sessions:
SQL> select count(*) from v$session where username = 'MARTIN' and program like 'JDBC%';
COUNT(*)
----------
8
After a few minutes I can see these direct path reads show up. This is really easy if you have the right tools – I use Tanel Poder’s scripts a lot. One particularly useful script is ashtop.sql (remember the license thing again!) that you can see here:
SQL> @ashtop sql_id,event,session_state username='MARTIN' "to_date('19.12.2018 15:35:00', 'dd.mm.yyyy hh24:mi:ss')" "to_date('19.12.2018 15:45:00','dd.mm.yyyy hh24:mi:ss')"
Total
Seconds AAS %This SQL_ID EVENT SESSION FIRST_SEEN LAST_SEEN DIST_SQLEXEC_SEEN
--------- ------- ------- ------------- ---------------------------------------- ------- ------------------- ------------------- -----------------
1380 2.3 74% | 7hth4y8d9h7q8 ON CPU 2018-12-19 15:35:02 2018-12-19 15:44:58 1232
225 .4 12% | ON CPU 2018-12-19 15:35:04 2018-12-19 15:44:59 2
175 .3 9% | 7hth4y8d9h7q8 library cache: mutex X WAITING 2018-12-19 15:35:15 2018-12-19 15:44:49 175
65 .1 3% | 7hth4y8d9h7q8 direct path read WAITING 2018-12-19 15:35:25 2018-12-19 15:44:41 65
8 .0 0% | 7hth4y8d9h7q8 SQL*Net more data to client WAITING 2018-12-19 15:38:52 2018-12-19 15:44:29 8
3 .0 0% | 7hth4y8d9h7q8 cursor: pin S WAITING 2018-12-19 15:36:59 2018-12-19 15:42:08 1
3 .0 0% | cursor: pin S WAITING 2018-12-19 15:35:57 2018-12-19 15:42:08 1
7 rows selected.
I’ll also show you the execution plan to confirm I’m not actually performing an unintentional full table scan:
SQL> select * from dbms_xplan.display_cursor('7hth4y8d9h7q8');
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
SQL_ID 7hth4y8d9h7q8, child number 0
-------------------------------------
select /* CLOB */ * from martin.orders_clob where order_id =
trunc(dbms_random.value(1,1000000))
Plan hash value: 3284193906
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
| 1 | TABLE ACCESS BY INDEX ROWID| ORDERS_CLOB | 1 | 227 | 3 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | PK_ORDERS_CLOB | 1 | | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ORDER_ID"=TRUNC("DBMS_RANDOM"."VALUE"(1,1000000)))
20 rows selected.
I didn’t want to use Java’s random method generator, hence the call to dbms_random.value. So there you have it: direct path reads when performing index lookups.
Thanks and big shout out to Tanel Poder for his scripts, they are awesome.