So I'll blog about my steps diagnosing and resolving this issue.
{ I've changed table names because they are not important for this example }
Oracle 9.2.0.6.0
Solaris 64
SQL> select count(*) from user1.table1; COUNT(*) ---------- 597259 SQL> select count(*) from user1.table2; COUNT(*) ---------- 2609503 SQL> select count(*) from user2.table3@rm_db; COUNT(*) ---------- 538512 SQL> select count(*) from user1.table4; COUNT(*) ---------- 93
First to check initial query and what was explain plan telling me.
SQL> explain plan for 2 SELECT 3 DISTINCT col1, 0, 0, TRUNC (SYSDATE - 1), t1.c_id 4 FROM user1.table1 t1, 5 user1.table2 t2, 6 user2.table3@rm_db rt 7 WHERE 1 = 1 8 AND t1.c_t = 'ABC' 9 AND t1.c_id = t2.con_id 10 AND t2.status = 'XY' 11 AND ph NOT IN (SELECT ph 12 FROM table4) 13 AND t1.c_nr = rt.s_cr 14 AND TRUNC (rt.fut) = TRUNC (SYSDATE - 1); Explained.
SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Inst |IN-OUT| ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2 | 340 | 8535 | | | | 1 | SORT UNIQUE | | 2 | 340 | 8523 | | | | 2 | FILTER | | | | | | | | 3 | TABLE ACCESS BY INDEX ROWID| TABLE2 | 1 | 12 | 8 | | | | 4 | NESTED LOOPS | | 2 | 340 | 8522 | | | | 5 | NESTED LOOPS | | 2 | 316 | 8507 | | | | 6 | TABLE ACCESS FULL | TABLE1 | 157 | 6751 | 8013 | | | | 7 | REMOTE | | 1 | 115 | 4 | DW_DB | R->S | | 8 | INDEX RANGE SCAN | IDX_C | 6 | | 3 | | | | 9 | TABLE ACCESS FULL | TABLE4 | 20 | 540 | 6 | | | ----------------------------------------------------------------------------------------
I've executed query and turned tracing on.
call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.03 0.12 0 3 1 0 Execute 1 0.00 0.01 0 0 0 0 Fetch 2 88.15 238.90 11396 20206 0 165 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 88.18 239.04 11396 20209 1 165 Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ SQL*Net message to client 2 0.00 0.00 db file scattered read 1502 0.50 2.47 SQL*Net message to dblink 292103 0.00 0.46 SQL*Net message from dblink 292103 1.33 147.04 db file sequential read 1384 0.44 1.41 latch free 98 0.05 0.23 SQL*Net message from client 2 28.84 28.95 SQL*Net more data to client 1 0.00 0.00 ***************************************************** Rows Row Source Operation ------- --------------------------------------------------- 165 SORT UNIQUE 166 FILTER 166 TABLE ACCESS BY INDEX ROWID TABLE2 1015 NESTED LOOPS 169 NESTED LOOPS 291934 TABLE ACCESS FULL TABLE1 169 REMOTE 845 INDEX RANGE SCAN IDX_C (object id 30818) 0 TABLE ACCESS FULL TABLE4
From this results I could see that bottleneck was network as biggest waits were related to dblink.
To tune that I decided to try USE_HASH hint and use hash join instead of nested loops on remote table.
SQL> explain plan for 2 SELECT /*+USE_HASH(rt t1) */ 3 DISTINCT col1, 0, 0, TRUNC (SYSDATE - 1), t1.c_id 4 FROM user1.table1 t1, 5 user1.table2 t2, 6 user2.table3@rm_db rt 7 WHERE 1 = 1 8 AND t1.c_t = 'ABC' 9 AND t1.c_id = t2.con_id 10 AND t2.status = 'XY' 11 AND ph NOT IN (SELECT ph 12 FROM table4) 13 AND t1.c_nr = rt.s_cr 14 AND TRUNC (rt.fut) = TRUNC (SYSDATE - 1); Explained.
SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Inst |IN-OUT| -------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2 | 340 | 15222 | | | | 1 | SORT UNIQUE | | 2 | 340 | 15210 | | | | 2 | FILTER | | | | | | | | 3 | TABLE ACCESS BY INDEX ROWID| TABLE2 | 1 | 12 | 8 | | | | 4 | NESTED LOOPS | | 2 | 340 | 15209 | | | | 5 | HASH JOIN | | 2 | 316 | 15194 | | | | 6 | TABLE ACCESS FULL | TABLE1 | 157 | 6751 | 8013 | | | | 7 | REMOTE | | 3307 | 371K| 7178 | RM_DB | R->S | | 8 | INDEX RANGE SCAN | IDX_C | 6 | | 3 | | | | 9 | TABLE ACCESS FULL | TABLE4 | 20 | 540 | 6 | | | ----------------------------------------------------------------------------------------
Hm... this time explain plan shows double cost compare to query without USE_HASH hint.
Trace results:
call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.03 0.14 0 0 1 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 2.66 4.76 18417 20195 0 165 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 2.69 4.90 18417 20195 1 165 Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ SQL*Net message to client 2 0.00 0.00 db file scattered read 1669 0.01 0.25 db file sequential read 341 0.05 0.40 direct path write 120 0.00 0.11 SQL*Net message to dblink 7 0.00 0.00 SQL*Net message from dblink 7 1.65 1.90 direct path read 120 0.00 0.00 SQL*Net message from client 2 0.32 0.32 SQL*Net more data to client 1 0.00 0.00 *************************************************** Rows Row Source Operation ------- --------------------------------------------------- 165 SORT UNIQUE 166 FILTER 166 TABLE ACCESS BY INDEX ROWID TABLE2 1015 NESTED LOOPS 169 HASH JOIN 291938 TABLE ACCESS FULL TABLE1 281 REMOTE 845 INDEX RANGE SCAN IDX_C (object id 30818) 0 TABLE ACCESS FULL TABLE4
As you can see second query finished in 4,90 secs compare to first query which lasted 239.04 secs.
Using USE_HASH hint I resolved network bottleneck as much less data was being sent over the network.
My colleague was very pleased with query performance improvement :)
It would be interesting to determine how the Hash Join over a network for :
ReplyDelete291938 TABLE ACCESS FULL TABLE1
281 REMOTE
got executed with very little network traffic (as should have been visible in the SQL*Net waits).
Hemant K Chitale
I've kept trace files to investigate this case little more in my spare time.
ReplyDeleteThis little network traffic was the main reason of the performance improvement and I will try to examine Hash Join benefit.
If I manage to find something interesting, I will leave notice here in post comments.
Thank you for suggestion...
Regards,
Marko
I am too also impressed with the solution that you have provided. I have never used this function earlier and even not aware about the performance of this awesome function. I will definitely use this one every time when there is need.
ReplyDeletehi
ReplyDeletedid you tried
/*+ DRIVING_SITE(rt) */
I can't remember as I had this issue 7 years ago :-)
ReplyDeleteRegards,
Marko