Issue
We had very slow query running from 20 to 70 minutes. Also, it affected some other database processes during execution. We replaced subquery with join and overall performance of query is much better now. Let's see performance graph created with OBI. Performance improved to 6 minutes in average. You can see minutes on Y axis.
In this article I try to restore issue on EMP, DEPT example. Please download and run
emp_dept.sql script. It will insert 4004 records in DEPT table and 1300013 records in EMP table.
Non-optimized query
select e.job, e.deptno from emp e
group by e.job, e.deptno
having avg(e.sal) >
(
select avg(e1.sal) from emp e1 where e1.deptno = e.deptno
);
It takes more than 2 minutes on Oracle database version 11.2.0.4.0 to fetch first 500 of 4004 rows.
Optimized query
with avgs as
(
select deptno, avg(sal) avg_sal from emp group by deptno
)
select e.job, e.deptno from emp e
join avgs on avgs.deptno = e.deptno
group by e.job, e.deptno, avgs.avg_sal
having avg(e.sal) > avgs.avg_sal;
It takes 1 second to fetch first 500 of 4004 rows!
Execution plans
Non-optimized query
Plan hash value: 2799045373
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1303K| 39M| 2606 (6)| 00:00:32 |
|* 1 | FILTER | | | | | |
| 2 | HASH GROUP BY | | 1303K| 39M| 2606 (6)| 00:00:32 |
| 3 | TABLE ACCESS FULL| EMP | 1303K| 39M| 2499 (2)| 00:00:30 |
| 4 | SORT AGGREGATE | | 1 | 26 | | |
|* 5 | TABLE ACCESS FULL| EMP | 13031 | 330K| 2505 (2)| 00:00:31 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(AVG("E"."SAL")> (SELECT AVG("E1"."SAL") FROM
"BK_APEX_MANUAL_CMS"."EMP" "E1" WHERE "E1"."DEPTNO"=:B1))
5 - filter("E1"."DEPTNO"=:B1)
Note
-----
- dynamic sampling used for this statement (level=2)
- SQL plan baseline "SQL_PLAN_9rx6m0mzg4ja1984ce69e" used for this statement
Note use of binded variable :B1 in the plan. Following query uses exactly the same execution plan.
select e.job, e.deptno from emp e
group by e.job, e.deptno
having avg(e.sal) >
(
select avg(e1.sal) from emp e1 where e1.deptno = :b1
);
Check hash value of the plan. It's equal to previous plan hash:
Plan hash value: 2799045373
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1303K| 39M| 2606 (6)| 00:00:32 |
|* 1 | FILTER | | | | | |
| 2 | HASH GROUP BY | | 1303K| 39M| 2606 (6)| 00:00:32 |
| 3 | TABLE ACCESS FULL| EMP | 1303K| 39M| 2499 (2)| 00:00:30 |
| 4 | SORT AGGREGATE | | 1 | 26 | | |
|* 5 | TABLE ACCESS FULL| EMP | 13031 | 330K| 2516 (3)| 00:00:31 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(AVG("E"."SAL")> (SELECT AVG("E1"."SAL") FROM
"BK_APEX_MANUAL_CMS"."EMP" "E1" WHERE "E1"."DEPTNO"=TO_NUMBER(:B1)))
5 - filter("E1"."DEPTNO"=TO_NUMBER(:B1))
Note
-----
- dynamic sampling used for this statement (level=2)
It's obvious that execution plan doesn't know how many times subquery must be evaluated. Subquery is evaluated for each deptno (4004 times). Subquery unnesting doesn't take place here.
Execution plan doesn't reflect real effort to process query with nested subquery.
Optimized query
Execution plan reflects real processing:
Plan hash value: 652522642
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 987M| 53G| | 136K (93)| 00:27:20 |
|* 1 | FILTER | | | | | | |
| 2 | HASH GROUP BY | | 987M| 53G| | 136K (93)| 00:27:20 |
|* 3 | HASH JOIN | | 987M| 53G| 47M| 18931 (48)| 00:03:48 |
| 4 | VIEW | | 1303K| 32M| | 2606 (6)| 00:00:32 |
| 5 | HASH GROUP BY | | 1303K| 32M| | 2606 (6)| 00:00:32 |
| 6 | TABLE ACCESS FULL| EMP | 1303K| 32M| | 2499 (2)| 00:00:30 |
| 7 | TABLE ACCESS FULL | EMP | 1303K| 39M| | 2499 (2)| 00:00:30 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("AVGS"."AVG_SAL"<AVG("E"."SAL"))
3 - access("AVGS"."DEPTNO"="E"."DEPTNO")
Note
-----
- dynamic sampling used for this statement (level=2)
- SQL plan baseline "SQL_PLAN_1kp0nyy5tdw9r987d3348" used for this statement