četrtek, 28. februar 2019

Subquery vs join performance example

Issue

We had very slow query running from 20 to 70 minutes. Also, it affected some other database processes during execution. We replaced subquery with join and overall performance of query is much better now. Let's see performance graph created with OBI. Performance improved to 6 minutes in average. You can see minutes on Y axis.













In this article I try to restore issue on EMP, DEPT example. Please download and run emp_dept.sql script. It will insert 4004 records in DEPT table and 1300013 records in EMP table.


Non-optimized query

select e.job, e.deptno from emp e
group by e.job, e.deptno 
having avg(e.sal) > 
(
    select avg(e1.sal) from emp e1 where e1.deptno = e.deptno
); 

It takes more than 2 minutes on Oracle database version 11.2.0.4.0 to fetch first 500 of 4004 rows. 


Optimized query

with avgs as
(
select deptno, avg(sal) avg_sal from emp group by deptno
select e.job, e.deptno from emp e
join avgs on avgs.deptno = e.deptno
group by e.job, e.deptno, avgs.avg_sal 
having avg(e.sal) > avgs.avg_sal;

It takes 1 second to fetch first 500 of 4004 rows!


Execution plans


Non-optimized query

Plan hash value: 2799045373
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |  1303K|    39M|  2606   (6)| 00:00:32 |
|*  1 |  FILTER             |      |       |       |            |          |
|   2 |   HASH GROUP BY     |      |  1303K|    39M|  2606   (6)| 00:00:32 |
|   3 |    TABLE ACCESS FULL| EMP  |  1303K|    39M|  2499   (2)| 00:00:30 |
|   4 |   SORT AGGREGATE    |      |     1 |    26 |            |          |
|*  5 |    TABLE ACCESS FULL| EMP  | 13031 |   330K|  2505   (2)| 00:00:31 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(AVG("E"."SAL")> (SELECT AVG("E1"."SAL") FROM 
              "BK_APEX_MANUAL_CMS"."EMP" "E1" WHERE "E1"."DEPTNO"=:B1))
   5 - filter("E1"."DEPTNO"=:B1)
Note
-----
   - dynamic sampling used for this statement (level=2)
   - SQL plan baseline "SQL_PLAN_9rx6m0mzg4ja1984ce69e" used for this statement

Note use of binded variable :B1 in the plan. Following query uses exactly the same execution plan.

select e.job, e.deptno from emp e
group by e.job, e.deptno 
having avg(e.sal) > 
(
    select avg(e1.sal) from emp e1 where e1.deptno = :b1
); 

Check hash value of the plan. It's equal to previous plan hash:

Plan hash value: 2799045373
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |  1303K|    39M|  2606   (6)| 00:00:32 |
|*  1 |  FILTER             |      |       |       |            |          |
|   2 |   HASH GROUP BY     |      |  1303K|    39M|  2606   (6)| 00:00:32 |
|   3 |    TABLE ACCESS FULL| EMP  |  1303K|    39M|  2499   (2)| 00:00:30 |
|   4 |   SORT AGGREGATE    |      |     1 |    26 |            |          |
|*  5 |    TABLE ACCESS FULL| EMP  | 13031 |   330K|  2516   (3)| 00:00:31 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(AVG("E"."SAL")> (SELECT AVG("E1"."SAL") FROM 
              "BK_APEX_MANUAL_CMS"."EMP" "E1" WHERE "E1"."DEPTNO"=TO_NUMBER(:B1)))
   5 - filter("E1"."DEPTNO"=TO_NUMBER(:B1))
Note
-----
   - dynamic sampling used for this statement (level=2)

It's obvious that execution plan doesn't know how many times subquery must be evaluated. Subquery is evaluated for each deptno (4004 times). Subquery unnesting doesn't take place here. 
Execution plan doesn't reflect real effort to process query with nested subquery. 


Optimized query

Execution plan reflects real processing: 

Plan hash value: 652522642
---------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |   987M|    53G|       |   136K (93)| 00:27:20 |
|*  1 |  FILTER                |      |       |       |       |            |          |
|   2 |   HASH GROUP BY        |      |   987M|    53G|       |   136K (93)| 00:27:20 |
|*  3 |    HASH JOIN           |      |   987M|    53G|    47M| 18931  (48)| 00:03:48 |
|   4 |     VIEW               |      |  1303K|    32M|       |  2606   (6)| 00:00:32 |
|   5 |      HASH GROUP BY     |      |  1303K|    32M|       |  2606   (6)| 00:00:32 |
|   6 |       TABLE ACCESS FULL| EMP  |  1303K|    32M|       |  2499   (2)| 00:00:30 |
|   7 |     TABLE ACCESS FULL  | EMP  |  1303K|    39M|       |  2499   (2)| 00:00:30 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("AVGS"."AVG_SAL"<AVG("E"."SAL"))
   3 - access("AVGS"."DEPTNO"="E"."DEPTNO")
Note
-----
   - dynamic sampling used for this statement (level=2)
   - SQL plan baseline "SQL_PLAN_1kp0nyy5tdw9r987d3348" used for this statement
  

petek, 07. december 2018

How to check equality of two SQL select result sets in Oracle


Often you need to know if the new SQL select statement is equivalent to old one. Let’s say You rewrite select and it’s much faster, but is it equivalent to old one? You need to compare result sets of old and new select statements. You can compare result sets with different combinations of bind variables. If you don’t have any bind variables, create them by replacing constant values. If results sets are equal in all test scenarios you are good to go. Probability that new SQL select statement is equivalent to old one is greater, if you test selects with more different combinations of bind variables and if result sets are bigger.

Select for comparing old select with new select is:

WITH old_select as (select …)
,new_select as (select …)
select count(*), ‘diffs’ desc from
 (
         (
                select * from old_select
                        minus
                 select * from new_select
             )
             union
             (
                    select * from new_select
                        minus
                    select * from old_select
                )
  )
  union all
  select count(*), 'new_select rows' desc from new_select   
  union all
  select count(*), 'old_select rows' desc from old_select   
 ;

Following result set means that result sets are equal and that there are 130 rows in each result set.

0      ‘diffs’
130    'new_select rows'
130    'old_select rows'




sobota, 19. december 2015

WebLogic server and java.lang.ClassCastException

I got this exception:
java.lang.ClassCastException: weblogic.jdbc.wrapper.Blob_oracle_sql_BLOB cannot be cast to oracle.sql.BLOB.
It's because WLS properly wraps all vendor-specific types when using a connection pool. Oracle BLOB and CLOB are obviously Oracle specific types. Programmatic solution is usage of getVendorConnection() method.
Other solution is disabling type wrapping on WLS for specific data source.
  • Open WLS Administration Console and select right data source:

  •  Click on tab Connection Pool and expand Advanced settings at the bottom of the page:

  • Uncheck CheckBox Wrap Data Types (again at the bottom of the page) and save changes:


 We needed to restart WLS to activate change. Exception is gone!