Oracle SQL | Tag Archives: query optimization

Correct syntax for the table_stats hint

Posted on April 16, 2019 by Roger MacNicol Posted in adaptive serial direct path reads, CBO, hints, oracle, SmartScan, trace, troubleshooting, undocumented 2,556 Page views 3 Comments

A friend contacted me to ask why they were having problems using the table_stats hint to influence optimizer decision making and also to influence the decision to use direct read or buffer cache scan so this is just a quick blog post to clarify the syntax as it is not well documented.

table_stats(<table_name> <method> {<keyword>=<value>} )

Method is one of: DEFAULT, SET, SCALE, SAMPLE

Keyword is one of: BLOCKS, ROWS, ROW_LENGTH

Continue reading→

Lateral view decorrelation(VW_DCL) causes wrong results with rownum

Posted on February 16, 2019 by Sayan Malakshinov Posted in 12c, bug, CBO, oracle, query optimizing, rownum, troubleshooting 1,987 Page views 3 Comments

Everyone knows that rownum in inline views blocks many query transformations, for example pushing/pulling predicates, scalar subquery unnesting, etc, and many people use it for such purposes as a workaround to avoid unwanted transformations(or even CBO bugs).

Obviously, the main reason of that is different calculation of rownum:

If we pull the predicate “column_value = 3″ from the following query to higher level
[sourcecode language=”sql” highlight=””]
select *
from (select * from table(odcinumberlist(1,1,1,2,2,2,3,3,3)) order by 1)
where rownum <= 2
and column_value = 3;

COLUMN_VALUE
————
3
3
[/sourcecode]
we will get different results:
[sourcecode language=”sql” highlight=”8″]
select *
from (select *
from (select * from table(odcinumberlist(1,1,1,2,2,2,3,3,3)) order by 1)
where rownum <= 2
)
where column_value = 3;

no rows selected
[/sourcecode]
Doc ID 62340.1

[collapse]

But we recently encountered a bug with it: lateral view with ROWNUM returns wrong results in case of lateral view decorrelation.
Compare results of this query with and without no_decorrelation hint:

with 
 t1(a) as (select * from table(odcinumberlist(1,3)))
,t2(b) as (select * from table(odcinumberlist(1,1,3,3)))
,t(id) as (select * from table(odcinumberlist(1,2,3)))
select
  *
from t,
     lateral(select/*+ no_decorrelate */ rownum rn 
             from t1,t2 
             where t1.a=t2.b and t1.a = t.id
            )(+)
order by 1,2;

        ID         RN
---------- ----------
         1          1
         1          2
         2
         3          1
         3          2

with 
 t1(a) as (select * from table(odcinumberlist(1,3)))
,t2(b) as (select * from table(odcinumberlist(1,1,3,3)))
,t(id) as (select * from table(odcinumberlist(1,2,3)))
select
  *
from t,
     lateral(select rownum rn 
             from t1,t2 
             where t1.a=t2.b and t1.a = t.id
            )(+)
order by 1,2;

        ID         RN
---------- ----------
         1          1
         1          2
         2
         3          3
         3          4

Of course, we can draw conclusions even from these results: we can see that in case of decorrelation(query with hint) rownum was calculated before the join. But to be sure we can check optimizer’s trace 10053:

Final query after transformations:

[sourcecode language=”sql”]
******* UNPARSED QUERY IS *******
SELECT VALUE(KOKBF$2) "ID", "VW_DCL_76980902"."RN" "RN"
FROM TABLE("ODCINUMBERLIST"(1, 2, 3)) "KOKBF$2",
(SELECT ROWNUM "RN_0", VALUE(KOKBF$0) "ITEM_3"
FROM TABLE("ODCINUMBERLIST"(1, 3)) "KOKBF$0",
TABLE("ODCINUMBERLIST"(1, 1, 3, 3)) "KOKBF$1"
WHERE VALUE(KOKBF$0) = VALUE(KOKBF$1)
) "VW_DCL_76980902"
WHERE "VW_DCL_76980902"."ITEM_3"(+) = VALUE(KOKBF$2)
ORDER BY VALUE(KOKBF$2), "VW_DCL_76980902"."RN"

*************************
[/sourcecode]

[collapse]

I’ll modify it a bit just to make it more readable:
we can see that

select
  *
from t,
     lateral(select rownum rn 
             from t1,t2 
             where t1.a=t2.b and t1.a = t.id)(+)
order by 1,2;

was transformed to

select
  t.id, dcl.rn
from t,
     (select rownum rn 
      from t1,t2 
      where t1.a=t2.b) dcl
where dcl.a(+) = t.id
order by 1,2;

And it confirms that rownum was calculated on the different dataset (t1-t2 join) without join filter by table t.
I created SR with Severity 1 (SR #3-19117219271) more than a month ago, but unfortunately Oracle development doesn’t want to fix this bug and moreover they say that is not a bug. So I think this is a dangerous precedent and probably soon we will not be able to be sure in the calculation of rownum and old fixes…

Top-N again: fetch first N rows only vs rownum

Posted on December 30, 2018 by Sayan Malakshinov Posted in adaptive serial direct path reads, CBO, oracle, query optimizing, SQL, troubleshooting 7,586 Page views Leave a comment

Three interesting myths about rowlimiting clause vs rownum have recently been posted on our Russian forum:

TopN query with rownum<=N is always faster than "fetch first N rows only" (ie. row_number()over(order by ...)<=N)
“fetch first N rows only” is always faster than rownum<=N
“SORT ORDER BY STOPKEY” stores just N top records during sorting, while “WINDOW SORT PUSHED RANK” sorts all input and stores all records sorted in memory.

Interestingly that after Vyacheslav posted first statement as an axiom and someone posted old tests(from 2009) and few people made own tests which showed that “fetch first N rows” is about 2-3 times faster than the query with rownum, the final decision was that “fetch first” is always faster.

First of all I want to show that statement #3 is wrong and “WINDOW SORT PUSHED RANK” with row_number works similarly as “SORT ORDER BY STOPKEY”:
It’s pretty easy to show using sort trace:
Let’s create simple small table Tests1 with 1000 rows where A is in range 1-1000 (just 1 block):

create table test1(a not null, b) as
  select level, level from dual connect by level<=1000;

alter session set max_dump_file_size=unlimited;
ALTER SESSION SET EVENTS '10032 trace name context forever, level 10';

ALTER SESSION SET tracefile_identifier = 'rownum';
select * from (select * from test1 order by a) where rownum<=10;

ALTER SESSION SET tracefile_identifier = 'rownumber';
select * from test1 order by a fetch first 10 rows only;

And we can see from the trace files that both queries did the same number of comparisons:

rownum:

[sourcecode language=”sql” highlight=”7″]
—– Current SQL Statement for this session (sql_id=bbg66rcbt76zt) —–
select * from (select * from test1 order by a) where rownum<=10

—- Sort Statistics ——————————
Input records 1000
Output records 10
Total number of comparisons performed 999
Comparisons performed by in-memory sort 999
Total amount of memory used 2048
Uses version 1 sort
—- End of Sort Statistics ———————–
[/sourcecode]

[collapse]

row_number

[sourcecode language=”sql” highlight=”7″]
—– Current SQL Statement for this session (sql_id=duuy4bvaz3d0q) —–
select * from test1 order by a fetch first 10 rows only

[collapse]

Ie. each row (except first one) was compared with the biggest value from top 10 values and since they were bigger than top 10 value, oracle doesn’t compare it with other TopN values.

And if we change the order of rows in the table both of these queries will do the same number of comparisons again:

from 999 to 0

[sourcecode language=”sql”]
create table test1(a not null, b) as
select 1000-level, level from dual connect by level<=1000;

alter session set max_dump_file_size=unlimited;
ALTER SESSION SET EVENTS ‘10032 trace name context forever, level 10’;

ALTER SESSION SET tracefile_identifier = ‘rownum’;
select * from (select * from test1 order by a) where rownum<=10;

ALTER SESSION SET tracefile_identifier = ‘rownumber’;
select * from test1 order by a fetch first 10 rows only;
[/sourcecode]

[collapse]

rownum

[sourcecode language=”sql” highlight=”7″]
—– Current SQL Statement for this session (sql_id=bbg66rcbt76zt) —–
select * from (select * from test1 order by a) where rownum<=10

—- Sort Statistics ——————————
Input records 1000
Output records 1000
Total number of comparisons performed 4976
Comparisons performed by in-memory sort 4976
Total amount of memory used 2048
Uses version 1 sort
—- End of Sort Statistics ———————–
[/sourcecode]

[collapse]

row_number

[sourcecode language=”sql” highlight=”7″]
—– Current SQL Statement for this session (sql_id=duuy4bvaz3d0q) —–
select * from test1 order by a fetch first 10 rows only

[collapse]

We can see that both queries required much more comparisons(4976) here, that’s because each new value is smaller than the biggest value from the topN and even smaller than lowest value, so oracle should get right position for it and it requires 5 comparisons for that (it compares with 10th value, then with 6th, 3rd, 2nd and 1st values from top10). Obviously it makes less comparisons for the first 10 rows.

Now let’s talk about statements #1 and #2:
We know that rownum forces optimizer_mode to switch to “first K rows”, because of the parameter “_optimizer_rownum_pred_based_fkr”

SQL> @param_ rownum

NAME                               VALUE  DEFLT  TYPE      DESCRIPTION
---------------------------------- ------ ------ --------- ------------------------------------------------------
_optimizer_rownum_bind_default     10     TRUE   number    Default value to use for rownum bind
_optimizer_rownum_pred_based_fkr   TRUE   TRUE   boolean   enable the use of first K rows due to rownum predicate
_px_rownum_pd                      TRUE   TRUE   boolean   turn off/on parallel rownum pushdown optimization

while fetch first/row_number doesn’t (it will be changed after the patch #22174392) and it leads to the following consequences:
1. first_rows disables serial direct reads optimization(or smartscan on Exadata), that’s why the tests with big tables showed that “fetch first” were much faster than the query with rownum.
So if we set “_serial_direct_read”=always, we get the same performance in both tests (within the margin of error).

2. In cases when index access (index full scan/index range scan) is better, CBO differently calculates the cardinality of underlying INDEX FULL(range) SCAN:
the query with rownum is optimized for first_k_rows and the cardinality of index access is equal to K rows, but CBO doesn’t reduce cardinality for “fetch first”, so the cost of index access is much higher, compare them:

rownum

[sourcecode language=”sql” highlight=”13″]
SQL> explain plan for
2 select *
3 from (select * from test order by a,b)
4 where rownum<=10;

——————————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
——————————————————————————————–
| 0 | SELECT STATEMENT | | 10 | 390 | 4 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 10 | 390 | 4 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| TEST | 1000K| 12M| 4 (0)| 00:00:01 |
| 4 | INDEX FULL SCAN | IX_TEST_AB | 10 | | 3 (0)| 00:00:01 |
——————————————————————————————–

Predicate Information (identified by operation id):
—————————————————

1 – filter(ROWNUM<=10)
[/sourcecode]

[collapse]

fetch first

[sourcecode language=”sql” highlight=”13″]
SQL> explain plan for
2 select *
3 from test
4 order by a,b
5 fetch first 10 rows only;

—————————————————————————————–
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
—————————————————————————————–
| 0 | SELECT STATEMENT | | 10 | 780 | | 5438 (1)| 00:00:01 |
|* 1 | VIEW | | 10 | 780 | | 5438 (1)| 00:00:01 |
|* 2 | WINDOW SORT PUSHED RANK| | 1000K| 12M| 22M| 5438 (1)| 00:00:01 |
| 3 | TABLE ACCESS FULL | TEST | 1000K| 12M| | 690 (1)| 00:00:01 |
—————————————————————————————–

Predicate Information (identified by operation id):
—————————————————

1 – filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=10)
2 – filter(ROW_NUMBER() OVER ( ORDER BY "TEST"."A","TEST"."B")<=10)
[/sourcecode]

[collapse]

fetch first + first_rows

[sourcecode language=”sql” highlight=”14″]
SQL> explain plan for
2 select/*+ first_rows */ *
3 from test
4 order by a,b
5 fetch first 10 rows only;

——————————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
——————————————————————————————–
| 0 | SELECT STATEMENT | | 10 | 780 | 27376 (1)| 00:00:02 |
|* 1 | VIEW | | 10 | 780 | 27376 (1)| 00:00:02 |
|* 2 | WINDOW NOSORT STOPKEY | | 1000K| 12M| 27376 (1)| 00:00:02 |
| 3 | TABLE ACCESS BY INDEX ROWID| TEST | 1000K| 12M| 27376 (1)| 00:00:02 |
| 4 | INDEX FULL SCAN | IX_TEST_AB | 1000K| | 2637 (1)| 00:00:01 |
——————————————————————————————–

Predicate Information (identified by operation id):
—————————————————

1 – filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=10)
2 – filter(ROW_NUMBER() OVER ( ORDER BY "TEST"."A","TEST"."B")<=10)
[/sourcecode]

[collapse]

fetch first + index

[sourcecode language=”sql” highlight=”14″]
SQL> explain plan for
2 select/*+ index(test (a,b)) */ *
3 from test
4 order by a,b
5 fetch first 10 rows only;

Predicate Information (identified by operation id):
—————————————————

1 – filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=10)
2 – filter(ROW_NUMBER() OVER ( ORDER BY "TEST"."A","TEST"."B")<=10)
[/sourcecode]

[collapse]

So in this case we can add hints “first_rows” or “index”, or install the patch #22174392.

ps. I thought to post this note later, since I hadn’t time enough to add other interesting details about the different TopN variants, including “with tie”, rank(), etc, so I’ll post another note with more details later.

How to group connected elements (or pairs)

Posted on September 29, 2017 by Sayan Malakshinov Posted in oracle, PL/SQL, PL/SQL optimization, query optimizing, SQL 1,858 Page views 7 Comments

I see quite often when developers ask questions about connected components:

Table “MESSAGES” contains fields “SENDER” and “RECIPIENT”, which store clients id.
How to quickly get all groups of clients who are connected even through other clients if the table has X million rows?
So for this table, there should be 4 groups:

(1, 2, 4, 8, 16)
(3, 6, 12)
(5, 10, 20)
(7, 14)
(9, 18)

SENDER	RECIPIENT
1	2
2	4
3	6
4	8
5	10
6	12
7	14
8	16
9	18
10	20

Of course, we can solve this problem using SQL only (model, recursive subquery factoring or connect by with nocycle), but such solutions will be too slow for huge tables.

Example of SQL solution

[sourcecode language=”sql”]
with
t(sender,recipient) as (select level,level*2 from dual connect by level<=10)
, v1 as (select rownum id,t.* from t)
, v2 as (select id, account
from v1
unpivot (
account for x in (sender,recipient)
))
, v3 as (
select
id
,account
,dense_rank()over(order by account) account_n
,count(*)over() cnt
from v2)
, v4 as (
select distinct grp,account
from v3
model
dimension by (id,account_n)
measures(id grp,account,cnt)
rules
iterate(1e6)until(iteration_number>cnt[1,1])(
grp[any,any] = min(grp)[any,cv()]
,grp[any,any] = min(grp)[cv(),any]
)
)
select
listagg(account,’,’)within group(order by account) s
from v4
group by grp
[/sourcecode]

[collapse]

In such situations it’s much better to adopt standard algorithms like Quick-find or Weighted quick-union for PL/SQL.
The first time I wrote such solution about 5 years ago and I even posted here one of the latest solutions, but all of them were not universal, so I’ve created the package today with a couple of functions for most common problems: XT_CONNECTED_COMPONENTS

It contains 2 functions based on Weighted quick-find quick-union algorithm:

function get_strings(cur in sys_refcursor, delim varchar2:=’,’) return strings_array pipelined;
It accepts a cursor and returns found connected components as table of varchar2(v_size). You can change v_size in the package definition.
Input cursor should contain one Varchar2 column with linked strings, for example: ‘a,b,c’.
You can also specify list delimiter, by default it is comma.
Examples:

select * from table(xt_connected_components.get_strings( cursor(select ELEM1||','||ELEM2 from TEST));
select * 
from
 table(
   xt_connected_components.get_strings( 
     cursor(select 'a,b,c' from dual union all
            select 'd,e,f' from dual union all
            select 'e,c'   from dual union all
            select 'z'     from dual union all
            select 'X,Y'   from dual union all
            select 'Y,Z'   from dual)));
COLUMN_VALUE
-----------------------------------------
STRINGS('X', 'Y', 'Z')
STRINGS('a', 'b', 'c', 'd', 'e', 'f')
STRINGS('z')

function get_numbers(cur in sys_refcursor) return numbers_array pipelined;
This function also returns connected components, but for numbers.
Input cursor should contain two columns with linked numbers.
Examples:

select * 
from table(
        xt_connected_components.get_numbers( 
          cursor(
            select sender_id, recipient_id from messages
        )));
select * 
from
  table(
    xt_connected_components.get_numbers( 
       cursor(
          select level   account1
               , level*2 account2 
          from dual 
          connect by level<=10
    )));
SQL> select *
  2  from
  3    table(
  4      xt_connected_components.get_numbers(
  5         cursor(
  6            select level   account1
  7                 , level*2 account2
  8            from dual
  9            connect by level<=10
 10*     )))
SQL> /

COLUMN_VALUE
------------------------
NUMBERS(1, 2, 4, 8, 16)
NUMBERS(3, 6, 12)
NUMBERS(5, 10, 20)
NUMBERS(7, 14)
NUMBERS(9, 18)

How to install:
Download all files from Github and execute “@install” in SQL*Plus or execute them in another tool in the following order:
xt_connected_components_types.sql
xt_connected_components_pkg.sql
xt_connected_components_bdy.sql

Download URL: https://github.com/xtender/xt_scripts/tree/master/extra/xt_connected_components

Why are HCC stats being bumped when Smart Scanning row major data in 12.2

Posted on May 4, 2017 by Roger MacNicol Posted in oracle, SmartScan 1,626 Page views Leave a comment

In 12.2, there is a stat “cell blocks pivoted” that points to a new optimization. When Smart Scan processes data, it has to create a new synthetic block that only contains the columns that are needed and the rows that pass the offloaded predicates.

If Smart Scan has to create a new block, why would you create a new row-major block when we can just as easily create uncompressed HCC block? It is roughly the same amount of work but once the synthetic block is returned to the RDBMS for processing, columnar data is 10X cheaper to process when the query uses rowsets. SO where does the name of this stat come from? Simple, it works just like a spreadsheet PIVOT operation, swapping rows for columns.

So what is a “rowset”? It simply means that a step in a query plan will take in a set of rows before processing and then process that entire batch of rows in one go before fetching more rows; this is significantly more efficient than processing one row at a time. You can see when a query uses rowsets via:

select * from table(dbms_xplan.display_cursor('<sql_id>',1,'-note +projection'));

in which case you will see projection metadata with details such as “(rowsets=200)”.

When Smart Scan sees that the query on the RDBMS is using rowsets, it tries to create a columnar output block instead of a row-major output block and when this arrives on the RDBMS, the table scan will see an HCC block and bump the HCC stats even though the table being scanned doesn’t use HCC. Let’s see how this works in practice:

Continue reading→

How to speed up slow unicode migration of a table with xmltype columns

Posted on February 20, 2017 by Sayan Malakshinov Posted in oracle, parallel, query optimizing, troubleshooting 2,384 Page views Leave a comment

Recently I have had an issue with slow unicode migration of the database upgraded from 10g to 12.1.0.2. The main problem was a table with xmltype: we spent about 4 hours for this table(~17GB) during test migration, though all other tables (~190GB) migrated just for about 20 minutes.
We used DMU(Database Migration Assistant for Unicode), and the root cause of the problem was update statement generated by DMU for this table:

update  /*+ PARALLEL(A,16)*/ "RRR"."T_XMLDATA" A  set A."SYS_NC00011$" = SYS_OP_CSCONV(A."SYS_NC00011$", 'AL16UTF16')

“SYS_NC00011$” was internal hidden CLOB column used to store XMLTYPE. As you can see DMU added PARALLEL hint, but though oracle can use parallel dml for xmltype since 12.1.0.1, we can’t use it because of its’ limitations:

Changes in Oracle Database 12c Release 1 (12.1.0.1) for Oracle XML DB

Parallel DML Support for XMLType
Support for parallel DML has been improved for XMLType storage model binary XML using SecureFiles LOBs. The performance and scalability have been improved for both CREATE TABLE AS SELECT and INSERT AS SELECT.

Restrictions on Parallel DML

Parallel DML can be done on tables with LOB columns provided the table is partitioned. However, intra-partition parallelism is not supported.

For non-partitioned tables with LOB columns, parallel INSERT operations are supported provided that the LOB columns are declared as SecureFiles LOBs. Parallel UPDATE, DELETE, and MERGE operations on such tables are not supported.

Btw, Oracle didn’t support parallel dml for xmltype on previous releases:

No Parallel DML for XMLType – DML operations on XMLType data are always performed in serial. Parallel DML is not supported for XMLType. (Parallel query and DDL are supported for XMLType.)

So I had to use manual parallelization:
1. Monitor “Convert application tables” step through “View Table Conversion progress” and press “Stop” button during conversion of this table.
2. Create table with ROWIDs of this table and split them into 16 groups:

create table tmp_rids as 
select rowid rid, ntile(16)over(order by rowid) grp 
from t_xmldata;

3. Execute

ALTER SYSTEM SET EVENTS '22838 TRACE NAME CONTEXT LEVEL 1,FOREVER';

to avoid “ORA-22839: Direct updates on SYS_NC columns are disallowed”
4. Start 16 sessions and each of them have to update own part:

update t_xmldata A 
set A."SYS_NC00011$" = SYS_OP_CSCONV(A."SYS_NC00011$", 'AL16UTF16') 
where rowid in (select rid from tmp_rids where grp=&grp);
commit;

5. Disable event 22838:

ALTER SYSTEM SET EVENTS '22838 TRACE NAME CONTEXT OFF';

6. Open “View Table Conversion progress” window, click on this table and change “Retry” to “Skip” option for the update step.

This simple method allowed to make unicode migration about 16 times faster.

WINDOW NOSORT STOPKEY + RANK()

Posted on March 12, 2016 by Sayan Malakshinov Posted in CBO, oracle, query optimizing, SQL 2,222 Page views Leave a comment

Recently I found that WINDOW NOSORT STOPKEY with RANK()OVER() works very inefficiently: http://www.freelists.org/post/oracle-l/RANKWINDOW-NOSORT-STOPKEY-stopkey-doesnt-work
The root cause of this behaviour is that Oracle optimizes WINDOW NOSORT STOPKEY with RANK the same way as with DENSE_RANK:

rnk1

[sourcecode language=”sql” highlight=””]
create table test(n not null) as
with gen as (select level n from dual connect by level<=100)
select g2.n as n
from gen g1, gen g2
where g1.n<=10
/
create index ix_test on test(n)
/
exec dbms_stats.gather_table_stats(”,’TEST’);
select/*+ gather_plan_statistics */ n
from (select rank()over(order by n) rnk
,n
from test)
where rnk<=3
/
select * from table(dbms_xplan.display_cursor(”,”,’allstats last’));
drop table test purge;
[/sourcecode]

[collapse]

Output

[sourcecode language=”sql” highlight=”29,30,31″]
N
———-
1
1
1
1
1
1
1
1
1
1

10 rows selected.

PLAN_TABLE_OUTPUT
———————————————————————————————————————–
SQL_ID 8tbq95dpw0gw7, child number 0
————————————-
select/*+ gather_plan_statistics */ n from (select rank()over(order by
n) rnk ,n from test) where rnk<=3

Plan hash value: 1892911073

———————————————————————————————————————–
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
———————————————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.01 | 3 | | | |
|* 1 | VIEW | | 1 | 1000 | 10 |00:00:00.01 | 3 | | | |
|* 2 | WINDOW NOSORT STOPKEY| | 1 | 1000 | 30 |00:00:00.01 | 3 | 73728 | 73728 | |
| 3 | INDEX FULL SCAN | IX_TEST | 1 | 1000 | 31 |00:00:00.01 | 3 | | | |
———————————————————————————————————————–

Predicate Information (identified by operation id):
—————————————————

1 – filter("RNK"<=3)
2 – filter(RANK() OVER ( ORDER BY "N")<=3)
[/sourcecode]

[collapse]

As you can see, A-Rows in plan step 2 = 30 – ie, that is the number of rows where

DENSE_RANK<=3

but not

RANK<=3

The more effective way will be to stop after first 10 rows, because 11th row already has RANK more than 3!
But we can create own STOPKEY version with PL/SQL:

PLSQL STOPKEY version

[sourcecode language="sql"]
create or replace type rowids_table is table of varchar2(18);
/
create or replace function get_rowids_by_rank(
n int
,max_rank int
)
return rowids_table pipelined
as
begin
for r in (
select/*+ index_rs_asc(t (n)) */ rowidtochar(rowid) chr_rowid, rank()over(order by n) rnk
from test t
where t.n > get_rowids_by_rank.n
order by n
)
loop
if r.rnk <= max_rank then
pipe row (r.chr_rowid);
else
exit;
end if;
end loop;
return;
end;
/
select/*+ leading(r t) use_nl(t) */
t.*
from table(get_rowids_by_rank(1, 3)) r
,test t
where t.rowid = chartorowid(r.column_value)
/
[/sourcecode]

[collapse]

In that case the fetch from a table will stop when rnk will be larger than max_rank

PRECOMPUTE_SUBQUERY hint

Posted on August 28, 2014 by Sayan Malakshinov Posted in CBO, hints, undocumented 2,881 Page views 3 Comments

I’ve just found out that we can specify query block for PRECOMPUTE_SUBQUERY: /*+ precompute_subquery(@sel$2) */
So we can use it now with SQL profiles, SPM baselines and patches.

SQL> select/*+ precompute_subquery(@sel$2) */ * from dual where dummy in (select chr(level) from dual connect by level<=100);

D
-
X

SQL> @last

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  c437vsqj7c4jy, child number 0
-------------------------------------
select/*+ precompute_subquery(@sel$2) */ * from dual where dummy in
(select chr(level) from dual connect by level<=100)

Plan hash value: 272002086

---------------------------------------------------------------------------
| Id  | Operation         | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |        |       |     2 (100)|          |
|*  1 |  TABLE ACCESS FULL| DUAL |      1 |     2 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1 / DUAL@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("DUMMY"='' OR "DUMMY"='' OR "DUMMY"='♥' OR "DUMMY"='♦'
              OR "DUMMY"='♣' OR "DUMMY"='♠' OR "DUMMY"='' OR "DUMMY"=' OR
              "DUMMY"=' ' OR "DUMMY"=' ' OR "DUMMY"='' OR "DUMMY"='' OR "DUMMY"=' '
              OR "DUMMY"='' OR "DUMMY"='' OR "DUMMY"='►' OR "DUMMY"='◄' OR
              "DUMMY"='' OR "DUMMY"='' OR "DUMMY"='' OR "DUMMY"='' OR "DUMMY"=''
              OR "DUMMY"='' OR "DUMMY"='↑' OR "DUMMY"='↓' OR "DUMMY"='' OR
              "DUMMY"=' OR "DUMMY"='' OR "DUMMY"='' OR "DUMMY"='' OR "DUMMY"=''
              OR "DUMMY"=' ' OR "DUMMY"='!' OR "DUMMY"='"' OR "DUMMY"='#' OR
              "DUMMY"='$' OR "DUMMY"='%' OR "DUMMY"='&' OR "DUMMY"='''' OR
              "DUMMY"='(' OR "DUMMY"=')' OR "DUMMY"='*' OR "DUMMY"='+' OR "DUMMY"=','
              OR "DUMMY"='-' OR "DUMMY"='.' OR "DUMMY"='/' OR "DUMMY"='0' OR
              "DUMMY"='1' OR "DUMMY"='2' OR "DUMMY"='3' OR "DUMMY"='4' OR "DUMMY"='5'
              OR "DUMMY"='6' OR "DUMMY"='7' OR "DUMMY"='8' OR "DUMMY"='9' OR
              "DUMMY"=':' OR "DUMMY"=';' OR "DUMMY"='<' OR "DUMMY"='=' OR "DUMMY"='>'
              OR "DUMMY"='?' OR "DUMMY"='@' OR "DUMMY"='A' OR "DUMMY"='B' OR
              "DUMMY"='C' OR "DUMMY"='D' OR "DUMMY"='E' OR "DUMMY"='F' OR "DUMMY"='G'
              OR "DUMMY"='H' OR "DUMMY"='I' OR "DUMMY"='J' OR "DUMMY"='K' OR
              "DUMMY"='L' OR "DUMMY"='M' OR "DUMMY"='N' OR "DUMMY"='O' OR "DUMMY"='P'
              OR "DUMMY"='Q' OR "DUMMY"='R' OR "DUMMY"='S' OR "DUMMY"='T' OR
              "DUMMY"='U' OR "DUMMY"='V' OR "DUMMY"='W' OR "DUMMY"='X' OR "DUMMY"='Y'
              OR "DUMMY"='Z' OR "DUMMY"='[' OR "DUMMY"='\' OR "DUMMY"=']' OR
              "DUMMY"='^' OR "DUMMY"='_' OR "DUMMY"='`' OR "DUMMY"='a' OR "DUMMY"='b'
              OR "DUMMY"='c' OR "DUMMY"='d'))

PS. I’m not sure, but as far as i remember, when I tested it on 10.2, it didn’t work with specifying a query block.
And I have never seen such usage.

Strange moving filter predicates from index to table

Posted on April 24, 2014 by Sayan Malakshinov Posted in CBO, curious, oracle 2,711 Page views 9 Comments

It seems strange to me:
When all needed columns are in the index, filter predicates are expectedly applied to the index

select a,b from xt_test where a=1 and
(:b is null or b = :b)

-----------------------------------------------------------------------------------------
| Id  | Operation        | Name       | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |            |      1 |        |      1 |00:00:00.01 |       2 |
|*  1 |  INDEX RANGE SCAN| PK_XT_TEST |      1 |      1 |      1 |00:00:00.01 |       2 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"=1)
       filter((:B IS NULL OR "B"=:B))

But if I add another column “PAD”, the filter moves to the table filters:

select a,b,pad from xt_test where a=1
and (:b is null or b = :b)

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Starts | E-Rows | A-Rows | Buffers |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |      1 |        |      1 |       4 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| XT_TEST    |      1 |      1 |      1 |       4 |
|*  2 |   INDEX RANGE SCAN          | PK_XT_TEST |      1 |     10 |     10 |       2 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter((:B IS NULL OR "B"=:B))
   2 - access("A"=1)

As workaround we can use something like that:

select--+ NO_ELIMINATE_JOIN(t) NO_ELIMINATE_JOIN(t2@sel$2) gather_plan_statistics
   a,b,pad
from xt_test t
where t.rowid in ( select t2.rowid
                   from xt_test t2
                   where a=1
                     and (:b is null or b = :b)
                 );

Plan hash value: 1464320522

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Starts | E-Rows | A-Rows |Buffers |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |      1 |        |      1 |      3 |
|   1 |  NESTED LOOPS               |            |      1 |      1 |      1 |      3 |
|*  2 |   INDEX RANGE SCAN          | PK_XT_TEST |      1 |      1 |      1 |      2 |
|   3 |   TABLE ACCESS BY USER ROWID| XT_TEST    |      1 |      1 |      1 |      1 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"=1)
       filter((:B IS NULL OR "B"=:B))

Full test case

[sourcecode language=”sql”]
create table xt_test(a,b,pad,constraint pk_xt_test primary key(a,b))
as select
mod(rownum,10) a
,rownum b
,rpad(rownum,10) pad
from dual
connect by level<=100;
call dbms_stats.gather_table_stats(”,’XT_TEST’);
var b number;
exec :b:=1;
select/*+ gather_plan_statistics */
a,b,pad
from xt_test
where a=1 and (:b is null or b = :b);
select * from table(dbms_xplan.display_cursor(”,”,’allstats last’));

select/*+ gather_plan_statistics */
a,b
from xt_test
where a=1 and (:b is null or b = :b);
select * from table(dbms_xplan.display_cursor(”,”,’allstats last’));

select–+ NO_ELIMINATE_JOIN(t) NO_ELIMINATE_JOIN(t2@sel$2) gather_plan_statistics
a,b,pad
from xt_test t
where t.rowid in ( select t2.rowid
from xt_test t2
where a=1
and (:b is null or b = :b)
);
select * from table(dbms_xplan.display_cursor(”,”,’allstats last’));
[/sourcecode]

[collapse]

Update:
I just forgot to mention that there is another workaround – to force concatenation:

SQL> select--+ use_concat(or_predicates(2))
  2     a,b,pad
  3  from xt_test where a=1 and (:b is null or b = :b);

         A          B PAD
---------- ---------- ----------------------------------------
         1          1 1

Plan hash value: 3582916188

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name       | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |      1 |        |     3 (100)|      1 |00:00:00.01 |       2 |      2 |
|   1 |  CONCATENATION                |            |      1 |        |            |      1 |00:00:00.01 |       2 |      2 |
|   2 |   TABLE ACCESS BY INDEX ROWID | XT_TEST    |      1 |      1 |     1   (0)|      1 |00:00:00.01 |       2 |      2 |
|*  3 |    INDEX UNIQUE SCAN          | PK_XT_TEST |      1 |      1 |     0   (0)|      1 |00:00:00.01 |       1 |      1 |
|*  4 |   FILTER                      |            |      1 |        |            |      0 |00:00:00.01 |       0 |      0 |
|   5 |    TABLE ACCESS BY INDEX ROWID| XT_TEST    |      0 |     10 |     2   (0)|      0 |00:00:00.01 |       0 |      0 |
|*  6 |     INDEX RANGE SCAN          | PK_XT_TEST |      0 |     10 |     1   (0)|      0 |00:00:00.01 |       0 |      0 |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("A"=1 AND "B"=:B)
   4 - filter(:B IS NULL)
   6 - access("A"=1)
       filter(LNNVL("B"=:B))

Unresolved quiz: Avoiding in-list iterator

Posted on March 14, 2013 by Sayan Malakshinov Posted in CBO, oracle, query optimizing 3,726 Page views 11 Comments

A couple days ago i had very interesting quiz, which is not resolved yet.
Look at this simplified query:

  select *
  from xt1,xt2
  where
       xt1.b=10
   and xt1.a=xt2.a
   and xt2.b in (1,2);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 2715236140

----------------------------------------------------------------------------------------
| Id  | Operation                     | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |   100 | 36900 |   501   (0)| 00:00:07 |
|   1 |  NESTED LOOPS                 |        |       |       |            |          |
|   2 |   NESTED LOOPS                |        |   100 | 36900 |   501   (0)| 00:00:07 |
|   3 |    TABLE ACCESS BY INDEX ROWID| XT1    |   100 | 31000 |   101   (0)| 00:00:02 |
|*  4 |     INDEX RANGE SCAN          | IX_XT1 |   100 |       |     1   (0)| 00:00:01 |
|   5 |    INLIST ITERATOR            |        |       |       |            |          |
|*  6 |     INDEX RANGE SCAN          | IX_XT2 |     1 |       |     3   (0)| 00:00:01 |
|   7 |   TABLE ACCESS BY INDEX ROWID | XT2    |     1 |    59 |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("XT1"."B"=10)
   6 - access("XT1"."A"="XT2"."A" AND ("XT2"."B"=1 OR "XT2"."B"=2))

Full test case

[sourcecode language=”sql”]
create table xt1 as
select
level a
, mod(level,1000) b
, lpad(1,300,1) padding
from dual
connect by level<=1e5;

create index ix_xt1 on xt1(b);

create table xt2 as
select
level a
, mod(level,5) b
, lpad(1,50,1) padding
from dual
connect by level<=1e6;

alter table xt2
add constraint uq_xt2
unique (a)
using index(create index ix_xt2 on xt2(a,b));

exec dbms_stats.gather_table_stats(”,’XT1′,cascade=>true);
exec dbms_stats.gather_table_stats(”,’XT2′,cascade=>true);

explain plan for
select *
from xt1,xt2
where
xt1.b=10
and xt1.a=xt2.a
and xt2.b in (1,2);

@?/rdbms/admin/utlxpls.sql
[/sourcecode]

[collapse]

As you see, in such queries cbo always generating plans with INLIST ITERATOR, and it is reasonably in cases when there are many rows with different values of field B for most values of A, and this number is much larger than number of values in the “INLIST”. But in such case as shown, will be better to use index range scan with access by A and filter by B:

SQL> select *
  2  from xt1,xt2
  3  where
  4       xt1.b=10
  5   and xt1.a=xt2.a
  6   and xt2.b in (1,2);

no rows selected

Statistics
----------------------------------------------------------
        ...
        505  consistent gets
SQL> -- without inlist iterator:
SQL> select *
  2  from xt1,xt2
  3  where
  4       xt1.b=10
  5   and xt1.a=xt2.a
  6   and xt2.b+0 in (1,2);

no rows selected

Statistics
----------------------------------------------------------
        ...
        305  consistent gets

But how we can do it? I know 5 options:
1. Trace event 10157
2. Rewrite code. for example replacing “b in (1,2)” to “b+0 in (1,2)”
3. Changing query with “Advanced query rewrite” (DBMS_ADVANCED_REWRITE.DECLARE_REWRITE_EQUIVALENCE)
4. Recreating index from xt2(a,b) to xt2(a,1,b)
5. Changing optimizer_mode to “rule” through hint or SQL profile/baseline

But unfortunately all of them are inapplicable for the my real problem, because i cannot for some reasons rewrite query or change query with advanced rewrite, cannot recreate/add index, and can’t change optimizer_mode, because execution plan for the real query will become worst than plan generated with CBO with inlist iterator(some operations aren’t exists in RBO).

Could anybody suggest any another solution?

UPDATE #1:
This bug is fixed now in 12.2, and patch 16516751 is available now for 11.2.0.3 Solaris64.
Changes:
1. CBO can consider filters in such cases now
2. Hint NUM_INDEX_KEYS fixed and works fine

UPDATE #2:
Very interesting solution by Igor Usoltsev(in russian):
Ignored hint USE_CONCAT(OR_PREDICATES(N)) allows to avoid inlist iterator.
Example:

select--+ USE_CONCAT(OR_PREDICATES(32767))
 * from xt1,xt2
where
     xt1.b=10
 and xt1.a=xt2.a
 and xt2.b in (1,2)
/

Plan hash value: 2884586137          -- good plan:
 
----------------------------------------------------------------------------------------
| Id  | Operation                     | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |       |       |   401 (100)|          |
|   1 |  NESTED LOOPS                 |        |       |       |            |          |
|   2 |   NESTED LOOPS                |        |   100 | 36900 |   401   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| XT1    |   100 | 31000 |   101   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | IX_XT1 |   100 |       |     1   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | IX_XT2 |     1 |       |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | XT2    |     1 |    59 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("XT1"."B"=10)
   5 - access("XT1"."A"="XT2"."A")
       filter(("XT2"."B"=1 OR "XT2"."B"=2))

From 10053 trace on nonpatched 11.2.0.3: