Oracle SQL

REGEXP_LIKE: strange unspecified value in parameter “modifier”

Posted on July 23, 2014 by Sayan Malakshinov Posted in bug, CBO, curious, oracle

Today I noticed a strange thing in the predicate section of the execution plan for a simple query with regexp_like, where the 3rd parameter "MODIFIER" was not specified:

SQL> select * from dual where regexp_like(dummy,'.');

D
-
X

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
SQL_ID  97xuqf9cmjsta, child number 0
-------------------------------------
select * from dual where regexp_like(dummy,'.')

Plan hash value: 272002086

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( REGEXP_LIKE ("DUMMY",'.',HEXTORAW('F07FD85CFF0700006A1116
              45010000000000000000000000FC12164501000000000000000000000000000000000000
              0010000000000000001880D85CFF07000002000000000000000000000081000000') ))


20 rows selected.

It is particularly interesting that the values inside HEXTORAW() differ for every variation of the other parameters:

SQL> select * from dual where regexp_like(dummy,'x');
...
   1 - filter( REGEXP_LIKE ("DUMMY",'x',HEXTORAW('3895D330FF0700006A1116
              45010000000000000000000000FC12164501000000000000000000000000000000000000
              0011000000000000006895D330FF07000002000000000000000000000081000000') ))
SQL> select * from dual where regexp_like(dummy,'y');
...
   1 - filter( REGEXP_LIKE ("DUMMY",'y',HEXTORAW('00DA3C3FFF0700006A1116
              45010000000000000000000000FC12164501000000000000000000000000000000000000
              00110000000000000030DA3C3FFF07000002000000000000000000000081000000') ))
SQL> select * from dual where regexp_like(dummy||'','x')
...
   1 - filter( REGEXP_LIKE ("DUMMY"||'','x',HEXTORAW('70964F2FFF0700006A
              111645010000000000000000000000FC1216450100000000000000000000000000000000
              0000001100000000000000A0964F2FFF07000002000000000000000000000081000000')
               ))

I don't know what it means, but it looks like garbage from memory.
When I noticed this, I decided to check how regexp_like works in function-based indexes:

SQL> create table xtest as
  2    select dummy||level as str
  3    from dual
  4    connect by level<=30;

Table created.

SQL> select * from xtest where case when regexp_like(str,'1') then 1 end = 1;
...
12 rows selected.

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
SQL_ID  7ztp0k8c1zn2h, child number 0
-------------------------------------
select * from xtest where case when regexp_like(str,'1') then 1 end = 1

Plan hash value: 4207139086

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |       |       |     3 (100)|          |
|*  1 |  TABLE ACCESS FULL| XTEST |    12 |   264 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(CASE  WHEN  REGEXP_LIKE
              ("STR",'1',HEXTORAW('68F9CB32FF0700006A111645010000000000000000000000FC1
              216450100000000000000000000000000000000000000110000000000000098F9CB32FF0
              7000002000000000000000000000081000000') ) THEN 1 END =1)

SQL> create index xtest_fbi on xtest(case when regexp_like(str,'1') then 1 end);

Index created.

SQL> select * from xtest where case when regexp_like(str,'1') then 1 end = 1;
...
12 rows selected.

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
SQL_ID  7ztp0k8c1zn2h, child number 0
-------------------------------------
select * from xtest where case when regexp_like(str,'1') then 1 end = 1

Plan hash value: 1479471124

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| XTEST     |    12 |   300 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | XTEST_FBI |    12 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("XTEST"."SYS_NC00002$"=1)

SQL> select column_expression from user_ind_expressions e where e.index_name='XTEST_FBI';

COLUMN_EXPRESSION
-----------------------------------------------------------------------------------------
CASE  WHEN  REGEXP_LIKE ("STR",'1') THEN 1 END

As you can see, it works fine, although the predicate in the first execution plan differs from the FBI expression.
Then I dumped a 10053 trace and noticed that the HEXTORAW(…) argument appears only in the "Explain Plan Dump" section, so it looks like just a plan output bug.
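
For reference, this is roughly how such a 10053 trace can be produced (a minimal sketch using the standard event syntax; the statement is the one from the first example):

alter session set tracefile_identifier = 'regexp_10053';
alter session set events '10053 trace name context forever, level 1';
explain plan for select * from dual where regexp_like(dummy,'.');
alter session set events '10053 trace name context off';
-- the generated trace file contains the "Explain Plan Dump" section,
-- which is where the extra HEXTORAW(...) argument shows up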

execution plan HEXTORAW regexp_like

Strange moving filter predicates from index to table

Posted on April 24, 2014 by Sayan Malakshinov Posted in CBO, curious, oracle

It seems strange to me:
When all the needed columns are in the index, filter predicates are applied to the index, as expected:

select a,b from xt_test where a=1 and
(:b is null or b = :b)

-----------------------------------------------------------------------------------------
| Id  | Operation        | Name       | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |            |      1 |        |      1 |00:00:00.01 |       2 |
|*  1 |  INDEX RANGE SCAN| PK_XT_TEST |      1 |      1 |      1 |00:00:00.01 |       2 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"=1)
       filter((:B IS NULL OR "B"=:B))

But if I add another column, "PAD", the filter moves to the table access:

select a,b,pad from xt_test where a=1
and (:b is null or b = :b)

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Starts | E-Rows | A-Rows | Buffers |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |      1 |        |      1 |       4 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| XT_TEST    |      1 |      1 |      1 |       4 |
|*  2 |   INDEX RANGE SCAN          | PK_XT_TEST |      1 |     10 |     10 |       2 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter((:B IS NULL OR "B"=:B))
   2 - access("A"=1)

As a workaround we can use something like this:

select--+ NO_ELIMINATE_JOIN(t) NO_ELIMINATE_JOIN(t2@sel$2) gather_plan_statistics
   a,b,pad
from xt_test t
where t.rowid in ( select t2.rowid
                   from xt_test t2
                   where a=1
                     and (:b is null or b = :b)
                 );

Plan hash value: 1464320522

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Starts | E-Rows | A-Rows |Buffers |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |      1 |        |      1 |      3 |
|   1 |  NESTED LOOPS               |            |      1 |      1 |      1 |      3 |
|*  2 |   INDEX RANGE SCAN          | PK_XT_TEST |      1 |      1 |      1 |      2 |
|   3 |   TABLE ACCESS BY USER ROWID| XT_TEST    |      1 |      1 |      1 |      1 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"=1)
       filter((:B IS NULL OR "B"=:B))

Full test case

[sourcecode language="sql"]
create table xt_test(a,b,pad,constraint pk_xt_test primary key(a,b))
as select
   mod(rownum,10)  a
  ,rownum          b
  ,rpad(rownum,10) pad
from dual
connect by level<=100;
call dbms_stats.gather_table_stats('','XT_TEST');

var b number;
exec :b:=1;

select/*+ gather_plan_statistics */
   a,b,pad
from xt_test
where a=1 and (:b is null or b = :b);
select * from table(dbms_xplan.display_cursor('','','allstats last'));

select/*+ gather_plan_statistics */
   a,b
from xt_test
where a=1 and (:b is null or b = :b);
select * from table(dbms_xplan.display_cursor('','','allstats last'));

select--+ NO_ELIMINATE_JOIN(t) NO_ELIMINATE_JOIN(t2@sel$2) gather_plan_statistics
   a,b,pad
from xt_test t
where t.rowid in ( select t2.rowid
                   from xt_test t2
                   where a=1
                     and (:b is null or b = :b)
                 );
select * from table(dbms_xplan.display_cursor('','','allstats last'));
[/sourcecode]

[collapse]

Update:
I just forgot to mention that there is another workaround: forcing concatenation:

SQL> select--+ use_concat(or_predicates(2))
  2     a,b,pad
  3  from xt_test where a=1 and (:b is null or b = :b);

         A          B PAD
---------- ---------- ----------------------------------------
         1          1 1

Plan hash value: 3582916188

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name       | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |      1 |        |     3 (100)|      1 |00:00:00.01 |       2 |      2 |
|   1 |  CONCATENATION                |            |      1 |        |            |      1 |00:00:00.01 |       2 |      2 |
|   2 |   TABLE ACCESS BY INDEX ROWID | XT_TEST    |      1 |      1 |     1   (0)|      1 |00:00:00.01 |       2 |      2 |
|*  3 |    INDEX UNIQUE SCAN          | PK_XT_TEST |      1 |      1 |     0   (0)|      1 |00:00:00.01 |       1 |      1 |
|*  4 |   FILTER                      |            |      1 |        |            |      0 |00:00:00.01 |       0 |      0 |
|   5 |    TABLE ACCESS BY INDEX ROWID| XT_TEST    |      0 |     10 |     2   (0)|      0 |00:00:00.01 |       0 |      0 |
|*  6 |     INDEX RANGE SCAN          | PK_XT_TEST |      0 |     10 |     1   (0)|      0 |00:00:00.01 |       0 |      0 |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("A"=1 AND "B"=:B)
   4 - filter(:B IS NULL)
   6 - access("A"=1)
       filter(LNNVL("B"=:B))
cbo query optimization

To_char, Infinity and NaN

Posted on April 1, 2014 by Sayan Malakshinov Posted in curious, oracle

It's funny that Oracle can easily cast 'nan', 'inf', 'infinity', '-inf', '-infinity' to the corresponding binary_float_infinity / binary_double_nan values, but there is no format model for to_char(binary_float_infinity, format) or to_binary_*(text_expr, format) that can produce the same output as to_char(binary_float_infinity) / to_binary_float('inf') without the format parameter:

If a BINARY_FLOAT or BINARY_DOUBLE value is converted to CHAR or NCHAR, and the input is either infinity or NaN (not a number), then Oracle always returns the pound signs to replace the value.

Little example:

SQL> select to_binary_float('inf') from dual;

TO_BINARY_FLOAT('INF')
----------------------
                   Inf

SQL> select to_binary_float('inf','9999') from dual;
select to_binary_float('inf','9999') from dual
                       *
ERROR at line 1:
ORA-01722: invalid number

SQL> select
  2     to_char(binary_float_infinity)         without_format
  3    ,to_char(binary_float_infinity,'99999') with_format
  4    ,to_char(1e6d,'99999')                  too_large
  5  from dual;

WITHOUT_F WITH_FORMAT        TOO_LARGE
--------- ------------------ ------------------
Inf       ######             ######

SQL> select to_char(0/0f) without_format, to_char(0/0f,'tme') with_format from dual;

WITHOUT_F WITH_FORMAT
--------- --------------------------------------------------------------------------
Nan       ################################################################
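
If the output really has to be textual, one possible workaround (a sketch, not something from the original post) is to handle the special values explicitly with the IS NAN and IS INFINITE conditions:

select case
         when f is nan      then 'NaN'
         when f is infinite then to_char(f)        -- 'Inf' / '-Inf'
         else to_char(f,'99990.999')               -- normal formatting
       end formatted
from  (select 1/0f f from dual);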

PS: this is just a crosspost from my old blog.

Deterministic functions, result_cache and operators

Posted on March 31, 2014 by Sayan Malakshinov Posted in deterministic functions, oracle, result_cache

In previous posts about the caching mechanism of deterministic functions I wrote that cached results are kept only between fetch calls, but there is one exception to this rule: if all function parameters are literals, the cached result will not be flushed on every fetch call.
A little example showing the difference:

SQL> create or replace function f_deterministic(p varchar2)
  2     return varchar2
  3     deterministic
  4  as
  5  begin
  6     dbms_output.put_line(p);
  7     return p;
  8  end;
  9  /
SQL> set arrays 2 feed on;
SQL> set serverout on;
SQL> select
  2     f_deterministic(x) a
  3    ,f_deterministic('literal') b
  4  from (select 'not literal' x
  5        from dual
  6        connect by level<=10
  7       );

A                              B
------------------------------ ------------------------------
not literal                    literal
not literal                    literal
not literal                    literal
not literal                    literal
not literal                    literal
not literal                    literal
not literal                    literal
not literal                    literal
not literal                    literal
not literal                    literal

10 rows selected.

not literal
literal
not literal
not literal
not literal
not literal
not literal

As you can see, 'literal' was printed once, but 'not literal' was printed 6 times, so it was returned from the cache 4 times.

Also I want to show the differences in consistency between:
1. Calling a function with deterministic and result_cache;
2. Calling an operator for a function with result_cache;
3. Calling an operator for a function with deterministic and result_cache.

In this example I will do updates in autonomous transactions to emulate updates from another session during query execution:

Tables and procedures with updates

[sourcecode language="sql"]
drop table t1 purge;
drop table t2 purge;
drop table t3 purge;

create table t1 as select 1 id from dual;
create table t2 as select 1 id from dual;
create table t3 as select 1 id from dual;

create or replace procedure p1_update as
   pragma autonomous_transaction;
begin
   update t1 set id=id+1;
   commit;
end;
/
create or replace procedure p2_update as
   pragma autonomous_transaction;
begin
   update t2 set id=id+1;
   commit;
end;
/
create or replace procedure p3_update as
   pragma autonomous_transaction;
begin
   update t3 set id=id+1;
   commit;
end;
/
[/sourcecode]

[collapse]

Variant 1

[sourcecode language="sql"]
create or replace function f1(x varchar2) return number result_cache deterministic
as
   r number;
begin
   select id into r from t1;
   p1_update;
   return r;
end;
/
[/sourcecode]

[collapse]

Variant 2

[sourcecode language="sql"]
create or replace function f2(x varchar2) return number result_cache
as
   r number;
begin
   select id into r from t2;
   p2_update;
   return r;
end;
/
create or replace operator o2
   binding(varchar2)
   return number
   using f2
/
[/sourcecode]

[collapse]

Variant 3

[sourcecode language="sql"]
create or replace function f3(x varchar2) return number result_cache deterministic
as
   r number;
begin
   select id into r from t3;
   p3_update;
   return r;
end;
/
create or replace operator o3
   binding(varchar2)
   return number
   using f3
/
[/sourcecode]

[collapse]

Test:

SQL> set arrays 2;
SQL> select
  2     f1(dummy) variant1
  3    ,o2(dummy) variant2
  4    ,o3(dummy) variant3
  5  from dual
  6  connect by level<=10;

  VARIANT1   VARIANT2   VARIANT3
---------- ---------- ----------
         1          1          1
         2          1          1
         2          1          1
         3          1          1
         3          1          1
         4          1          1
         4          1          1
         5          1          1
         5          1          1
         6          1          1

10 rows selected.

SQL> /

  VARIANT1   VARIANT2   VARIANT3
---------- ---------- ----------
         7         11         11
         8         11         11
         8         11         11
         9         11         11
         9         11         11
        10         11         11
        10         11         11
        11         11         11
        11         11         11
        12         11         11

10 rows selected.

We can see that function F1 returns the same result for every 2 executions, which equals the fetch size ("set arraysize 2").
Operators O2 and O3 return the same result for all rows within the first query execution, but in the second execution the values are incremented by 10, which equals the number of rows.
What we can learn from that:
1. Using function F1 with result_cache and deterministic reduces the number of function executions, but the function results are not consistent within the query;
2. Operator O2 returns consistent results, but the function is always executed, because we invalidate the result_cache on every execution;
3. Operator O3 works the same way as operator O2, regardless of the fact that the function is deterministic.
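
To double-check the second point, we can snapshot the result cache statistics before and after each query execution (a sketch; v$result_cache_statistics is the standard view, and the statistic names below are the usual ones):

-- run before and after the query and compare the deltas:
-- a growing 'Invalidation Count' confirms that the cached results
-- are invalidated by the updates inside the function
select name, value
from   v$result_cache_statistics
where  name in ('Create Count Success', 'Find Count', 'Invalidation Count');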

All test scripts: tests.zip

consistency deterministic functions pl/sql pl/sql functions result_cache

When v$sesstat statistics are updated

Posted on March 21, 2014 by Sayan Malakshinov Posted in oracle, PL/SQL, runstats, statistics, troubleshooting, undocumented

Craig Shallahamer wrote an excellent article, "When is v$sesstat really updated?".
Today's post is just a little addition and correction about the difference in when the 'DB time' and 'CPU used by this session' statistics are updated.

Test #1

In this test I want to show that the statistics are updated after every fetch call.
I have set arraysize=2, so SQL*Plus will fetch 2 rows at a time:
(full script)

-- Result will be fetched by 2 rows:
set arraysize 2;
-- this query generates CPU consumption 
-- in the scalar subquery on fetch phase,
-- so CPU consumption will be separated 
-- into several periods between fetch calls:
with gen as (
            select/*+ materialize */
               level n, lpad(level,400) padding
            from dual
            connect by level<=200
            )
    ,stat as (
            select/*+ inline */
               sid,name,value 
            from v$mystat st, v$statname sn
            where st.statistic#=sn.statistic#
              and sn.name in ('DB time'
                             ,'CPU used by this session'
                             ,'user calls'
                             ,'recursive calls')
            )
-- the first row is just to avoid the SQL*Plus effect of fetching 1 row at the start,
-- the others will be fetched in batches of "arraysize" rows:
select null rn,null cnt,null dbtime,null cpu,null user_calls, null recursive_calls from dual
union all -- main query:
select
   rownum rn
  ,(select count(*) from gen g1, gen g2, gen g3 where g1.n>g2.n and g1.n*0=main.n*0) cnt
  ,(select value from stat where sid*0=n*0 and name = 'DB time'                    ) dbtime
  ,(select value from stat where sid*0=n*0 and name = 'CPU used by this session'   ) cpu
  ,(select value from stat where sid*0=n*0 and name = 'user calls'                 ) user_calls
  ,(select value from stat where sid*0=n*0 and name = 'recursive calls'            ) recursive_calls
from gen main
where rownum<=10;
set arraysize 15;

Test results:

SQL> @tests/dbtime

        RN        CNT     DBTIME        CPU USER_CALLS RECURSIVE_CALLS
---------- ---------- ---------- ---------- ---------- ---------------

         1    3980000      12021      11989        200             472
         2    3980000      12021      11989        200             472
         3    3980000      12121      12089        201             472
         4    3980000      12121      12089        201             472
         5    3980000      12220      12186        202             472
         6    3980000      12220      12186        202             472
         7    3980000      12317      12283        203             472
         8    3980000      12317      12283        203             472
         9    3980000      12417      12383        204             472
        10    3980000      12417      12383        204             472

As you can see, the statistics are updated after every fetch call.

Test #2

Now that we have tested a simple SQL query, I want to do a slightly more complicated test with PL/SQL.
I'm going to write a single PL/SQL block with the following algorithm:
1. Save the stats.
2. Execute some PL/SQL code with CPU consumption.
3. Get the statistics deltas.
4. Open the query from the first test.
5. Fetch 10 rows.
6. Get the statistics deltas.
7. Fetch the next 10 rows.
8. Get the statistics deltas.
9. Fetch the next 10 rows.
10. Get the statistics deltas.
And after executing this block, I want to check the statistics again.

Full script:

set feed off;

-- saving previous values
column st_dbtime      new_value prev_dbtime      noprint;
column st_cpu_time    new_value prev_cputime     noprint;
column st_user_calls  new_value prev_user_calls  noprint;
column st_recur_calls new_value prev_recur_calls noprint;

select   max(decode(sn.NAME,'DB time'                  ,st.value))*10 st_dbtime
        ,max(decode(sn.NAME,'CPU used by this session' ,st.value))*10 st_cpu_time
        ,max(decode(sn.NAME,'user calls'               ,st.value))    st_user_calls
        ,max(decode(sn.NAME,'recursive calls'          ,st.value))    st_recur_calls
from v$mystat st, v$statname sn
where st.statistic#=sn.statistic# 
  and sn.name in ('DB time','CPU used by this session'
                 ,'user calls','recursive calls'
                 )
/
-- variable for output from pl/sql block: 
var output varchar2(4000);

prompt Executing test...;
----- main test:
declare
   cnt int;
   st_dbtime      number; 
   st_cpu_time    number; 
   st_user_calls  number; 
   st_recur_calls number; 
   cursor c is 
      with gen as (select/*+ materialize */
                     level n, lpad(level,400) padding
                   from dual
                   connect by level<=200)
      select
          rownum rn
        , (select count(*) from gen g1, gen g2, gen g3 where g1.n>g2.n and g1.n*0=main.n*0) cnt
      from gen main
      where rownum<=60;
   
   type ctype is table of c%rowtype;
   c_array ctype;
   
   procedure SnapStats(descr varchar2:=null)
   is
      st_new_dbtime      number;
      st_new_cpu_time    number;
      st_new_user_calls  number;
      st_new_recur_calls number;
   begin
      select   max(decode(sn.NAME,'DB time'                 ,st.value))*10 st_dbtime
              ,max(decode(sn.NAME,'CPU used by this session',st.value))*10 st_cpu_time
              ,max(decode(sn.NAME,'user calls'              ,st.value))    st_user_calls
              ,max(decode(sn.NAME,'recursive calls'         ,st.value))    st_recur_calls
          into st_new_dbtime,st_new_cpu_time,st_new_user_calls,st_new_recur_calls
      from v$mystat st, v$statname sn
      where st.statistic#=sn.statistic#
        and sn.name in ('DB time','CPU used by this session'
                       ,'user calls','recursive calls'
                       );
      if descr is not null then
         :output:= :output || descr ||':'||chr(10)
                || 'sesstat dbtime:     ' || (st_new_dbtime      - st_dbtime      )||chr(10)
                || 'sesstat cputime:    ' || (st_new_cpu_time    - st_cpu_time    )||chr(10)
                || 'sesstat user calls: ' || (st_new_user_calls  - st_user_calls  )||chr(10)
                || 'sesstat recur calls:' || (st_new_recur_calls - st_recur_calls )||chr(10)
                || '======================================'||chr(10);
      end if;
      st_dbtime      := st_new_dbtime     ;
      st_cpu_time    := st_new_cpu_time   ;
      st_user_calls  := st_new_user_calls ;
      st_recur_calls := st_new_recur_calls;
   end;
   
begin
   -- saving previous stats:
   SnapStats;

   -- generating cpu load:
   for i in 1..1e7 loop
      cnt:=cnt**2+cnt**1.3-cnt**1.2;
   end loop;
   -- getting new stats:
   SnapStats('After pl/sql loop');
   
   open c;
   SnapStats('After "open c"');
   fetch c bulk collect into c_array limit 10;
   SnapStats('After fetch 10 rows');
   fetch c bulk collect into c_array limit 10;
   SnapStats('After fetch 20 rows');
   fetch c bulk collect into c_array limit 10;
   SnapStats('After fetch 30 rows');
   close c;
   SnapStats('After close c');
end;
/ 

prompt 'Delta stats after statement(ms):';
select   max(decode(sn.NAME,'DB time'                 ,st.value))*10
          - &&prev_dbtime      as delta_dbtime
        ,max(decode(sn.NAME,'CPU used by this session',st.value))*10
          - &&prev_cputime     as delta_cpu_time
        ,max(decode(sn.NAME,'user calls'              ,st.value))  
          - &&prev_user_calls  as delta_user_calls
        ,max(decode(sn.NAME,'recursive calls'         ,st.value))  
          - &&prev_recur_calls as delta_recur_calls
from v$mystat st, v$statname sn
where st.statistic#=sn.statistic# 
  and sn.name in ('DB time','CPU used by this session'
                 ,'user calls','recursive calls'
                 )
/
prompt 'Test results:';
col output format a40;
print output;
set feed off;

Output:

SQL> @tests/dbtime2

Executing test...
'Delta stats after statement(ms):'

DELTA_DBTIME DELTA_CPU_TIME DELTA_USER_CALLS DELTA_RECUR_CALLS
------------ -------------- ---------------- -----------------
       18530          18460                5                33

Test results:
OUTPUT
----------------------------------------
After pl/sql loop:
sesstat dbtime:     0
sesstat cputime:    4350
sesstat user calls: 0
sesstat recur calls:2
======================================
After "open c":
sesstat dbtime:     0
sesstat cputime:    20
sesstat user calls: 0
sesstat recur calls:4
======================================
After fetch 10 rows:
sesstat dbtime:     0
sesstat cputime:    4680
sesstat user calls: 0
sesstat recur calls:2
======================================
After fetch 20 rows:
sesstat dbtime:     0
sesstat cputime:    4680
sesstat user calls: 0
sesstat recur calls:2
======================================
After fetch 30 rows:
sesstat dbtime:     0
sesstat cputime:    4690
sesstat user calls: 0
sesstat recur calls:2
======================================
After close c:
sesstat dbtime:     0
sesstat cputime:    0
sesstat user calls: 0
sesstat recur calls:3
======================================

We can notice that "CPU used by this session" is updated at the same time as "recursive calls", while "DB time" is updated only together with "user calls". This difference is not very important (in most cases we can combine other statistics), but I think that if you want to instrument some code, it is a good reason to check when the statistics you rely on are actually updated.

oracle pl/sql v$mystat v$sesstat

Single SQL vs SQL+PL/SQL

Posted on February 28, 2014 by Sayan Malakshinov Posted in oracle, PL/SQL, PL/SQL optimization, query optimizing

Everyone knows Tom Kyte’s mantra:

You should do it in a single SQL statement if at all possible.

But we all know that "every rule has an exception".
There are many cases when PL/SQL combined with SQL can be more efficient than SQL alone, and I don't want to catalog them all. I just want to show a couple of examples of such exceptions:

1. Running totals by several dimensions

Simple example from forum:

select dt,
       dim1,
       dim2,
       val,
       sum(val) over(partition by dim1 order by dt) dim1_cumulative_sum,
       sum(val) over(partition by dim2 order by dt) dim2_cumulative_sum,
       sum(val) over(partition by dim1, dim2 order by dt) dim1_dim2_cumulative_sum
  from mg_t
 order by dt;

This query will be very heavy on big data sets, so we can do it more efficiently with PL/SQL:

create or replace function xf_to_drop return xt2_to_drop pipelined
is
   type tt  is table of number index by pls_integer;
   type tt2 is table of tt index by pls_integer;
   dim1_c tt;
   dim2_c tt;
   dim12_c tt2;
begin
   for r in (
            select dt,
                   dim1,
                   dim2,
                   val
              from mg_t
             order by dt
   )
   loop
      dim1_c(r.dim1):=case when dim1_c.exists(r.dim1) then dim1_c(r.dim1) else 0 end + r.val;
      dim2_c(r.dim2):=case when dim2_c.exists(r.dim2) then dim2_c(r.dim2) else 0 end + r.val; -- running total by dim2
      dim12_c(r.dim1)(r.dim2):=case
                                  when dim12_c.exists(r.dim1)
                                   and dim12_c(r.dim1).exists(r.dim2)
                                  then dim12_c(r.dim1)(r.dim2)
                                  else 0
                               end + r.val;
      pipe row (xt1_to_drop( r.dt
                            ,r.dim1
                            ,r.dim2
                            ,r.val
                            ,dim1_c(r.dim1)
                             ,dim2_c(r.dim2)
                            ,dim12_c(r.dim1)(r.dim2)
                           ));
   end loop;
   return;
end;
/
Full example

[sourcecode language="sql"]
create table mg_t as
select trunc(sysdate) + level/1440 dt,
       trunc(3 * dbms_random.value()) dim1,
       trunc(3 * dbms_random.value()) dim2,
       trunc(100 * dbms_random.value()) val
from dual
connect by level <= 3e6;

create type xt1_to_drop is object(
   dt                       date
  ,dim1                     number
  ,dim2                     number
  ,val                      number
  ,dim1_cumulative_sum      number
  ,dim2_cumulative_sum      number
  ,dim1_dim2_cumulative_sum number
);
create type xt2_to_drop as table of xt1_to_drop;

create or replace function xf_to_drop return xt2_to_drop pipelined
is
   type tt  is table of number index by pls_integer;
   type tt2 is table of tt index by pls_integer;
   dim1_c  tt;
   dim2_c  tt;
   dim12_c tt2;
begin
   for r in (
            select dt,
                   dim1,
                   dim2,
                   val
              from mg_t
             order by dt
   )
   loop
      dim1_c(r.dim1):=case when dim1_c.exists(r.dim1) then dim1_c(r.dim1) else 0 end + r.val;
      dim2_c(r.dim2):=case when dim2_c.exists(r.dim2) then dim2_c(r.dim2) else 0 end + r.val;
      dim12_c(r.dim1)(r.dim2):=case
                                  when dim12_c.exists(r.dim1)
                                   and dim12_c(r.dim1).exists(r.dim2)
                                  then dim12_c(r.dim1)(r.dim2)
                                  else 0
                               end + r.val;
      pipe row (xt1_to_drop( r.dt,r.dim1,r.dim2,r.val,dim1_c(r.dim1),dim2_c(r.dim2),dim12_c(r.dim1)(r.dim2)));
   end loop;
   return;
end;
/
exec for r in (select * from table(xf_to_drop)) loop null; end loop;
[/sourcecode]

[collapse]

2. Finding connected components

Assume that we have a big table with a many-to-many relationship:

create table test (clientid NUMBER(10), accountid NUMBER(10));

How can we find all connected groups?

This example was also taken from our Russian forum; there was a very good and simple SQL-only solution, but it is not efficient on big data sets:

select min(group_member_id) as group_max_id, accountid, clientid
  from  (select clientid as group_member_id
         , connect_by_root accountid as accountid
         , connect_by_root clientid  as clientid
      from test
      connect by nocycle decode(accountid, prior accountid, 1, 0)
                       + decode(clientid,  prior clientid,  1, 0)
                       = 1
  ) a
  group by accountid, clientid
  order by group_max_id, accountid
/

This pure SQL solution is for cases where ClientId and AccountId are different entities. If they are the same entity in your case, you need to use UNION ALL:

select min(group_member_id) as group_max_id, accountid, clientid
  from  (select clientid as group_member_id
         , connect_by_root accountid as accountid
         , connect_by_root clientid  as clientid
      from (select accountid, clientid from test union all select clientid,accountid from test)
      connect by nocycle decode(accountid, prior accountid, 1, 0)
                       + decode(clientid,  prior clientid,  1, 0)
                       = 1
  ) a
  group by accountid, clientid
  order by group_max_id, accountid
/

We can try to recall our algorithms courses and adapt one of the well-known algorithms for connected components:

Based on weighted quick-union algorithm

[sourcecode language="sql"]
declare
   type int_array is table of pls_integer index by pls_integer;
   type arr_elems is table of sys.ku$_objnumset index by pls_integer;
   root       int_array;
   root_elems arr_elems;

   n        int;
   clients  int_array;
   accounts int_array;

   l integer:=dbms_utility.get_time();

   procedure print(v in varchar2) is
   begin
      dbms_output.put_line(to_char((dbms_utility.get_time-l)/100,'0999.99')||' '||v);
      l:=dbms_utility.get_time();
   end;

   function get_root(n int) return pls_integer is
   begin
      if root.exists(n) then
         return root(n);
      else
         return null;
      end if;
   end;

   procedure update_root(old_root pls_integer,new_root pls_integer) is
      i       pls_integer;
      elem    pls_integer;
      cnt_old pls_integer;
      cnt_new pls_integer;
   begin
      if old_root!=new_root then
         --root_elems(new_root):=root_elems(new_root) multiset union all root_elems(old_root);
         cnt_old:=root_elems(old_root).count;
         cnt_new:=root_elems(new_root).count;
         root_elems(new_root).extend(cnt_old);
         for i in 1..cnt_old
         loop
            elem := root_elems(old_root)(i);
            root(elem):=new_root;
            root_elems(new_root)(cnt_new+i):=elem;
         end loop;
         root_elems(old_root).delete;
      end if;
   end;

   procedure add_elem(p_root pls_integer, p_elem pls_integer) is
   begin
      if not root_elems.exists(p_root) then
         root_elems(p_root):=sys.ku$_objnumset(p_elem);
      else
         root_elems(p_root).extend();
         root_elems(p_root)(root_elems(p_root).count):=p_elem;
      end if;
   end;

   procedure add_link(clientid pls_integer,accountid pls_integer) is
      r1       pls_integer;
      r2       pls_integer;
      new_root pls_integer;
   begin
      r1:=get_root(clientid);
      r2:=get_root(accountid);

      if r1 is null or r2 is null then
         new_root := coalesce(r1,r2,clientid);
         if r1 is null then add_elem(new_root,clientid ); root(clientid) :=new_root; end if;
         if r2 is null then add_elem(new_root,accountid); root(accountid):=new_root; end if;
      else
         new_root := least(r1,r2);
         root(clientid) :=new_root;
         root(accountid):=new_root;
         update_root(greatest(r1,r2),new_root);
      end if;
   end;

   function str_format(p int) return varchar2 is
   begin
      return utl_lms.format_message('(%d, %d) = group #%d'
                                   ,clients(p)
                                   ,accounts(p)
                                   ,get_root(clients(p))
                                   );
   end;
begin
   print('start');
   select clientid,accountid
     bulk collect into clients,accounts
   from test
   -- where rownum<=1000
   ;
   print('fetched');
   n:=clients.count;
   dbms_output.put_line('count='||n);
   for i in 1..n loop
      add_link(clients(i),accounts(i));
   end loop;
   print('processed');
   --
   /*
   for i in 1..n loop
      dbms_output.put_line(str_format(i));
   end loop;
   -- */
end;
[/sourcecode]

[collapse]

We can also try even more interesting, specialized algorithms for parallel processing, e.g. "Connected Components Algorithms for Mesh-Connected Parallel Computers".

Friday prank: select from join join join

Posted on February 21, 2014 by Sayan Malakshinov Posted in curious, oracle

Valid funny queries 🙂

select the join from join join join using(the,some) 
/
select some join from left join join right using(some,the) 
/
select 1 join from join join join join join using(the) on 1=1
/
select the some from join 
where the=some( the(select some from join) 
               ,the(select the  from join)
               ) 
/ 

tables

[sourcecode language="sql"]
create table join as select 1 some,1 the from dual;
create table left as select 1 some,1 the from dual;
create table right as select 1 some,1 the from dual;
[/sourcecode]

[collapse]
🙂

Little addition 🙂

update two tables set join=2;
select join from two tables;
select first, second, random from two tables join three tables on 1=1;

Another one 🙂

select first
     , case when the=some(some,some) then join end true
from two tables join three tables using(random);

….
and…

select random columns from two tables join four tables on the=some(some,some);
friday prank

Inlist iterator again

Posted on February 21, 2014 by Sayan Malakshinov Posted in curious, oracle, query optimizing

Several months ago I wrote about avoiding the inlist iterator, but this post is about how to force an inlist iterator in queries like this:

select *
from xt_test1 t1
    ,xt_test2 t2
where 
     t1.a = :a 
 and t2.id in (a,b)

i.e. when we need to get rows from a big table through an index using a list of values from another table.
A couple of workarounds
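
One way to get a similar effect (an index probe per value, rather than the literal INLIST ITERATOR operation; a sketch only, assuming a and b are columns of xt_test1 as the query above suggests, and not necessarily the approach behind the workarounds mentioned above) is to unpivot the two values into rows, so that each value drives its own probe on the big table:

select--+ use_nl(t2)
   t2.*
from  ( select t1.a id from xt_test1 t1 where t1.a = :a
        union all
        select t1.b    from xt_test1 t1 where t1.a = :a
      ) v
     ,xt_test2 t2
where t2.id = v.id;
-- depending on the data, a DISTINCT may be needed to avoid duplicate t2 rows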

inlist iterator

Oracle 12c: scalar subqueries

Posted on February 11, 2014 by Sayan Malakshinov Posted in 12c, CBO, oracle, undocumented

We already know that the CBO transformation engine in 12c can unnest scalar subqueries from the select list.
So it is not very surprising that the CBO is now able to add scalar subquery costs to the total query cost (even if "_optimizer_unnest_scalar_sq" = false):

Before 12.1

[sourcecode language="sql" highlight="15,17"]
SQL> explain plan for
  2  select
  3    (select count(*) from XT_TEST) cnt
  4  from dual;

Explained.

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 2843533371

---------------------------------------------------------------------------
| Id  | Operation             | Name      | Rows  | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |           |     1 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |           |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| IX_TEST_A | 90792 |    50   (0)| 00:00:01 |
|   3 |  FAST DUAL            |           |     1 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

10 rows selected.
[/sourcecode]

[collapse]

12.1

[sourcecode language="sql" highlight="19,21"]
SQL> alter session set "_optimizer_unnest_scalar_sq"=false;

Session altered.

SQL> explain plan for
  2  select
  3    (select count(*) from XT_TEST) cnt
  4  from dual;

Explained.

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 2843533371

---------------------------------------------------------------------------
| Id  | Operation             | Name      | Rows  | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |           |     1 |    52   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |           |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| IX_TEST_A | 90792 |    50   (0)| 00:00:01 |
|   3 |  FAST DUAL            |           |     1 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------

10 rows selected.
[/sourcecode]

[collapse]

But it is interesting that a correlated subquery can now reference a column from parent tables more than one level above:
Before 12.1

[sourcecode language="sql" highlight="10,11"]
SQL> with t1 as (select/*+ materialize */ 1 a from dual)
  2      ,t2 as (select/*+ materialize */ 2 b from dual)
  3      ,t3 as (select/*+ materialize */ 3 c from dual)
  4  select
  5    (select s from (select sum(b*c) s from t2,t3 where c>t1.a and c>b)) s
  6  from t1;
(select s from (select sum(b*c) s from t2,t3 where c>t1.a and c>b)) s
                                                      *
ERROR at line 5:
ORA-00904: "T1"."A": invalid identifier
[/sourcecode]

[collapse]

12.1

[sourcecode language="sql"]
SQL> with t1 as (select/*+ materialize */ 1 a from dual)
  2      ,t2 as (select/*+ materialize */ 2 b from dual)
  3      ,t3 as (select/*+ materialize */ 3 c from dual)
  4  select
  5    (select s from (select sum(b*c) s from t2,t3 where c>t1.a and c>b)) s
  6  from t1;

         S
----------
         6
[/sourcecode]

[collapse]

scalar subqueries

SYS_OP_MAP_NONNULL is in the documentation now

Posted on February 10, 2014 by Sayan Malakshinov Posted in 12c, documentation, oracle, undocumented

It's interesting that SYS_OP_MAP_NONNULL has appeared in the Oracle 12c documentation: Choosing Indexes for Materialized Views

Lazy tip: by the way, keeping the length limitations in mind, we can also use the documented DUMP function:

SQL> with
  2    t(a,b) as (
  3               select *
  4               from table(ku$_vcnt(null,'FF','A'))
  5                   ,table(ku$_vcnt(null,'FF','B'))
  6              )
  7  select
  8      a,b
  9     ,case when sys_op_map_nonnull(a) = sys_op_map_nonnull(b) then '=' else '!=' end comp1
 10     ,case when dump(a,1017)          = dump(b,1017)          then '=' else '!=' end comp2
 11     ,sys_op_map_nonnull(a) s_o_m_n_a
 12     ,sys_op_map_nonnull(b) s_o_m_n_b
 13     ,dump(a,  17) dump_a
 14     ,dump(b,  17) dump_b -- it is preferably sometimes to use 1017 - for charset showing
 15  from t;

A     B     COMP1 COMP2 S_O_M_N_A  S_O_M_N_B  DUMP_A                DUMP_B
----- ----- ----- ----- ---------- ---------- --------------------- ---------------------
            =     =     FF         FF         NULL                  NULL
      FF    !=    !=    FF         464600     NULL                  Typ=1 Len=2: F,F
      B     !=    !=    FF         4200       NULL                  Typ=1 Len=1: B
FF          !=    !=    464600     FF         Typ=1 Len=2: F,F      NULL
FF    FF    =     =     464600     464600     Typ=1 Len=2: F,F      Typ=1 Len=2: F,F
FF    B     !=    !=    464600     4200       Typ=1 Len=2: F,F      Typ=1 Len=1: B
A           !=    !=    4100       FF         Typ=1 Len=1: A        NULL
A     FF    !=    !=    4100       464600     Typ=1 Len=1: A        Typ=1 Len=2: F,F
A     B     !=    !=    4100       4200       Typ=1 Len=1: A        Typ=1 Len=1: B

9 rows selected.
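
Outside of materialized views, a typical practical use of SYS_OP_MAP_NONNULL is a NULL-aware equality comparison, where NULL should be treated as equal to NULL (a sketch; t1, t2 and col are hypothetical names used only for illustration):

-- rows match when both values are equal or both are NULL
select count(*)
from   t1
join   t2 on sys_op_map_nonnull(t1.col) = sys_op_map_nonnull(t2.col);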