Oracle SQL
  • LICENSE

Monthly Archives: February 2014

Single SQL vs SQL+PL/SQL

Posted on February 28, 2014 by Sayan Malakshinov Posted in oracle, PL/SQL, PL/SQL optimization, query optimizing 3,203 Page views 1 Comment

Everyone knows Tom Kyte’s mantra:

You should do it in a single SQL statement if at all possible.

But we all know that “Every rule has an exception”
There are many different cases when pl/sql with sql can be more efficient than only sql, and i dont want to catalog them. I just want to show a couple examples of such exceptions:

1. Running totals by several dimensions

Simple example from forum:

select dt,
       dim1,
       dim2,
       val,
       sum(val) over(partition by dim1 order by dt) dim1_cumulative_sum,
       sum(val) over(partition by dim2 order by dt) dim2_cumulative_sum,
       sum(val) over(partition by dim1, dim2 order by dt) dim1_dim2_cumulative_sum
  from mg_t
 order by dt;

This query will be very hard for big data sets, so we can do it efficiently with pl/sql:

create or replace function xf_to_drop return xt2_to_drop pipelined
is
   type tt  is table of number index by pls_integer;
   type tt2 is table of tt index by pls_integer;
   dim1_c tt;
   dim2_c tt;
   dim12_c tt2;
begin
   for r in (
            select dt,
                   dim1,
                   dim2,
                   val
              from mg_t
             order by dt
   )
   loop
      dim1_c(r.dim1):=case when dim1_c.exists(r.dim1) then dim1_c(r.dim1) else 0 end + r.val;
      dim2_c(r.dim1):=case when dim2_c.exists(r.dim1) then dim2_c(r.dim1) else 0 end + r.val;
      dim12_c(r.dim1)(r.dim2):=case
                                  when dim12_c.exists(r.dim1)
                                   and dim12_c(r.dim1).exists(r.dim2)
                                  then dim12_c(r.dim1)(r.dim2)
                                  else 0
                               end + r.val;
      pipe row (xt1_to_drop( r.dt
                            ,r.dim1
                            ,r.dim2
                            ,r.val
                            ,dim1_c(r.dim1)
                            ,dim1_c(r.dim1)
                            ,dim12_c(r.dim1)(r.dim2)
                           ));
   end loop;
end;
/
Full example

[sourcecode language=”sql”]
create table mg_t as
select trunc(sysdate) + level/1440 dt,
trunc(3 * dbms_random.value()) dim1,
trunc(3 * dbms_random.value()) dim2,
trunc(100 * dbms_random.value()) val
from dual
connect by level <= 3e6;

create type xt1_to_drop is object(
dt date
,dim1 number
,dim2 number
,val number
,dim1_cumulative_sum number
,dim2_cumulative_sum number
,dim1_dim2_cumulative_sum number
);
create type xt2_to_drop as table of xt1_to_drop;

create or replace function xf_to_drop return xt2_to_drop pipelined
is
type tt is table of number index by pls_integer;
type tt2 is table of tt index by pls_integer;
dim1_c tt;
dim2_c tt;
dim12_c tt2;
begin
for r in (
select dt,
dim1,
dim2,
val
from mg_t
order by dt
)
loop
dim1_c(r.dim1):=case when dim1_c.exists(r.dim1) then dim1_c(r.dim1) else 0 end + r.val;
dim2_c(r.dim1):=case when dim2_c.exists(r.dim1) then dim2_c(r.dim1) else 0 end + r.val;
dim12_c(r.dim1)(r.dim2):=case
when dim12_c.exists(r.dim1)
and dim12_c(r.dim1).exists(r.dim2)
then dim12_c(r.dim1)(r.dim2)
else 0
end + r.val;
pipe row (xt1_to_drop( r.dt,r.dim1,r.dim2,r.val,dim1_c(r.dim1),dim1_c(r.dim1),dim12_c(r.dim1)(r.dim2)));
end loop;
end;
/
exec for r in (select * from table(xf_to_drop)) loop null; end loop;
[/sourcecode]

[collapse]

2. Finding connected components

Assume that we have big table with many-to-many relationship:

create table test (clientid NUMBER(10), accountid NUMBER(10));

How we can find all connected groups?

This example also taken from our russian forum and there was very good and simple sql-only solution, but it’s not efficient on big data sets:

select min(group_member_id) as group_max_id, accountid, clientid
  from  (select clientid as group_member_id
         , connect_by_root accountid as accountid
         , connect_by_root clientid  as clientid
      from test
      connect by nocycle decode(accountid, prior accountid, 1, 0)
                       + decode(clientid,  prior clientid,  1, 0)
                       = 1
  ) a
  group by accountid, clientid
  order by group_max_id, accountid
/

This pure SQL solution is for the cases when ClientId and AccountId are different entities. If they are the same entities in your case, you need to use UNION ALL:
select min(group_member_id) as group_max_id, accountid, clientid
from (select clientid as group_member_id
, connect_by_root accountid as accountid
, connect_by_root clientid as clientid
from test
connect by nocycle decode(accountid, prior accountid, 1, 0)
+ decode(clientid, prior clientid, 1, 0)
= 1
) a
group by accountid, clientid
order by group_max_id, accountid
/

select min(group_member_id) as group_max_id, accountid, clientid
  from  (select clientid as group_member_id
         , connect_by_root accountid as accountid
         , connect_by_root clientid  as clientid
      from (select accountid, clientid from test union all select clientid,accountid from test)
      connect by nocycle decode(accountid, prior accountid, 1, 0)
                       + decode(clientid,  prior clientid,  1, 0)
                       = 1
  ) a
  group by accountid, clientid
  order by group_max_id, accountid
/

We can try to remember algorithms courses and adopt one of the several algorithms for connected components:

Based on weighted quick-union algorithm

[sourcecode language=”sql”]
declare
type int_array is table of pls_integer index by pls_integer;
type arr_elems is table of sys.ku$_objnumset index by pls_integer;
root int_array;
root_elems arr_elems;

n int;
clients int_array;
accounts int_array;

l integer:=dbms_utility.get_time();

procedure print(v in varchar2) is
begin
dbms_output.put_line(to_char((dbms_utility.get_time-l)/100,’0999.99′)||’ ‘||v);
l:=dbms_utility.get_time();
end;

function get_root(n int) return pls_integer is
begin
if root.exists(n) then
return root(n);
else
return null;
end if;
end;

procedure update_root(old_root pls_integer,new_root pls_integer) is
i pls_integer;
elem pls_integer;
cnt_old pls_integer;
cnt_new pls_integer;
begin
if old_root!=new_root then
–root_elems(new_root):=root_elems(new_root) multiset union all root_elems(old_root);
cnt_old:=root_elems(old_root).count;
cnt_new:=root_elems(new_root).count;
root_elems(new_root).extend(cnt_old);
for i in 1..cnt_old
loop
elem := root_elems(old_root)(i);
root(elem):=new_root;
root_elems(new_root)(cnt_new+i):=elem;
end loop;
root_elems(old_root).delete;
end if;
end;

procedure add_elem(p_root pls_integer, p_elem pls_integer) is
begin
if not root_elems.exists(p_root) then
root_elems(p_root):=sys.ku$_objnumset(p_elem);
else
root_elems(p_root).extend();
root_elems(p_root)(root_elems(p_root).count):=p_elem;
end if;
end;

procedure add_link(clientid pls_integer,accountid pls_integer) is
r1 pls_integer;
r2 pls_integer;
new_root pls_integer;
begin
r1:=get_root(clientid);
r2:=get_root(accountid);

if r1 is null or r2 is null then
new_root := coalesce(r1,r2,clientid);
if r1 is null then add_elem(new_root,clientid ); root(clientid) :=new_root; end if;
if r2 is null then add_elem(new_root,accountid); root(accountid):=new_root; end if;
else
new_root := least(r1,r2);
root(clientid) :=new_root;
root(accountid):=new_root;
update_root(greatest(r1,r2),new_root);
end if;
end;

function str_format(p int) return varchar2 is
begin
return utl_lms.format_message(‘(%d, %d) = group #%d’
,clients(p)
,accounts(p)
,get_root(clients(p))
);
end;
begin
print(‘start’);
select clientid,accountid
bulk collect into clients,accounts
from test
— where rownum<=1000
;
print(‘fetched’);
n:=clients.count;
dbms_output.put_line(‘count=’||n);
for i in 1..n loop
add_link(clients(i),accounts(i));
end loop;
print(‘processed’);
—
/*
for i in 1..n loop
dbms_output.put_line(str_format(i));
end loop;
— */
end;
[/sourcecode]

[collapse]

We can also try even more interesting special algorithms for parallel processing: CONNECTED COMPONENTS ALGORITHMS
FOR MESH-CONNECTED PARALLEL COMPUTERS

Friday prank: select from join join join

Posted on February 21, 2014 by Sayan Malakshinov Posted in curious, oracle 2,846 Page views 1 Comment

Valid funny queries πŸ™‚

select the join from join join join using(the,some) 
/
select some join from left join join right using(some,the) 
/
select 1 join from join join join join join using(the) on 1=1
/
select the some from join 
where the=some( the(select some from join) 
               ,the(select the  from join)
               ) 
/ 

tables

[sourcecode language=”sql”]
create table join as select 1 some,1 the from dual;
create table left as select 1 some,1 the from dual;
create table right as select 1 some,1 the from dual;
[/sourcecode]

[collapse]
πŸ™‚

Little addition πŸ™‚

update two tables set join=2;
select join from two tables;
select first, second, random from two tables join three tables on 1=1;

Another one πŸ™‚

select first
     , case when the=some(some,some) then join end true
from two tables join three tables using(random);

….
and…

select random columns from two tables join four tables on the=some(some,some);
friday prank

Inlist iterator again

Posted on February 21, 2014 by Sayan Malakshinov Posted in curious, oracle, query optimizing 3,138 Page views Leave a comment

Several months ago I wrote about avoiding inlist iterator, but this post about how to force inlist iterator in queries like that:

select *
from xt_test1 t1
    ,xt_test2 t2
where 
     t1.a = :a 
 and t2.id in (a,b)

i.e. when we need to get rows from big table using index by list of values from another table.
A couple of workarounds

inlist iterator

Oracle 12c: scalar subqueries

Posted on February 11, 2014 by Sayan Malakshinov Posted in 12c, CBO, oracle, undocumented 2,829 Page views Leave a comment

We already know that the CBO transformation engine in 12c can unnest scalar subqueries from select-list.
So it’s not very surprising, that CBO is now able to add scalar subqueries costs to total query cost (even if “_optimizer_unnest_scalar_sq” = false):

Before 12.1

[sourcecode language=”sql” highlight=”15,17″]
SQL> explain plan for
2 select
3 (select count(*) from XT_TEST) cnt
4 from dual;

Explained.

PLAN_TABLE_OUTPUT
—————————————————————————
Plan hash value: 2843533371

—————————————————————————
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
—————————————————————————
| 0 | SELECT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| IX_TEST_A | 90792 | 50 (0)| 00:00:01 |
| 3 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
—————————————————————————

10 rows selected.
[/sourcecode]

[collapse]

12.1

[sourcecode language=”sql” highlight=”19,21″]
SQL> alter session set "_optimizer_unnest_scalar_sq"=false;

Session altered.

SQL> explain plan for
2 select
3 (select count(*) from XT_TEST) cnt
4 from dual;

Explained.

PLAN_TABLE_OUTPUT
—————————————————————————
Plan hash value: 2843533371

—————————————————————————
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
—————————————————————————
| 0 | SELECT STATEMENT | | 1 | 52 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| IX_TEST_A | 90792 | 50 (0)| 00:00:01 |
| 3 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |
—————————————————————————

10 rows selected.
[/sourcecode]

[collapse]

But it’s interesting that correlated subquery can reference now to a column from parent tables more
than one level above:
Before 12.1

[sourcecode language=”sql” highlight=”10,11″]
SQL> with t1 as (select/*+ materialize */ 1 a from dual)
2 ,t2 as (select/*+ materialize */ 2 b from dual)
3 ,t3 as (select/*+ materialize */ 3 c from dual)
4 select
5 (select s from (select sum(b*c) s from t2,t3 where c>t1.a and c>b)) s
6 from t1;
(select s from (select sum(b*c) s from t2,t3 where c>t1.a and c>b)) s
*
ERROR at line 5:
ORA-00904: "T1"."A": invalid identifier
[/sourcecode]

[collapse]

12.1

[sourcecode language=”sql”]
SQL> with t1 as (select/*+ materialize */ 1 a from dual)
2 ,t2 as (select/*+ materialize */ 2 b from dual)
3 ,t3 as (select/*+ materialize */ 3 c from dual)
4 select
5 (select s from (select sum(b*c) s from t2,t3 where c>t1.a and c>b)) s
6 from t1;

S
———-
6
[/sourcecode]

[collapse]

scalar subqueries

SYS_OP_MAP_NONNULL is in the documentation now

Posted on February 10, 2014 by Sayan Malakshinov Posted in 12c, documentation, oracle, undocumented 4,417 Page views Leave a comment

Interesting, that SYS_OP_MAP_NONNULL appeared in the Oracle 12c documentation: Choosing Indexes for Materialized Views

Lazy tip: By the way, with length limitations, we can also use documented dump function:

SQL> with
  2    t(a,b) as (
  3               select *
  4               from table(ku$_vcnt(null,'FF','A'))
  5                   ,table(ku$_vcnt(null,'FF','B'))
  6              )
  7  select
  8      a,b
  9     ,case when sys_op_map_nonnull(a) = sys_op_map_nonnull(b) then '=' else '!=' end comp1
 10     ,case when dump(a,1017)          = dump(b,1017)          then '=' else '!=' end comp2
 11     ,sys_op_map_nonnull(a) s_o_m_n_a
 12     ,sys_op_map_nonnull(b) s_o_m_n_b
 13     ,dump(a,  17) dump_a
 14     ,dump(b,  17) dump_b -- it is preferably sometimes to use 1017 - for charset showing
 15  from t;

A     B     COMP1 COMP2 S_O_M_N_A  S_O_M_N_B  DUMP_A                DUMP_B
----- ----- ----- ----- ---------- ---------- --------------------- ---------------------
            =     =     FF         FF         NULL                  NULL
      FF    !=    !=    FF         464600     NULL                  Typ=1 Len=2: F,F
      B     !=    !=    FF         4200       NULL                  Typ=1 Len=1: B
FF          !=    !=    464600     FF         Typ=1 Len=2: F,F      NULL
FF    FF    =     =     464600     464600     Typ=1 Len=2: F,F      Typ=1 Len=2: F,F
FF    B     !=    !=    464600     4200       Typ=1 Len=2: F,F      Typ=1 Len=1: B
A           !=    !=    4100       FF         Typ=1 Len=1: A        NULL
A     FF    !=    !=    4100       464600     Typ=1 Len=1: A        Typ=1 Len=2: F,F
A     B     !=    !=    4100       4200       Typ=1 Len=1: A        Typ=1 Len=1: B

9 rows selected.
photo Sayan Malakshinov

Oracle ACE Pro Oracle ACE Pro Alumni

DEVVYOracle Database Developer Choice Award winner

Oracle performance tuning expert

UK / Cambridge

LinkedIn   Twitter
sayan@orasql.org

Recent Posts

  • Oracle Telegram Bot
  • Partition Pruning and Global Indexes
  • Interval Search: Part 4. Dynamic Range Segmentation – interval quantization
  • Interval Search Series: Simplified, Advanced, and Custom Solutions
  • Interval Search: Part 3. Dynamic Range Segmentation – Custom Domain Index

Popular posts

Recent Comments

  • Oracle SQL | Interval Search: Part 4. Dynamic Range Segmentation – interval quantization on Interval Search: Part 3. Dynamic Range Segmentation – Custom Domain Index
  • Oracle SQL | Interval Search: Part 4. Dynamic Range Segmentation – interval quantization on Interval Search: Part 2. Dynamic Range Segmentation – Simplified
  • Oracle SQL | Interval Search: Part 4. Dynamic Range Segmentation – interval quantization on Interval Search: Optimizing Date Range Queries – Part 1
  • Oracle SQL | Interval Search Series: Simplified, Advanced, and Custom Solutions on Interval Search: Part 2. Dynamic Range Segmentation – Simplified
  • Oracle SQL | Interval Search: Part 2. Dynamic Range Segmentation – Simplified on Interval Search: Part 3. Dynamic Range Segmentation – Custom Domain Index

Blogroll

  • Alex Fatkulin
  • Alexander Anokhin
  • Andrey Nikolaev
  • Charles Hooper
  • Christian Antognini
  • Coskan Gundogar
  • David Fitzjarrell
  • Igor Usoltsev
  • Jonathan Lewis
  • Karl Arao
  • Mark Bobak
  • Martin Bach
  • Martin Berger
  • Neil Chandler
  • Randolf Geist
  • Richard Foote
  • Riyaj Shamsudeen
  • Tanel Poder
  • Timur Akhmadeev
  • Valentin Nikotin
Β©Sayan Malakshinov. Oracle SQL