Oracle SQL | Yearly Archives: 2017

External Tables Part 1 – Project Columns All vs Referenced

Posted on October 25, 2017 by Roger MacNicol Posted in External tables, oracle, SmartScan 2,066 Page views Leave a comment

I normally blog about table scans on Oracle native data but Oracle also supports a wide variety of features for scanning external tables and I need to cover these too. One of the ways I learn new things is being sent a bug report and saying to myself “Oh! I didn’t know you could do that”. So today I’m going to start with the grammar:

Alter Table &lt;xt> Project Columns [ All | Referenced ]

Continue reading→

PL/SQL functions: Iterate and keys for associative arrays

Posted on October 2, 2017 by Sayan Malakshinov Posted in oracle, PL/SQL, PL/SQL optimization 2,611 Page views Leave a comment

Unfortunately associative arrays still require more “coding”:
we still can’t use “indices of” or “values of” in simple FOR(though they are available for FORALL for a long time), don’t have convinient iterators and even function to get all keys…
That’s why I want to show my templates for such things like iterator and keys function. You can adopt these functions and create them on schema level.

declare 
   type numbers  is table of number;
   type anumbers is table of number index by pls_integer;
   a anumbers;
   i pls_integer;

   function iterate( idx in out nocopy pls_integer, arr in out nocopy anumbers) 
      return boolean
   as pragma inline;
   begin
      if idx is null 
         then idx:=arr.first; 
         else idx:=arr.next(idx);
      end if;
      return idx is not null;
   end;

  function keys(a in out nocopy anumbers) return numbers as
     res numbers:=numbers();
     idx number;
     pragma inline;
  begin
     while iterate(idx,a) loop
        res.extend;
        res(res.count):=idx;
     end loop;
     return res;
  end;

begin
   a(1):=10;
   a(3):=30;
   a(5):=50;
   a(8):=80;
   -- iterate:
   while iterate(i,a) loop
      dbms_output.put_line(a(i));
   end loop;

   -- keys:
   for i in 1..keys(a).count loop
      dbms_output.put_line(a(keys(a)(i)));
   end loop;

end;

How to group connected elements (or pairs)

Posted on September 29, 2017 by Sayan Malakshinov Posted in oracle, PL/SQL, PL/SQL optimization, query optimizing, SQL 1,837 Page views 7 Comments

I see quite often when developers ask questions about connected components:

Table “MESSAGES” contains fields “SENDER” and “RECIPIENT”, which store clients id.
How to quickly get all groups of clients who are connected even through other clients if the table has X million rows?
So for this table, there should be 4 groups:

(1, 2, 4, 8, 16)
(3, 6, 12)
(5, 10, 20)
(7, 14)
(9, 18)

SENDER	RECIPIENT
1	2
2	4
3	6
4	8
5	10
6	12
7	14
8	16
9	18
10	20

Of course, we can solve this problem using SQL only (model, recursive subquery factoring or connect by with nocycle), but such solutions will be too slow for huge tables.

Example of SQL solution

[sourcecode language=”sql”]
with
t(sender,recipient) as (select level,level*2 from dual connect by level<=10)
, v1 as (select rownum id,t.* from t)
, v2 as (select id, account
from v1
unpivot (
account for x in (sender,recipient)
))
, v3 as (
select
id
,account
,dense_rank()over(order by account) account_n
,count(*)over() cnt
from v2)
, v4 as (
select distinct grp,account
from v3
model
dimension by (id,account_n)
measures(id grp,account,cnt)
rules
iterate(1e6)until(iteration_number>cnt[1,1])(
grp[any,any] = min(grp)[any,cv()]
,grp[any,any] = min(grp)[cv(),any]
)
)
select
listagg(account,’,’)within group(order by account) s
from v4
group by grp
[/sourcecode]

[collapse]

In such situations it’s much better to adopt standard algorithms like Quick-find or Weighted quick-union for PL/SQL.
The first time I wrote such solution about 5 years ago and I even posted here one of the latest solutions, but all of them were not universal, so I’ve created the package today with a couple of functions for most common problems: XT_CONNECTED_COMPONENTS

It contains 2 functions based on Weighted quick-find quick-union algorithm:

function get_strings(cur in sys_refcursor, delim varchar2:=’,’) return strings_array pipelined;
It accepts a cursor and returns found connected components as table of varchar2(v_size). You can change v_size in the package definition.
Input cursor should contain one Varchar2 column with linked strings, for example: ‘a,b,c’.
You can also specify list delimiter, by default it is comma.
Examples:

select * from table(xt_connected_components.get_strings( cursor(select ELEM1||','||ELEM2 from TEST));
select * 
from
 table(
   xt_connected_components.get_strings( 
     cursor(select 'a,b,c' from dual union all
            select 'd,e,f' from dual union all
            select 'e,c'   from dual union all
            select 'z'     from dual union all
            select 'X,Y'   from dual union all
            select 'Y,Z'   from dual)));
COLUMN_VALUE
-----------------------------------------
STRINGS('X', 'Y', 'Z')
STRINGS('a', 'b', 'c', 'd', 'e', 'f')
STRINGS('z')

function get_numbers(cur in sys_refcursor) return numbers_array pipelined;
This function also returns connected components, but for numbers.
Input cursor should contain two columns with linked numbers.
Examples:

select * 
from table(
        xt_connected_components.get_numbers( 
          cursor(
            select sender_id, recipient_id from messages
        )));
select * 
from
  table(
    xt_connected_components.get_numbers( 
       cursor(
          select level   account1
               , level*2 account2 
          from dual 
          connect by level<=10
    )));
SQL> select *
  2  from
  3    table(
  4      xt_connected_components.get_numbers(
  5         cursor(
  6            select level   account1
  7                 , level*2 account2
  8            from dual
  9            connect by level<=10
 10*     )))
SQL> /

COLUMN_VALUE
------------------------
NUMBERS(1, 2, 4, 8, 16)
NUMBERS(3, 6, 12)
NUMBERS(5, 10, 20)
NUMBERS(7, 14)
NUMBERS(9, 18)

How to install:
Download all files from Github and execute “@install” in SQL*Plus or execute them in another tool in the following order:
xt_connected_components_types.sql
xt_connected_components_pkg.sql
xt_connected_components_bdy.sql

Download URL: https://github.com/xtender/xt_scripts/tree/master/extra/xt_connected_components

Ampersand instead of colon for bind variables

Posted on September 27, 2017 by Sayan Malakshinov Posted in curious, oracle, SQL, SQL*Plus, undocumented 2,703 Page views 1 Comment

I’ve troubleshooted one query today and I was very surprised that bind variables in this query were specified with &ampersand instead of :colon! I have never seen this before and I couldn’t find anything about this in documentation…
Unfortunately SQL*Plus doesn’t support ampersand yet, even if you disable define (“set define off”),
so I’ve tested such behaviour with this code:

set def off serverout on
exec declare s varchar2(1); begin execute immediate 'select 1 from dual where dummy=&var' into s using 'X'; dbms_output.put_line(s); end;

And it really works! //at least on 11.2.0.2 and 12.2.0.1

SQL> set def off serverout on
SQL> exec declare s varchar2(1); begin execute immediate 'select 1 from dual where dummy=&var' into s using 'X'; dbms_output.put_line(s); end;
1

PL/SQL procedure successfully completed.

SQL> select substr(sql_text,1,40) stext,sql_id,executions,rows_processed from v$sqlarea a where sql_text like '%dual%&var';

STEXT                                 SQL_ID        EXECUTIONS ROWS_PROCESSED
------------------------------------- ------------- ---------- --------------
select 1 from dual where dummy=&var   ckkw4u3atxz02          3              3

SQL> select * from table(dbms_xplan.display_cursor('ckkw4u3atxz02'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
SQL_ID  ckkw4u3atxz02, child number 0
-------------------------------------
select 1 from dual where dummy=&var

Plan hash value: 272002086

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("DUMMY"=:VAR)


18 rows selected.

Update: Btw, it works for SQL only, not for PL/SQL:

SQL> var v varchar2(1);
SQL> begin &v = 'Z'; end;
  2  /
begin &v = 'Z'; end;
      *
ERROR at line 1:
ORA-06550: line 1, column 7:
PLS-00103: Encountered the symbol "&" when expecting one of the following:

SQL> exec &v := 'X';
BEGIN &v := 'X'; END;

      *
ERROR at line 1:
ORA-06550: line 1, column 7:
PLS-00103: Encountered the symbol "&" when expecting one of the following:
The symbol "&" was ignored.

SQL> exec :v := 'X';

PL/SQL procedure successfully completed.

SQL> select * from dual where dummy=&v
  2  ;

D
-
X

And we can can use mixed placeholders:

SQL> select * from dual where dummy=&v and &v=:v;

D
-
X

Simple regexp to check that string contains word1 and word2 and doesn’t contain word3

Posted on September 14, 2017 by Sayan Malakshinov Posted in oracle, SQL 1,824 Page views 1 Comment

with tst as (
  select 'qwe word1 asd ...............' s from dual
  union all
  select 'qwe word1 asd word2 .........' s from dual
  union all
  select 'qwe word1 asd word2 zxc word3' s from dual
  union all
  select 'qwe word2 asd word1 zxc word4' s from dual
)
select
  s
  ,regexp_replace(s, '(word1)|(word2)|(word3)|(.)','`\3') subst
  ,case when regexp_like(regexp_replace(s, '(word1)|(word2)|(word3)|(.)','`\3') , '^`+$') then 'matched' end tst2
from tst
where 1=1
--and regexp_like(regexp_replace(s, '(word1)|(word2)|(word3)|(.)','`\3') , '^`+$')

Book advice: ORACLE SQL & PL/SQL Golden Diary by Asim Chowdhury

Posted on August 10, 2017 by Sayan Malakshinov Posted in oracle 1,692 Page views 1 Comment

I’ve reviewed this book recently, and I highly recommend it as it has almost all that needed to become strong Oracle developer. You can check at least the table of contents:

ORACLE SQL & PL/SQL Golden Diary: by Asim Chowdhury
New Book Demystifies Complex Cross-Version Oracle Problem Solving

Compiled by veteran computer scientist and data modeler, Asim Chowdhury, ‘ORACLE SQL & PL/SQL Golden Diary: Refactoring, Interoperability of Versions & Integration of related concepts for High Performance’ is the first book on the market that comprehensively allows data architects to unravel any concepts in SQL and PL/sql till oracle 12c. It’s poised to remove much confusion from the many versions of Oracle SQL now on the market; a Godsend for the computer science industry.

The beginners guide to Oracle Table Scans

Posted on August 2, 2017 by Roger MacNicol Posted in adaptive serial direct path reads, cell_offload, inmemory, oracle, SmartScan, TurboScan 1,873 Page views Leave a comment

I was asked a question yesterday that reminded me there are always people completely new to the topic who need an introduction – somewhere to start before the other articles make sense. So, here’s my brief write-up of everything you need to know about the basic of Oracle Table Scans.

Oracle has four main ways of scanning a table: the pre-9ir2 table scan, the 9ir2 TurboScan, the 11.1.0.1 Exadata SmartScan, and the 12.1.0.1 In-Memory Scan. Before we summarize each one, the other fundamental piece of information is the Oracle dictum that all blocks much be self-describing: a table scan routine should be able to inspect a block and understand what object it belongs, whether it needs an undo applying, and how the data is laid out without reference to any external structures or secondary storage.

The original table scan routine

Oracle uses a “dataflow” query engine which means a query plan is built from nodes like a sausage machine that have three basic operations: Open, Next, Close. ‘Open’ means you ask the next node in the chain to prepare to do some work including acquiring any resources it may need, ‘Next’ means you fetch one unit of work from your child e.g. a row, and ‘Close’ means to tell your child node to shut down and release any resources it may be holding. You build a query by connecting the right kinds of nodes together in the order you want: one node just sorts, another groups, another does hash joins. The end of the sausage machine is the node seen on query plans as “Table Access Full”

This node would ask the data layer to fetch a block from disk then get rows one at a time from the data layer. This is the work horse table scan: it can scan any kind of data and do SCN manipulations like row versions but it is not the fastest way to scan a table.

9ir2 TurboScan

In 9ir2 we introduced a much faster way of scanning tables called TurboScan. The data layer function which had been handing out rows one at a time was replaced by one that stays in a tight loop retrieving rows from disk and pushing them into a callback supplied by “Table Access Full”. An automation tool was used to generate several versions of this routine that optimized out common choices that has to be made: does the user need rowids to be projected? do they need predicates applying? is the data compressed or? is the data column-major or row-major? etc etc Every time a CPU reaches a branch in the code it tries to guess which side of the branch will be taken but if it guess wrong there can be a considerable stall during which no work gets done. By removing most of the branches, the code runs much much more quickly.

TurboScan is used for all queries which do not use RAW datatypes and which do not need special SCN processing.

Both pre-9ir2 scan and TurboScan can use the buffer cache to get blocks (typically small to medium tables) or use Direct Read to get blocks (typically medium to large tables).

See: When bloggers get it wrong – part 1

TurboScan can be disabled for triage purposes by setting:

SQL> alter session set events='12099 trace name context forever, level 1';

or specifically you can disable it only for HCC tables by setting:

SQL> alter session set "_arch_comp_dbg_scan"=1;

Exadata SmartScan

In 11.1.0.1 we introduced Exadata SmartScan intelligent storage. This is where a thin layer of database processing is embedded in the storage cells and the table scan routine offloads simple search criteria and a list of the columns it needs to storage and the storage cells pre-process the blocks to remove rows that fail the search criteria and remove columns which are not needed by the table scan. If all the rows are removed, the block doesn’t have to be sent back at all.

SmartScan can drastically reduce the amount of data returned on the Interconnect and put on the RDBMS memory bus and the space used in SGA by the returned data. An additional significant benefit is gained when the CPU fetches the reduced blocks into the CPU cache since only relevant information exists on the block there is not space wasted by unwanted columns interspersing the wanted columns meaning more relevant data can fit in memory and the CPU prefetch can do a better job of predicting which memory cache line to fetch next.

Only TurboScan Direct Read scans can use this offload capability. You can disable SmartScan for triage purposes by setting:

SQL> alter session set cell_offload_processing=FALSE;

SQL> select /*+ opt_param('cell_offload_processing','false') */  <col> from <tab> where <predicate>;

In-Memory Scans

In-Memory scans were introduced in 12.1.0.1 and brought a revolutionary increase in table scan speeds. With In-Memory scans the table or partition is loaded into a in-memory tablespace in SGA known as the inmemory-area. Data is stored in compressed columnar format typically up to 500,000 values in each columnar compression unit. This tablespace is kept transactionally consistent with the data on disk via means of an invalidation bitmap.

Just like with SmartScan, only TurboScan can use In-Memory scans with In-Memory objects. Instead of getting a block from disk, the specialized version of the scan routines fetches a column run from each column of interest, process the search criteria, then returns column runs with the failing rows removed to the “Table Access Full” node.

If any rows have been modified and committed by other users or the users own transaction has modified any rows the scan will see these rows set in the invalidation bitmap. These rows are removed from the columnar results and the additional rows required are fetched from the buffer cache before moving on to the next set of column runs. This works well because the most recently modified blocks are the ones most likely to still be in the buffer cache.

Intra-block row chaining optimization in 12.2

Posted on June 17, 2017 by Sayan Malakshinov Posted in 12c, oracle, trace, undocumented 2,137 Page views 4 Comments

I’ve wrote in previous post

4. Next row piece can be stored in the same block only with inserts. When you run update, oracle will place new row piece into another block.

But it’s not valid anymore 🙂 Since 12.2 Oracle optimizes updates too.
You can check it on 12.2 and previous version using example 4 from previous post:

Test 4

[sourcecode language=”sql”]drop table test purge;
set serverout on
alter session set tracefile_identifier=’test4′;

declare
cols varchar2(32000):=’c_1 char(3)’;

procedure print_and_exec(c varchar2) as
begin
dbms_output.put_line(c);
execute immediate c;
end;
begin
for i in 2..355 loop
cols:=cols||’,c_’||i||’ char(3)’;
end loop;
print_and_exec (‘create table test(‘||cols||’)’);
print_and_exec (‘insert into test(c_1) values(null)’);
commit;
for i in 256..355 loop
execute immediate ‘update test set c_’||i||’=’||i;
end loop;
commit;
execute immediate ‘alter system flush buffer_cache’;
for r in (select ‘alter system dump datafile ‘||file#||’ block ‘||block# cmd
from (
select distinct file#,block#
from v$bh
where class#=1
and objd in (select o.data_object_id from user_objects o where object_name=’TEST’)
order by 1,2
)
)
loop
execute immediate r.cmd;
end loop;
end;
/
disc;[/sourcecode]

[collapse]

On 12.2 you will see that it creates just 5 blocks 🙂

PS. My presentation about Intra-block row chaining from RuOUG seminar: Intra-block row chaining(RuOUG)

Working around heatbeat issues caused by tracing or by regexp

Posted on May 4, 2017 by Roger MacNicol Posted in cell_offload, oracle, SmartScan, trace 1,568 Page views Leave a comment

I had noted in my first post that using the highest level of tracing caused timeout issues with the offload server heartbeat monitor. Heartbeat issues can also occur with expensive (and badly formed) regexp expressions. By default the heartbeat monitor is set to 6 seconds which is the maximum permitted to process 1MB data in the offload server and mark the task completed and is far more time than is reasonably expected to take.

Operations such as expensive tracing to disk or badly formed regexp expressions that cause that time period to be exceeded lead to this in the alert log:

State dump signal delivered to CELLOFLSRV&lt;10180> by pid - 9860, uid - 3318
Thu Mar  5 12:26:31 2015 561 msec State dump completed for CELLOFLSRV&lt;10180>
Clean shutdown signal delivered to CELLOFLSRV&lt;10180> by pid - 9860, uid - 3318
CELLOFLSRV &lt;10180> is exiting with code 1

where the restart server bounces the offload server to clear the perceived hang. Increasing the timeout via:

CellCLI> alter cell events = "immediate cellsrv.cellsrv_setparam('_cell_oflsrv_heartbeat_timeout_sec','60')"

enables the tracing to proceed without causing the restart server.

My point in writing this entry was to provide a work-around when tracing is needed but also to address a couple of blog posts I’d seen that recommend leaving it set at 60 or 90 seconds. This is not a good idea. The heartbeat exists to catch genuine but rare issues and leaving this set to an increased value will hinder the offload server restarting quickly to resume work. This is one parameter that shoud be reset to the default when the work-around is no longer needed unless otherwise directed by support.

Roger MacNicol

Why you dont want to set _partition_large_extents FALSE

Posted on May 4, 2017 by Roger MacNicol Posted in oracle, SmartScan, undocumented 2,138 Page views Leave a comment

I’ve seen some blogs recommending that _partition_large_extents be set to FALSE for a variety of space conserving reasons without the authors thinking about the negative impact this is going to have on Smart Scan. Large Extents cause an INITIAL allocation of 8 MB and a NEXT allocation of 1 MB and they have been the default for table spaces on Exadata since 11.2.0.2. You can verify that large extents are in use by a given table or partition by:

Select segment_flags 
From sys_dba_segs 
where segment_name = <table_name> 
and owner = <schema_name>;

The segment flag bit for large extents is 0x40000000.

This pattern of allocation is design to work optimally with Smart Scan because Smart Scan intrinsically works in 1 MB chunks. Reads of ASM allocation units are split into maximum 1 MB chunks to be passed to the filter processing library to have their blocks sliced and diced to create the synthetic blocks that contain only the rows and columns of interest to return to the table scan driver. When less than 1 MB gets allocated at a time to a segment and then the next contiguous blocks gets allocated to a different segment, each separate run of blocks will be read by a different MBR. Each run will be passed separately to Smart Scan and we get sub-optimal chunks to work on increasing both the overhead of processing and the number of round trips needed to process the table scan. The design of Smart Scan is predicated on scooping up contiguous runs of data from disk for efficient processing.

This matters particularly for HCC data and for chained rows.

Continue reading→