Oracle SQL | Monthly Archives: August 2019

DMLs and the Columnar Cache on ADW

Posted on August 23, 2019 by Roger MacNicol Posted in 12.2, inmemory, oracle 1,974 Page views Leave a comment

One of the main performance technologies underlying ADW is the In-Memory format column cache and a question that has come up several times is: how does the columnar cache handle DMLs and cache invalidations. This blog post attempts to answer that question.

The Two Columnar Cache Formats

The original columnar cache was an idempotent rearrangement of Hybrid Columnar Compressed blocks into pure columnar form still in the original HCC encoding. This is known internally as “CC1” and was not applicable to row format blocks. Because this is an idempotent transformation we are able to reconstitute the original raw blocks from a CC1 cache if needed.

We then introduced the In-Memory format columnar cache where, if you had an In-Memory licence, we would run the 1 MB chunks processed by Smart Scan through the In-Memory loader and write new clean columns encoded in In-Memory formats which meant that we could then use the new SIMD based predicate evaluation and other performance improvements developed for Database In-Memory. If you do not have an In-Memory licence, the original CC1 column cache is used. Another advantage of the CC2 format is that we can load row format blocks as well as HCC format blocks into pure columnar In-Memory format.

Continue reading→

SQL: Fast ways to count unique characters in the string

Posted on August 19, 2019 by Sayan Malakshinov Posted in oracle, PL/SQL optimization, query optimizing, SQL 2,231 Page views Leave a comment

Test data:

create table t_str as
select round(dbms_random.value(1e10,9e10)) str from dual connect by level&lt;=1e5
/

PL/SQL variant:

with
  function ff(s varchar2) return varchar2 
  as
      type avarchars is table of varchar2(100) index by varchar2(1);
      st  avarchars;
      idx varchar2(1);
      res varchar2(10);
      
      function iterate( idx in out nocopy varchar2, arr in out nocopy avarchars) 
         return boolean
      as --pragma inline;
      begin
         if idx is null
            then idx:=arr.first; 
            else idx:=arr.next(idx);
         end if;
         return idx is not null;
      end;  
   begin
     for i in 1..length(s) loop
        st(substr(s,i,1)):=1;
     end loop;
     while iterate(idx,st) loop
        res:=res||idx;
     end loop;
     return res;
   end;

select min(ff(str)) res
from t_str
/

SQL-only variant:

select min(fstr)
from t_str t
     cross apply (
     select listagg(c) within group (order by 1) fstr
     from (select
            distinct substr(t.str, level, 1) c
           from dual
           connect by level <= length(t.str)
          )
     )
/

Timings:

SQL> create table t_str as
  2  select round(dbms_random.value(1e10,9e10)) str from dual connect by level<=1e5
  3  /

Table created.

Elapsed: 00:00:00.55
SQL> with
  2    function ff(s varchar2) return varchar2
  3    as
  4        type avarchars is table of varchar2(100) index by varchar2(1);
  5        st  avarchars;
  6        idx varchar2(1);
  7        res varchar2(10);
  8
  9        function iterate( idx in out nocopy varchar2, arr in out nocopy avarchars)
 10           return boolean
 11        as --pragma inline;
 12        begin
 13           if idx is null
 14              then idx:=arr.first;
 15              else idx:=arr.next(idx);
 16           end if;
 17           return idx is not null;
 18        end;
 19     begin
 20       for i in 1..length(s) loop
 21          st(substr(s,i,1)):=1;
 22       end loop;
 23       while iterate(idx,st) loop
 24          res:=res||idx;
 25       end loop;
 26       return res;
 27     end;
 28
 29  select min(ff(str)) res
 30  from t_str
 31  /

RES
--------------------------------------------------------------
0123

Elapsed: 00:00:00.48
SQL> select min(fstr) res2
  2  from t_str t
  3       cross apply (
  4       select listagg(c) within group (order by 1) fstr
  5       from (select
  6              distinct substr(t.str, level, 1) c
  7             from dual
  8             connect by level <= length(t.str)
  9            )
 10       )
 11  /

RES2
--------------------------------------------------------------
0123

Elapsed: 00:00:01.01

And much easier variant if you need your strings contain digits only:

select min(translate('0123456789', translate('z0123456789','z'||str,'z'), chr(0)))
from t_str

Thoughts on Big Data SQL SmartScan

Posted on August 18, 2019 by Roger MacNicol Posted in cell_offload, External tables, oracle, SmartScan 1,654 Page views Leave a comment

Just a few thoughts I’ve been meaning to blog for a while:

1. Number of columns that can be offloaded

Exadata

Very early on in the Exadata Smart Scan implementation a restriction was added to prevent offloading more than 255 columns. This was done because of performance issues observed by customers. Smart Scan works on 1 MB chunks at a time and needs all the row pieces up to the highest #segcol referenced to be present in the 1 MB chunk that is being processed. When rows with more than 255 columns are first inserted all the row pieces are contiguous and hence likely to be present in the 1 MB chunk.

However, if the DBA has not provided sufficient freespace (PCTFREE) in the block when it is first loaded subsequent DMLs may end up splitting row pieces and moving pieces to blocks in the freelist that have more space available. PCTFREE specifies how much empty space to leave for future updates when a block is first populated.

Continue reading→

Oracle scheduler: how to find jobs with wrong nls_env

Posted on August 2, 2019 by Sayan Malakshinov Posted in oracle 1,657 Page views Leave a comment

select nls_env
      ,count(*) cnt
      ,xmlcast(xmlagg(xmlelement(job_name, job_name||',')).extract('//text()') as clob) jobs
from dba_scheduler_jobs 
group by nls_env;