Oracle SQL | Tag Archives: Roger MacNicol

Improvements to HCC with wide tables in 12.2

Posted on May 4, 2017 by Roger MacNicol Posted in oracle, SmartScan 1,726 Page views Leave a comment

HCC Compression Unit Sizing

Since the beginning Oracle has provided four compression levels to offer a trade-off between the compression ratio and various other factors including table scans and the performance of single-row retrieval. I can not emphasize enough that the various trade offs mean that YMMV with the different levels and you should always test what works best with your application and hardware when choosing the level. Historically people have rarely used Query Low since the fast compression with reduced compression ratio means that the extra disk I/O is slower than the cost of decompression with Query High. The one time that Query Low makes sense on spinning disks is if you still have a significant number of row retrieval operations (including from index access joins).

NMVe Flash

X5 introduced NVMe technology which means that the extra I/O from Query Low is faster than ZLIB decompression which makes Query Low beneficial. So we needed to reassess the sizing of Compression Units. from 11.2.0.1 to 12.1.2.4 the sizing guidelines are as follows:

Continue reading→

Examining the new Columnar Cache with v$cell_state

Posted on May 4, 2017 by Roger MacNicol Posted in oracle, SmartScan 1,912 Page views Leave a comment

12.1.0.2 introduced the new Columnar Flash Cache where 1MB of blocks that are all in HCC format are rewritten so as to make each column from each CU contiguous. This works by first writing all the block headers to an array, then writing all the CU headers to an array, finally writing all the Column 1 column-level-CUs, then writing all the Column2 column-level-CUs etc.

The flash cache hash table maintains a simple mapping of column numbers to 64KB flash pages so, for any given query, we can simply do asynchronous disk I/O of the minimum set of 64KB pages required to cover the referenced columns.

Within the “flashcache” cell stats there is a new nested cell stat called “columnarcache” that enables you to track how it is working.

> set long 50000000
> set pagesize 10000
> select xmltype(STATISTICS_VALUE).getclobval(2,2) from v$cell_state;

XMLTYPE(STATISTICS_VALUE).GETCLOBVAL(2,2)
--------------------------------------------------------------------------------
....

  <stats type="flashcache">
    <stats type="columnarcache">
      <stat name="columnar_cache_size">0</stat>
      <stat name="columnar_cache_keep_size">0</stat>
      <stat name="columnar_cache_keep_read">0</stat>
      <stat name="columnar_cache_keep_read_bytes">0</stat>
      <stat name="columnar_attempted_read">0</stat>
      <stat name="columnar_attempted_read_bytes">0</stat>
      <stat name="columnar_cache_hits">0</stat>
      <stat name="columnar_cache_hits_read_bytes">0</stat>
      <stat name="columnar_cache_hits_saved_bytes">0</stat>
      <stat name="columnar_cache_pop">0</stat>
      <stat name="columnar_cache_pop_bytes">0</stat>
      <stat name="columnar_cache_pop_throttled">0</stat>
      <stat name="columnar_cache_pop_invalid_buf">0</stat>
      <stat name="columnar_cache_pop_size_align">0</stat>
      <stat name="columnar_cache_pop_findchdrfailure_mainchdr">0</stat>
      <stat name="columnar_cache_pop_findchdrfailure_childchdr">0</stat>
    </stats>
  </stats>

I typically spool the output of this to wade through it an editor but if we want to monitor how it is working with some workload, we need to extract individual stats, for example I bounced the cells and verified the cache was empty:

> select xmlcast(xmlquery('/cell_stats/stats/stats/stat[@name="columnar_cache_size"]' passing xmltype(STATISTICS_VALUE) returning content) as varchar2(200) ) "COLUMNAR CACHE SIZE"
from v$cell_state
where xmlexists('/cell_stats/stats/stats[@type="columnarcache"]' passing xmltype(STATISTICS_VALUE));

COLUMNAR CACHE SIZE
--------------------------------------------------------------------------------
0

I am using the 1GB TPC-H schema which takes a little more 400MB on disk when compressed with Query Low:

SQL> select sum(bytes) from user_segments where SEGMENT_NAME in ('SUPPLIER','PARTSUPP','LINEITEM','ORDERS','PART','CUSTOMER');

SUM(BYTES)
----------
 420675584

and checking the columnar cache again shows about half of the data has been rewritten into columnar cache format instead of caching raw blocks:

SQL> select xmlcast(xmlquery('/cell_stats/stats/stats/stat[@name="columnar_cache_size"]' passing xmltype(STATISTICS_VALUE) returning content) as varchar2(200) ) "COLUMNAR CACHE SIZE"
from v$cell_state
where xmlexists('/cell_stats/stats/stats[@type="columnarcache"]' passing xmltype(STATISTICS_VALUE));

COLUMNAR CACHE SIZE
--------------------------------------------------------------------------------
179306496

So let’s look at how the cache helped:

select
   xmlcast(
      xmlquery(
         '/cell_stats/stats/stats/stat[@name="columnar_cache_hits_read_bytes"]' 
         passing xmltype(STATISTICS_VALUE) 
         returning content
      ) as varchar2(200) 
   ) "HIT READ BYTES"
from v$cell_state
where xmlexists('/cell_stats/stats/stats[@type="columnarcache"]' passing xmltype(STATISTICS_VALUE));

 
HIT READ BYTES
--------------------------------------------------------------------------------
1909456896

select
   xmlcast(
      xmlquery(
         '/cell_stats/stats/stats/stat[@name="columnar_cache_hits_saved_bytes"]'
         passing xmltype(STATISTICS_VALUE)
         returning content
      ) as varchar2(200) 
   ) "HIT SAVED BYTES"
from v$cell_state
where xmlexists('/cell_stats/stats/stats[@type="columnarcache"]' passing xmltype(STATISTICS_VALUE));

 
HIT SAVED BYTES
--------------------------------------------------------------------------------
1128267776

which shows we were able to eliminate about 35% of the disk I/O for this query run!

We could, of course, have gotten that information more simply with the regular stat “cell physical IO bytes saved by columnar cache” but I wanted to show how to pull values from v$cell_state for use in scripts.

Many people only use Query High compression as they find the increased disk I/O from Query Low more than offsets the savings from cheaper decompression costs. However, with the columnar cache in place, those trade-offs have changed. It may be worth re-evaluating the decision as to when user Query Low vs. Query High particularly on CPU-bound cells.

Addendum: performance bug 20525311 affecting the columnar cache with selective predicates is fixed in the next rpm.

Roger MacNicol
Oracle Data Storage Technology

Controlling the offload of specific operators

Posted on May 4, 2017 by Roger MacNicol Posted in cell_offload, oracle, SmartScan 2,013 Page views Leave a comment

One of the joys of regexp is that you can write a pattern that is painfully expensive to match and offloading these to the cell can cause significant impact on other users and overall throughput (including heartbeat issues). If you have a user who is prone to writing bad regexp expressions you as DBA can prevent regexp (or any other operator) from being offloaded to the cells.

Let’s take a very simple example using a cut down version of TPC-H Query 16 and a NOT LIKE predicate:

SQL> explain plan for select p_brand, p_type, p_size
from part
where p_brand <> 'Brand#45'
and p_type not like 'MEDIUM POLISHED%'
and p_size in (49, 14, 23, 45, 19, 3, 36, 9)
group by p_brand, p_type, p_size;

SQL> select * FROM TABLE(DBMS_XPLAN.DISPLAY);

  |*  3 |    TABLE ACCESS STORAGE FULL| PART | 29833 |  1048K|    |   217   (2)| 00:00:01 |  1 |  8 
------------------------------------------------------------------------------------------------------------
      3 - storage(("P_SIZE"=3 OR "P_SIZE"=9 OR "P_SIZE"=14 OR "P_SIZE"=19 
      OR "P_SIZE"=23 OR "P_SIZE"=36 OR "P_SIZE"=45 OR "P_SIZE"=49) 
      AND "P_BRAND"<>'Brand#45' AND "P_TYPE" NOT LIKE 'MEDIUM POLISHED%')

Here we see all the predicates get offloaded as expected. So, for example, to stop NOT LIKE being offloaded we would need to find the operator in v$sqlfn_metadata

SQL> column descr format a18
SQL> select func_id, descr, offloadable from v$sqlfn_metadata where descr like '%LIKE%';

   FUNC_ID DESCR              OFF
---------- ------------------ ---
        26  LIKE              YES
        27  NOT LIKE          YES
        99  LIKE              NO
       120  LIKE              YES
       121  NOT LIKE          YES
       ...
       524  REGEXP_LIKE       YES
       525  NOT REGEXP_LIKE   YES
       537  REGEXP_LIKE       YES
       538  NOT REGEXP_LIKE   YES

we can ignore all but the two basic LIKE operators in this case, so to disable the offload of our LIKE predicates we use:

   FUNC_ID DESCR              OFF
---------- ------------------ ---
        26  LIKE              YES
        27  NOT LIKE          YES
        99  LIKE              NO
       120  LIKE              YES
       121  NOT LIKE          YES
       ...
       524  REGEXP_LIKE       YES
       525  NOT REGEXP_LIKE   YES
       537  REGEXP_LIKE       YES
       538  NOT REGEXP_LIKE   YES

we can ignore all but the two basic LIKE operators in this case, so to disable the offload of our LIKE predicates we use:

SQL> alter session set cell_offload_parameters="OPT_DISABLED={26,27};";

and we see this reflected in the offloadable column in v$sqlfn_metadata.

SQL> select func_id, descr, offloadable from v$sqlfn_metadata where descr like '%LIKE%';

   FUNC_ID DESCR              OFF
---------- ------------------ ---
        26  LIKE              NO
        27  NOT LIKE          NO
        99  LIKE              NO
       120  LIKE              YES
       121  NOT LIKE          YES

To re-enable them you would use:

SQL> alter session set cell_offload_parameters="OPT_DISABLED={};";

One thing to note about this param is that it doesn’t work like events (whose settings are additive), here it replaces the previous value and so every operator you want disabled has to be included in the same alter session (and the param is limited to 255 maximum characters limiting the number of operators that can be disabled). With the offload of LIKE and NOT LIKE disabled we can see the impact on the plan:

SQL> explain plan for select p_brand, p_type, p_size
from part
where p_brand <> 'Brand#45'
and p_type not like 'MEDIUM POLISHED%'
and p_size in (49, 14, 23, 45, 19, 3, 36, 9)
group by p_brand, p_type, p_size;

SQL> select * FROM TABLE(DBMS_XPLAN.DISPLAY); 

 |*  3 |    TABLE ACCESS STORAGE FULL| PART | 29833 |  1048K|    |   217   (2)| 00:00:01 |  1 |  8 
------------------------------------------------------------------------------------------------------------

     3 - storage(("P_SIZE"=3 OR "P_SIZE"=9 OR "P_SIZE"=14 OR "P_SIZE"=19 OR "P_SIZE"=23
     OR "P_SIZE"=36 OR "P_SIZE"=45 OR "P_SIZE"=49) AND "P_BRAND"<>'Brand#45')

and the NOT LIKE is no longer in the storage filter. Now lets say that you as DBA are faced with a more complex problem and want to halt all complex processing on the cells temporarily. There is a parameter that will disable everything except the simple comparison operators and NULL checks:

SQL> alter session set "_cell_offload_complex_processing"=FALSE;

Now lets see what happens:

SQL> explain plan for select p_brand, p_type, p_size
from part
where p_brand <> 'Brand#45'
and p_type not like 'MEDIUM POLISHED%'
and p_size in (49, 14, 23, 45, 19, 3, 36, 9)
group by p_brand, p_type, p_size;

SQL> select * FROM TABLE(DBMS_XPLAN.DISPLAY); 

 |*  3 |    TABLE ACCESS STORAGE FULL| PART | 29833 |  1048K|    |   217   (2)| 00:00:01 |  1 |  8 
------------------------------------------------------------------------------------------------------------

    3 - filter(("P_SIZE"=3 OR "P_SIZE"=9 OR "P_SIZE"=14 OR "P_SIZE"=19 OR "P_SIZE"=23
    OR "P_SIZE"=36 OR "P_SIZE"=45 OR "P_SIZE"=49) AND "P_BRAND"<>'Brand#45'
    AND "P_TYPE" NOT LIKE 'MEDIUM POLISHED%')

Well we got no storage predicates at all and we didn’t expect that because we had one simple predicate namely p_brand != 'Brand#45' and the IN predicate had been rewritten to a series of OR’ed comparisons so what happened? This parameter only permits simple predicates that are linked by AND’s and can be attached directly to one column. Disjuncts are not pushable so they are normally evaluated by an eva tree or by pcode neither of which are sent to the cell with this parameter set to FALSE. So why wasn’t our one simple predicate offloaded. Well, note where it is in the explain plan. It comes after the rewritten the IN and since the predicates are sorted by the optimizer on effectiveness we stop looking as soon as we see one that can’t be offloaded. Let’s remove the IN and see what happens:

SQL> explain plan for select  p_brand, p_type, p_size
from part
where p_brand <> 'Brand#45'
and p_type not like 'MEDIUM POLISHED%';

|*  2 |   TABLE ACCESS STORAGE FULL| PART |   190K|  6686K|   217   (2)| 00:00:01 | 1 | 8 |
---------------------------------------------------------------------------------------------------

    2 - storage("P_BRAND"<>'Brand#45')
       filter("P_BRAND"<>'Brand#45' AND "P_TYPE" NOT LIKE 'MEDIUM POLISHED%')

as expected the simple predicate is now offloaded. If you look at v$sqlfn_metadata you’ll see this param is reflected in the offloadable column:

SQL> select func_id, descr, offloadable from v$sqlfn_metadata where descr like '%LIKE%';

   FUNC_ID DESCR              OFF
---------- ------------------ ---
        26  LIKE              NO
        27  NOT LIKE          NO
        99  LIKE              NO
       120  LIKE              NO
       ...
       121  NOT LIKE          NO
       524  REGEXP_LIKE       NO
       525  NOT REGEXP_LIKE   NO
       537  REGEXP_LIKE       NO
       538  NOT REGEXP_LIKE   NO

I hope you never need any of this in real life but it’s good to have it in the toolbag.

Alter Table Shrink Space and SmartScan

Posted on May 4, 2017 by Roger MacNicol Posted in oracle, SmartScan 1,831 Page views Leave a comment

Over the years, updates can cause rows to become highly fragmented sapping performance on Exadata table scans.

The offload server and hence SmartScan get data to process 1 MB at a time. Because of this, SmartScan is only able to process row pieces that are available in the current 1 MB chunk that it processes at a time. Unlike RDBMS table scans, SmartScan is not able to initiate disk I/Os to retrieve further row pieces and, even if it could, it in unlikely that they would be present on the same cell. When SmartScan finds it needs a row for a projected column that is not present in the blocks available to it, it will apply predicate on the row pieces it does have but if those predicate pass, it has to return the row unprocessed for the RDBMS to fetch the missing row pieces from the buffer cache.

Continue reading→

What you need to know about Offload Quarantines

Posted on March 24, 2016 by Roger MacNicol Posted in oracle, SmartScan 1,943 Page views 1 Comment

Several questions have couple up in the last few weeks about offload quarantines which means a blog post on this topic is overdue. We work hard to stress test every new rpm that is released but on rare occasions customers can encounter an issue with the thin database layer that exists in the offload server. This layer is known externally as “Smart Scan” and internally as “FPLIB” (a.k.a. Filter Projection Library).

A crash in the thin database layer could because of either an issue with some aspect of the sql_id (for example, with the predicates) or because of an issue with the data on some region of disk (for example, with the OLTP compression symbol table). The worst, and rarest, form of crashes are where striping leads to every offload server failing simultaneously: these are known colloquially as “Railroad Crashes”). The most important thing is to make sure the retry mechanism doesn’t immediately resubmit the query and re-crash the offload server causing a halt to the business operating. In a hospital, the floor nurse would call a code and the crash team would come running with a crash cart to stabilize the patient. Two members of my family are nurses and I’m reminded that nurses are a lot like technical support engineers in that while doing their job they sometimes have to deal with abuse from frustrated patients (customers): please remember that both groups work hard to resolve your issues and be kind to them!

What are Quarantines?

The option of calling a crash cart is not available to us here so starting in early 11.2.0.3, we created a quarantine system where, after a crash, the exception handler remembers both the sql_id and the disk region being processed and creates a persistent quarantine for both. When a sql_id or a disk region is quarantined any Smart Scan operations on them will be executed in passthru mode.

Currently it is hard to see when this is happening, but an upcoming release has a new stat to make this easier. If an operation has been quarantined you can see its effects by monitoring:

cell num bytes in passthru due to quarantine

Listing Quarantines

When a quarantine has been created you can look at it in detail using CellCLI:

CellCLI> list quarantine 1 detail
     name:                      1
         clientPID:             12798
         crashReason:           ORA-600[17099]
         creationTime:          2011-02-07T17:18:13-08:00
         dbUniqueID:            2022407934
         dbUniqueName:          YAMA
         incidentID:            16
         planLineID:            37
         quarantineReason:      Crash
         quarantineType:        "SQL PLAN"
         remoteHostName:        sclbndb02.us.oracle.com
         rpmVersion:            OSS_11.2.0.3.0_LINUX.X64_1012
         sqlID:                 1jw05wutgpfyf
         sqlPlanHashValue:      3386023660

The ‘list detail’ version of the command gives us everything we would need to know about exactly what has been quarantined and why it was quarantined. CellCLI also supports manually creating a quarantine using the attributes shown by ‘list detail’.

Quarantine Escalation

This is the topic that has caused the most confusion: if three new quarantines are generated within a 24 hour period the quarantine is escalated to a database quarantine. Using the ‘list detail’ option we would then see:

quarantinePlan:         SYSTEM
quarantineReason:       Escalated
quarantineType:         DATABASE

Note: the number of quarantines in 24 hours before escalation is configurable via a cellinit param: please contact Technical Support if you feel you have a valid need to change this.

The final level of escalation is where if more than one database has been escalated to a database quarantine, the system will escalate to a complete offload quarantine where Smart Scan is disabled completely and all I/O goes through regular block I/O. I’m glad to say that I’ve have never seen this happen.

Dropping Quarantines

The next question is how and when are the quarantines removed. Any quarantine can be removed manually using CellCLI. Quarantines are also automatically dropped by certain operations:

Plan Step Quarantines:
- Dropped on rpm upgrades
Disk Region Quarantines
- Dropped on rpm upgrades, or on successful writes to the quarantined disk region
Database quarantines
- Dropped on rpm upgrades
Offload quarantines
- Dropped on rpm upgrades

What about CDB?

In 12.1, we changed the architecture of cellsrv to support multiple RDBMS versions running at the same time by introducing the concept of offload servers. When a new rpm is installed it typically contains offload servers for 11.2.0.4, 12.1.0.1, 12.1.0.2, (and 12.2.0.1). This is known internally as multi-DB. Any given operation is tagged with the RDBMS version it is coming from and routed to the offload server for that version. A crash in Smart Scan typically means that only the offload server has to restart and not the central cellsrv that maintains Storage Index and does Smart IO. A side effect of this is that all operations for that RDBMS version can revert to Block IO while the offload server restarts minimizing disruption.

The architecture change necessitated a change to the way quarantines are created, checked, and dropped. In multi-DB, installation of a new rpm no longer drops all quarantines. Instead, system created quarantines now record the offload server’s rpm version. Manually created quarantines can also optionally specify offload rpm they are to effect. In multi-DB, a quarantine is observed if the offload rpm specified matches the actual offload rpm the operation will be sent to or if no offload rpm is specified regardless of offloadgroup.
Multi-DB quarantines are dropped if the matching offload rpm is uninstalled or a new rpm installed for that offload version. Multi-DB quarantines with no offload rpm specified must be dropped manually.

Please let me know if you have any questions.

Roger