Oracle SQL | Tag Archives: Columnar Cache

Many people know that in 12.1.0.2 we introduced a ground-breaking columnar cache that rewrote 1 MB chunks of HCC format blocks in the flash cache into pure columnar form in a way that allowed us to only do the I/O for the columns needed but also to recreate the original block when that
was required.

This showed up in stats as “cell physical IO bytes saved by columnar cache”.

But in 12.1.0.2 we had also introduced Database In-Memory (or DBIM) that rewrote heap blocks into pure columnar form in memory. That project introduced:

new columnar formats optimized for query performance
a new way of compiling predicates that supported better columnar execution
the ability to run predicates against columns using SIMD instructions which could execute the predicate against multiple rows simultaneously

so it made perfect sense to rework the columnar cache in 12.2 to take advantage of the new In-Memory optimizations.

Quick reminder of DBIM essentials

In 12.2, tables have to be marked manually for DBIM using the INMEMORY keyword:

SQL> Alter Table INMEMORY

When a scan on a table tagged as INMEMORY is queried the background process is notified to start loading it into the INMEMORY area. This design was adopted so that the user’s scans are not impeded by loading. Subsequent scans that come along will check what percentage is currently loaded in memory and make a rule based decision:

For Exadata

Greater than 80% populated, use In-Memory and the Buffer Cache for the rest
Less than 80% populated, use Smart Scan
The 80% cutoff between BC and DR is configurable with an undocumented underscore parameter
Note: if an In-Memory scan is selected even for partially populated, Smart Scan is not used

For non-Exadata

Greater than 80% populated, use In-Memory and the Buffer Cache for the rest
Less than 80% populated, use In-Memory and Direct Read for the rest
Note: this requires that segment to be check pointed first
The 80% cutoff between BC and DR is configurable with an undocumented underscore parameter

While DBIM can provides dramatic performance improvements it is limited by the amount of usable SGA on the system that can be set aside for the In-Memory area. After that performance becomes that of disk access to heap blocks from the flash cache or disk groups. What was needed was a way to increase the In-Memory area so that cooler segments could still benefit from the In-Memory formats without using valuable RAM which is often a highly constrained resource.

Cellmemory

Cellmemory works in a similar way to the 12.1.0.2 columnar cache in that 1 MB of HCC formatted blocks are cached in columnar form automatically without the DBA needing to be involved or do anything. This means columnar cache scans are cached after Smart Scan has processed the blocks rather than before as happens with ineligible scans. The 12.1.0.2 columnar cache format simply takes all the column-1 compression units (CUs) and stores them contiguously, then all the column-2 CUs stored contiguously etc. so that each column can be read directly from the cache without reading unneeded columns. This happens during Smart Scan so that the reformated blocks are returned to the cell server along with the query results for that 1 MB chunk.

In 12.2, eligible scans continue to be cached in the 12.1.0.2 format columnar cache format after Smart Scan has processed the blocks so that columnar disk I/O is available immediately. The difference is that if and only if the In-Memory area is in use on the RDBMS node (i.e. the In-Memory feature is already in use in that database), the address of the beginning of the columnar cache entry is added to a queue so that a background process running at a lower priority can read those cache entries and invoke the DBIM loader to rewrite the cache entry into In-Memory formatted data.

Unlike the columnar cache which has multiple column-1 CUs, the IMC format creates a single column using the new formats that use semantic compression and support SIMD etc on the compressed intermediate compressed data. By default a second round of compression using LZO is then applied. When 1 MB of HCC ZLIB compressed blocks are rewritten in this way they typically take around 1.2 MB (YMMV obviously).

Coming up

Up-coming blog entries will cover:

Overriding the default behaviour with DDL
New RDBMS stats for Cellmemory
New cellsrv stats for Cellmemory
Flash cache pool changes
Tracing Cellmemory

12.1.0.2 introduced the new Columnar Flash Cache where 1MB of blocks that are all in HCC format are rewritten so as to make each column from each CU contiguous. This works by first writing all the block headers to an array, then writing all the CU headers to an array, finally writing all the Column 1 column-level-CUs, then writing all the Column2 column-level-CUs etc.

The flash cache hash table maintains a simple mapping of column numbers to 64KB flash pages so, for any given query, we can simply do asynchronous disk I/O of the minimum set of 64KB pages required to cover the referenced columns.

Within the “flashcache” cell stats there is a new nested cell stat called “columnarcache” that enables you to track how it is working.

> set long 50000000
> set pagesize 10000
> select xmltype(STATISTICS_VALUE).getclobval(2,2) from v$cell_state;

XMLTYPE(STATISTICS_VALUE).GETCLOBVAL(2,2)
--------------------------------------------------------------------------------
....

  <stats type="flashcache">
    <stats type="columnarcache">
      <stat name="columnar_cache_size">0</stat>
      <stat name="columnar_cache_keep_size">0</stat>
      <stat name="columnar_cache_keep_read">0</stat>
      <stat name="columnar_cache_keep_read_bytes">0</stat>
      <stat name="columnar_attempted_read">0</stat>
      <stat name="columnar_attempted_read_bytes">0</stat>
      <stat name="columnar_cache_hits">0</stat>
      <stat name="columnar_cache_hits_read_bytes">0</stat>
      <stat name="columnar_cache_hits_saved_bytes">0</stat>
      <stat name="columnar_cache_pop">0</stat>
      <stat name="columnar_cache_pop_bytes">0</stat>
      <stat name="columnar_cache_pop_throttled">0</stat>
      <stat name="columnar_cache_pop_invalid_buf">0</stat>
      <stat name="columnar_cache_pop_size_align">0</stat>
      <stat name="columnar_cache_pop_findchdrfailure_mainchdr">0</stat>
      <stat name="columnar_cache_pop_findchdrfailure_childchdr">0</stat>
    </stats>
  </stats>

I typically spool the output of this to wade through it an editor but if we want to monitor how it is working with some workload, we need to extract individual stats, for example I bounced the cells and verified the cache was empty:

> select xmlcast(xmlquery('/cell_stats/stats/stats/stat[@name="columnar_cache_size"]' passing xmltype(STATISTICS_VALUE) returning content) as varchar2(200) ) "COLUMNAR CACHE SIZE"
from v$cell_state
where xmlexists('/cell_stats/stats/stats[@type="columnarcache"]' passing xmltype(STATISTICS_VALUE));

COLUMNAR CACHE SIZE
--------------------------------------------------------------------------------
0

I am using the 1GB TPC-H schema which takes a little more 400MB on disk when compressed with Query Low:

SQL> select sum(bytes) from user_segments where SEGMENT_NAME in ('SUPPLIER','PARTSUPP','LINEITEM','ORDERS','PART','CUSTOMER');

SUM(BYTES)
----------
 420675584

and checking the columnar cache again shows about half of the data has been rewritten into columnar cache format instead of caching raw blocks:

SQL> select xmlcast(xmlquery('/cell_stats/stats/stats/stat[@name="columnar_cache_size"]' passing xmltype(STATISTICS_VALUE) returning content) as varchar2(200) ) "COLUMNAR CACHE SIZE"
from v$cell_state
where xmlexists('/cell_stats/stats/stats[@type="columnarcache"]' passing xmltype(STATISTICS_VALUE));

COLUMNAR CACHE SIZE
--------------------------------------------------------------------------------
179306496

So let’s look at how the cache helped:

select
   xmlcast(
      xmlquery(
         '/cell_stats/stats/stats/stat[@name="columnar_cache_hits_read_bytes"]' 
         passing xmltype(STATISTICS_VALUE) 
         returning content
      ) as varchar2(200) 
   ) "HIT READ BYTES"
from v$cell_state
where xmlexists('/cell_stats/stats/stats[@type="columnarcache"]' passing xmltype(STATISTICS_VALUE));

 
HIT READ BYTES
--------------------------------------------------------------------------------
1909456896

select
   xmlcast(
      xmlquery(
         '/cell_stats/stats/stats/stat[@name="columnar_cache_hits_saved_bytes"]'
         passing xmltype(STATISTICS_VALUE)
         returning content
      ) as varchar2(200) 
   ) "HIT SAVED BYTES"
from v$cell_state
where xmlexists('/cell_stats/stats/stats[@type="columnarcache"]' passing xmltype(STATISTICS_VALUE));

 
HIT SAVED BYTES
--------------------------------------------------------------------------------
1128267776

which shows we were able to eliminate about 35% of the disk I/O for this query run!

We could, of course, have gotten that information more simply with the regular stat “cell physical IO bytes saved by columnar cache” but I wanted to show how to pull values from v$cell_state for use in scripts.

Many people only use Query High compression as they find the increased disk I/O from Query Low more than offsets the savings from cheaper decompression costs. However, with the columnar cache in place, those trade-offs have changed. It may be worth re-evaluating the decision as to when user Query Low vs. Query High particularly on CPU-bound cells.

Addendum: performance bug 20525311 affecting the columnar cache with selective predicates is fixed in the next rpm.

Tag Archives: Columnar Cache

What’s new in 12.2 CELLMEMORY Part 1

Quick reminder of DBIM essentials

For Exadata

For non-Exadata

Cellmemory

Coming up

Examining the new Columnar Cache with v$cell_state

Roger MacNicol
Oracle Data Storage Technology

Quick reminder of DBIM essentials

For Exadata

For non-Exadata

Cellmemory

Coming up

Roger MacNicolOracle Data Storage Technology

Roger MacNicol
Oracle Data Storage Technology