Oracle SQL | Why are HCC stats being bumped when Smart Scanning row major data in 12.2

In 12.2, there is a stat “cell blocks pivoted” that points to a new optimization. When Smart Scan processes data, it has to create a new synthetic block that only contains the columns that are needed and the rows that pass the offloaded predicates.

If Smart Scan has to create a new block, why would you create a new row-major block when we can just as easily create uncompressed HCC block? It is roughly the same amount of work but once the synthetic block is returned to the RDBMS for processing, columnar data is 10X cheaper to process when the query uses rowsets. SO where does the name of this stat come from? Simple, it works just like a spreadsheet PIVOT operation, swapping rows for columns.

So what is a “rowset”? It simply means that a step in a query plan will take in a set of rows before processing and then process that entire batch of rows in one go before fetching more rows; this is significantly more efficient than processing one row at a time. You can see when a query uses rowsets via:

select * from table(dbms_xplan.display_cursor('<sql_id>',1,'-note +projection'));

in which case you will see projection metadata with details such as “(rowsets=200)”.

When Smart Scan sees that the query on the RDBMS is using rowsets, it tries to create a columnar output block instead of a row-major output block and when this arrives on the RDBMS, the table scan will see an HCC block and bump the HCC stats even though the table being scanned doesn’t use HCC. Let’s see how this works in practice:

I created a TPC-H Scale 1 database in NOCOMPRESS format:

SQL> select COMPRESSION,COMPRESS_FOR from user_tables where table_name='SUPPLIER';

COMPRESS COMPRESS_FOR
-------- ------------------------------
DISABLED

and run Query 6:

SQL> select sum(l_extendedprice * l_discount) as revenue
     from   lineitem
     where  l_shipdate >= to_date( '1994-01-01',  'YYYY-MM-DD')
        and l_shipdate < add_months(to_date( '1994-01-01',  'YYYY-MM-DD'),  12) 
        and l_discount between .06 - 0.01 and .06 + 0.01
        and l_quantity < 24;
SQL> select name,value from v$sysstat where name = 'cell blocks pivoted'; 

NAME                          VALUE
------------------------------ ----------
cell blocks pivoted                  8261

Just to show how this is tied to rowsets, let’s disable this optimization and recheck the stat:

SQL> alter session set "_rowsets_enabled"=false;
SQL> select sum(l_extendedprice * l_discount) as revenue
     from   lineitem
     where  l_shipdate >= to_date( '1994-01-01',  'YYYY-MM-DD')
        and l_shipdate < add_months(to_date( '1994-01-01',  'YYYY-MM-DD'),  12) 
        and l_discount between .06 - 0.01 and .06 + 0.01
        and l_quantity < 24; 

SQL> select name,value from v$sysstat where name = 'cell blocks pivoted';

NAME                           VALUE
------------------------------ ----------
cell blocks pivoted                  8261

I hope this helps.

—
Roger