This document is my attempt to bring together the available options for determining the root cause of an issue, as a roadmap to help support engineers narrow down the cause of concern.
It is a living document and will be edited and amended as time goes by. Please do check back again in the future.
Warning: these parameters should only be used in conjunction with an Oracle Support Engineer and are not intended for DBAs to self-triage. They should also not be left set after triage without discussion with an Oracle Support Engineer.
The Basics
- Check if the issue reproduces without Smart Scan
alter session set cell_offload_processing=FALSE;
or
select /*+ opt_param('cell_offload_processing','false') */ <col> from <tab>;
This completely turns off Smart Scan: RDBMS will act like non-Exadata and do its own disk I/O through the buffer cache or Direct Read.
- Check if the issue reproduces in cell pass thru mode
alter session set "_kcfis_cell_passthru_enabled"=TRUE;
This still uses Smart Scan but turns off the smarts: the blocks are read by the offload server and returned unprocessed.
- Check if it reproduces in emulation mode
alter session set "_rdbms_internal_fplib_enabled"=TRUE;
alter session set "_serial_direct_read"=TRUE;
This mode runs the copy of Smart Scan that is linked into RDBMS, to see whether the issue stems from offload rather than from the Smart Scan part of it. Be aware: bug fixes and patches are delivered to RDBMS and to the offload server independently, so one may have fixes that the other does not.
- There is NO point in trying:
alter session set "_kcfis_rdbms_blockio_enabled"=TRUE;
It simply forces the 'file intelligent storage' layer to divert to the 'direct file' layer, i.e. regular block I/O. This achieves exactly the same thing as 'cell_offload_processing=FALSE' but in a roundabout way.
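When working through the steps above, it can help to confirm that the session really did (or did not) use Smart Scan for the test query. The following is a minimal sketch, assuming the session has select access to v$mystat and v$statname and that these statistic names are present in the release being triaged. The values are cumulative for the session, so run the query before and after the problem statement and compare the two readings.

```sql
-- Sketch only: run in the same session as the problem query,
-- once before and once after it, and compare the values.
-- Assumes select privilege on v$mystat and v$statname.
select n.name, s.value
  from v$mystat s
  join v$statname n
    on n.statistic# = s.statistic#
 where n.name in (
         'cell physical IO bytes eligible for predicate offload',
         'cell physical IO interconnect bytes returned by smart scan'
       )
 order by n.name;
```

If neither value moves across the problem query, the statement was most likely not offloaded in that session.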
Triaging issues with Storage Index
- Check if the issue reproduces with storage index disabled
alter session set "_kcfis_storageidx_disabled"=TRUE;
This will completely disable Storage Index and all chunks will be processed by Smart Scan without SI filtering happening.
- Check if the issue reproduces in Diagnostic mode
alter session set "_kcfis_storageidx_diag_mode"=1;
This will run the query both with and without SI and then compare to make sure SI would have returned the same result for that chunk.
- Check whether SI Min/Max processing is the issue:
alter cell offloadgroupEvents = "immediate cellsrv.cellsrv_setparam('_cell_pred_enable_fp_preprocess', 'FALSE')";
- Check whether any Set Membership metadata stored in SI is the issue (this only works in conjunction with the IM format columnar cache, a.k.a. CC2):
alter session set "_kcfis_storageidx_set_membership_disabled"=FALSE;
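To see whether Storage Index filtering actually took place for a given run, one option is to watch the session-level SI statistic around the problem query. This is a sketch under the same assumptions as any v$mystat query (select access to v$mystat and v$statname, and the statistic name existing in the release in use); it is not a substitute for the diagnostic mode described above.

```sql
-- Sketch only: the value is cumulative for the session, so take a
-- reading before and after the problem query and compare.
select n.name, s.value
  from v$mystat s
  join v$statname n
    on n.statistic# = s.statistic#
 where n.name = 'cell physical IO bytes saved by storage index';
```

With Storage Index disabled for the session, this value would be expected to stay flat across the query.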
Triaging issues with Flash Cache
- If the goal is to bypass FC for a single table, the correct way to eliminate FC as a cause is to disable caching for that segment, which also causes any cached blocks to be flushed.
alter table <foo> storage (cell_flash_cache NONE);
In order to resume default behaviour one would use:
alter table <foo> storage (cell_flash_cache DEFAULT);
- If the goal is to completely bypass the FC layer, we need to change the caching policy of the grid disk. This will flush the current contents and prevent both write-through and write-back caching.
cellcli> ALTER GRIDDISK grid_disk_name CACHINGPOLICY="none";
In order to resume normal caching policy, one would use:
cellcli> ALTER GRIDDISK grid_disk_name CACHINGPOLICY="default";
- Note: the parameter "_kcfis_kept_in_cellfc_enabled" is NOT the correct way to bypass FC because in many cases the disk I/O must go through FC anyway.
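After changing the segment-level setting, it is worth confirming what the data dictionary now records for the table. The sketch below assumes a non-partitioned table owned by the current user; the table name FOO is a hypothetical stand-in for the <foo> placeholder used above.

```sql
-- Sketch only: FOO is a hypothetical table name standing in for <foo>.
-- CELL_FLASH_CACHE should show NONE after the first alter table above
-- and DEFAULT after the second.
select table_name, cell_flash_cache
  from user_tables
 where table_name = 'FOO';
```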
Triaging issues with the Columnar Cache
- Check if the columnar cache is the cause of the issue:
alter session set "_enable_columnar_cache"=0;
and to turn it back on again with default behaviour:
alter session set "_enable_columnar_cache"=1;
Note: do not use "_kcfis_cellcache_disabled": that is not the correct way to triage this.
- Check if the IM format (a.k.a. CC2) columnar cache is the cause of the issue by forcing version 1 format to be used:
alter session set "_enable_columnar_cache"=33; -- 0x01 + 0x20
- Check whether using the columnar cache with row-major blocks is the cause of the issue:
alter session set "_enable_columnar_cache"=16385; -- 0x01 + 0x4000
- Check whether using the columnar cache with Hybrid Columnar blocks is the cause of the issue:
alter session set "_enable_columnar_cache"=32769; -- 0x01 + 0x8000
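The decimal values used above are the enable bit (0x01) added to one further hex flag, as the trailing comments indicate. A small sketch that reproduces the arithmetic may help when reading or double-checking a value; the column aliases are illustrative names only, not Oracle terminology.

```sql
-- Sketch only: converts each hex flag to decimal and adds the 0x01 bit.
-- Column aliases are made up for illustration.
select 1 + to_number('20',   'XXXX') as flag_0x20,    -- 33
       1 + to_number('4000', 'XXXX') as flag_0x4000,  -- 16385
       1 + to_number('8000', 'XXXX') as flag_0x8000   -- 32769
  from dual;
```

Going the other way, bitand(33, 32) returns 32, showing that the 0x20 bit is set in the value 33.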
Tracing offload processing
Please see:
- Using trace events in an offload server
- Tracing Hybrid Columnar Compression Offload
- Tracing an Offload Group
Triaging PCODE processing
- PCODE is our new byte code for evaluating predicates and aggregates – to go back to the old way use:
alter session set "_kdz_pcode_flags" = 1;
and to turn it back on again:
alter session set "_kdz_pcode_flags" = 0;
Useful Debug Scan values
- Disable LOB predicate pushdown to Smart Scan:
alter session set "_dbg_scan"=1;
- Disable rowset function evaluation in Smart Scan:
alter session set "_dbg_scan"=4096;
- Disable aggregation pushdown to Smart Scan:
alter session set "_dbg_scan"=8192;
- Disable Hybrid IM scan (this is where In-Memory is interleaved with Smart Scan):
alter session set "_dbg_scan"=131072;