select s.module
      ,ceil(max(s.rows_processed/s.fetches)) rows_per_fetch
from v$sql s
where s.rows_processed > 100
  and s.executions     > 1
  and s.fetches        > 1
  and s.module is not null
  and s.command_type      = 3  -- SELECTs only
  and s.program_id        = 0  -- do not account recursive queries from stored procs
  and s.parsing_schema_id != 0 -- <> SYS
group by s.module
order by rows_per_fetch desc nulls last
/
Revisiting buffer cache Very Large Object Threshold
If you turn on NSMTIO tracing you will see references to VLOT:
qertbFetch:[MTT < OBJECT_SIZE < VLOT]: Checking cost to read from caches (local/remote) and checking storage reduction factors (OLTP/EHCC Comp)
I had said you could ignore VLOT, but Frits Hoogland pointed out that tracing showed it had some impact, so let me clarify:
VLOT is the absolute upper bound above which cached reads are not even considered.
This defaults to 500% of the number of buffers in the cache i.e.
_very_large_object_threshold = 500
While this number is not used in any calculations, it is used in two places as a cutoff that decides whether those calculations are performed at all:
1) Can we consider using Automatic Big Table Caching (a.k.a. DWSCAN) for this object?
2) Should we do a cost analysis for Buffer Cache scan vs Direct Read scan on tables larger than the MTT?
The logic for tables above the calculated medium table threshold (MTT) that are NOT part of searched DMLs and are NOT on Exadata with the statistics-based storage reduction factor enabled (_statistics_based_srf_enabled) is:
- If _serial_direct_read == ALWAYS, use Direct Read
- If _serial_direct_read == NEVER, use Buffer Cache
- If _serial_direct_read == AUTO and #blocks in table < VLOT, use cost model
- Else use Direct Read “qertbFetch:DirectRead:[OBJECT_SIZE>VLOT]”
In practice, 5X the buffer cache is so large that the cost-based decision will come to the same conclusion anyway – the default VLOT simply saves the time spent doing the analysis.
For example, I got a quick count of the number of blocks in non-partitioned TPC_H Scale 1 lineitem
select segment_name, sum(blocks), sum(bytes)
from user_extents
where segment_name = 'LINEITEM'
group by segment_name;
and created my buffer cache to be exactly the same size. With this setup, setting _very_large_object_threshold=100
bypassed the cost model and went straight to DR scan, while setting it to 200 forced the use of the cost model.
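For reference, here is a rough sketch of how you might eyeball whether a segment sits above or below that cutoff. This is my own illustration using approximate dictionary views (the exact internal calculation isn't exposed), and LINEITEM is just the example segment from above:

-- Rough sketch only: compare a segment's block count to 5x the default
-- buffer cache, i.e. the default _very_large_object_threshold of 500%.
select s.blocks                       seg_blocks,
       p.buffers * 5                  vlot_blocks,
       case when s.blocks >= p.buffers * 5
            then 'above VLOT: direct read, no cost analysis'
            else 'below VLOT: eligible for the cost model'
       end                            decision
from   user_segments s,
       (select buffers from v$buffer_pool where name = 'DEFAULT') p
where  s.segment_name = 'LINEITEM';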
The moral of this is that the default value of VLOT rarely changes the decisions made. Only if you reduce VLOT to a much smaller multiplier of the cache size will you start to see a few more of your larger buffer cache scans move to direct read because they are no longer eligible for cost analysis. If you wish to stop some of the largest buffer cache scans from happening, you would need to set _very_large_object_threshold to less than 200.
Random thoughts on block sizes
I heard “Oracle only tests on 8k and doesn’t really test 16k”
I heard someone assert that one reason you should only use 8k block sizes is that, and I quote, “Oracle only tests on 8k and doesn’t really test 16k”. I tried googling that rumour and tracked it back to AskTom. Also as disks and memory get bigger and CPUs get faster it is natural to ask if 8k is now “too small”.
So here are my thoughts:
1. A quick scan of the data layer regression tests showed a very large number running on 16k blocks
2. Oracle typically runs its DW stress tests on 16k blocks
So, clearly, the assertion is untrue but I did spot some areas where 32k testing could be improved. I also added a note to AskTom clarifying testing on 16k.
Does this mean I should look at using 16k blocks for my DW?
Whoa, not so fast. Just because table scans typically run slightly faster on 16k blocks and compression algorithms typically get slightly better compression on 16k blocks does not mean your application will see improvements:
1. Multiple block sizes mean added work for the DBA
2. Databases do a lot more than just scan tables – e.g. row based writes and reads could perform worse
3. Most apps have low hanging fruit that will give you far better ROI on your time than worrying about block sizes (if you don’t believe me, attend one of Jonathan Lewis’s excellent index talks).
Most applications run fine on 8k because it is a good trade off between different access paths and, in general, 8k is still the right choice for most applications.
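If you do want to see which block sizes are actually in use before worrying about changing anything, a quick look at the data dictionary is enough:

select tablespace_name, block_size
from   dba_tablespaces
order  by block_size, tablespace_name;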
What about ‘N’ row pieces
In general Oracle’s block layout tries to ensure row pieces are split on column boundaries but in the case of very wide columns we will split in the middle of a column if too much space would be wasted by aligning with column boundaries. When a column is split in the middle it creates what is known as an ‘N’ row piece.
Rows are split by default at 255-column boundaries, assuming the row piece fits in the block. If you have a table with very wide rows or some very wide inline columns, smaller block sizes will result both in rows being split more often and in columns being split in the middle. At a minimum, the number of block gets needed to retrieve a single row will likely increase. This is one case to consider where a 16k block size may be worth investigating.
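As a quick way to spot candidates, the sketch below (my own, not part of the original discussion) lists tables that must already use multiple row pieces, and shows the statistic that counts fetches which had to follow a row piece into another block:

-- Tables with more than 255 columns, i.e. guaranteed multiple row pieces
select table_name, count(*) column_cnt
from   user_tab_columns
group  by table_name
having count(*) > 255;

-- How often single-row fetches had to follow a chained/continued row piece
select name, value
from   v$sysstat
where  name = 'table fetch continued row';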
The curious case of 70X compression
We had a case of a customer legitimately pointing out that ZLIB would give much better than 70X compression on his very repetitive data, but COMPRESS FOR QUERY HIGH was stuck at 70X. And, as it turns out, this is a factor of block size.
Hybrid Columnar Compression (HCC) takes up to 8 MB of raw data and compresses it down to one CU (Compression Unit), subject to row limits depending on compression type. For example COMPRESS FOR ARCHIVE LOW (which is also ZLIB) is limited to 8,000 rows in a CU. But there is another limit, which you may deduce from block dumps: HCC will only allow one head-piece row per block, because HCC rowids are the RDBA of the head piece plus the row offset into the CU. So if you have incredibly compressible data where 8,000 rows compress down to less than 16k, moving to a larger block size can be bad news because you will end up with wasted space at the end of each block.
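To see why a row-count cap turns into a ratio cap, here is a back-of-envelope illustration using the 8,000-row figure quoted above; the ~70 raw bytes per row is my assumption for this example, not a figure from the customer case:

-- 8,000 rows per CU at ~70 raw bytes each, with the compressed CU no smaller
-- than one 8k block: the ratio tops out around 560KB / 8KB, i.e. roughly 70X.
select round((8000 * 70) / 8192, 1) approx_max_ratio from dual;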
Summary
There are typically better things to spend time on than worrying about block size but if you wish to run your DW on 16k blocks they are thoroughly tested just like 8k blocks are.
External Tables Part 1 – Project Columns All vs Referenced
I normally blog about table scans on Oracle native data but Oracle also supports a wide variety of features for scanning external tables and I need to cover these too. One of the ways I learn new things is being sent a bug report and saying to myself “Oh! I didn’t know you could do that”. So today I’m going to start with the grammar:
Alter Table <xt> Project Columns [ All | Referenced ]
PL/SQL functions: Iterate and keys for associative arrays
Unfortunately associative arrays still require more “coding”:
we still can’t use “indices of” or “values of” in a simple FOR loop (though they have been available for FORALL for a long time), we don’t have convenient iterators, and there isn’t even a function to get all keys…
That’s why I want to show my templates for such things as an iterator and a keys function. You can adapt these functions and create them at schema level.
declare
    type numbers  is table of number;
    type anumbers is table of number index by pls_integer;
    a anumbers;
    i pls_integer;

    function iterate(
        idx in out nocopy pls_integer,
        arr in out nocopy anumbers
    ) return boolean
    as
        pragma inline;
    begin
        if idx is null then
            idx := arr.first;
        else
            idx := arr.next(idx);
        end if;
        return idx is not null;
    end;

    function keys(a in out nocopy anumbers) return numbers
    as
        res numbers := numbers();
        idx number;
        pragma inline;
    begin
        while iterate(idx, a) loop
            res.extend;
            res(res.count) := idx;
        end loop;
        return res;
    end;

begin
    a(1) := 10;
    a(3) := 30;
    a(5) := 50;
    a(8) := 80;
    -- iterate:
    while iterate(i, a) loop
        dbms_output.put_line(a(i));
    end loop;
    -- keys:
    for i in 1..keys(a).count loop
        dbms_output.put_line(a(keys(a)(i)));
    end loop;
end;
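For contrast, here is where INDICES OF does already work today, in FORALL. This is just a quick sketch of my own; the scratch table t_demo is made up purely for the illustration:

create table t_demo(n number);  -- scratch table, only for this illustration

declare
    type anumbers is table of number index by pls_integer;
    a anumbers;
begin
    a(1) := 10;
    a(5) := 50;
    a(9) := 90;
    -- INDICES OF walks the sparse keys 1, 5, 9 without a hand-written iterator
    forall i in indices of a
        insert into t_demo values (a(i));
    dbms_output.put_line(sql%rowcount || ' rows inserted');
end;
/

drop table t_demo purge;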
How to group connected elements (or pairs)
I quite often see developers asking questions about connected components:
Table “MESSAGES” contains the fields “SENDER” and “RECIPIENT”, which store client ids. How can we quickly get all groups of clients who are connected, even indirectly through other clients, if the table has X million rows? So for this table, there should be 4 groups:
[example MESSAGES rows and the resulting four groups not reproduced here]
Of course, we can solve this problem using SQL only (model, recursive subquery factoring or connect by with nocycle), but such solutions will be too slow for huge tables.
with t(sender,recipient) as (select level, level*2 from dual connect by level<=10)
, v1 as (select rownum id, t.* from t)
, v2 as (
         select id, account
         from v1
         unpivot (
           account for x in (sender, recipient)
         ))
, v3 as (
         select id
               ,account
               ,dense_rank()over(order by account) account_n
               ,count(*)over() cnt
         from v2)
, v4 as (
         select distinct grp, account
         from v3
         model
           dimension by (id, account_n)
           measures (id grp, account, cnt)
           rules iterate(1e6) until(iteration_number > cnt[1,1]) (
              grp[any,any] = min(grp)[any,cv()]
             ,grp[any,any] = min(grp)[cv(),any]
           )
        )
select listagg(account, ',') within group(order by account) s
from v4
group by grp
In such situations it’s much better to adapt standard algorithms like quick-find or weighted quick-union to PL/SQL.
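To make the idea concrete, here is a minimal sketch of weighted quick-union in plain PL/SQL. It is illustrative only (my own simplified code, not the package described next), with a few hard-coded pairs:

declare
    type int_tab is table of pls_integer index by pls_integer;
    parent int_tab;  -- parent(i) = parent of element i; i is a root if parent(i) = i
    sz     int_tab;  -- sz(i) = number of elements in the tree rooted at i

    procedure make_set(x pls_integer) is
    begin
        if not parent.exists(x) then
            parent(x) := x;
            sz(x) := 1;
        end if;
    end;

    function find(x pls_integer) return pls_integer is
        r pls_integer := x;
    begin
        while parent(r) != r loop
            r := parent(r);
        end loop;
        return r;
    end;

    procedure union_sets(x pls_integer, y pls_integer) is
        rx pls_integer := find(x);
        ry pls_integer := find(y);
    begin
        if rx = ry then
            return;
        end if;
        -- the "weighted" part: attach the smaller tree under the larger one
        if sz(rx) < sz(ry) then
            parent(rx) := ry;
            sz(ry) := sz(ry) + sz(rx);
        else
            parent(ry) := rx;
            sz(rx) := sz(rx) + sz(ry);
        end if;
    end;
begin
    -- pairs (1,2), (2,4), (3,6) give two components: {1,2,4} and {3,6}
    make_set(1); make_set(2); make_set(4); make_set(3); make_set(6);
    union_sets(1, 2);
    union_sets(2, 4);
    union_sets(3, 6);
    dbms_output.put_line('1 and 4: ' || case when find(1) = find(4) then 'same group' else 'different groups' end);
    dbms_output.put_line('1 and 3: ' || case when find(1) = find(3) then 'same group' else 'different groups' end);
end;
/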
I first wrote such a solution about 5 years ago and even posted one of the later versions here, but none of them were universal, so today I’ve created a package with a couple of functions for the most common problems: XT_CONNECTED_COMPONENTS
It contains 2 functions based on the weighted quick-union algorithm:
- function get_strings(cur in sys_refcursor, delim varchar2:=',') return strings_array pipelined;
It accepts a cursor and returns the connected components found as a table of varchar2(v_size). You can change v_size in the package definition.
The input cursor should contain one varchar2 column with linked strings, for example: ‘a,b,c’.
You can also specify the list delimiter; by default it is a comma.
Examples:

select *
from table(xt_connected_components.get_strings(
       cursor(select ELEM1||','||ELEM2 from TEST)));

select *
from table(
       xt_connected_components.get_strings(
         cursor(
                   select 'a,b,c' from dual
         union all select 'd,e,f' from dual
         union all select 'e,c'   from dual
         union all select 'z'     from dual
         union all select 'X,Y'   from dual
         union all select 'Y,Z'   from dual
     )));

COLUMN_VALUE
-----------------------------------------
STRINGS('X', 'Y', 'Z')
STRINGS('a', 'b', 'c', 'd', 'e', 'f')
STRINGS('z')
- function get_numbers(cur in sys_refcursor) return numbers_array pipelined;
This function also returns connected components, but for numbers.
The input cursor should contain two columns with linked numbers.
Examples:

select *
from table(
       xt_connected_components.get_numbers(
         cursor(
           select sender_id, recipient_id from messages
     )));

select *
from table(
       xt_connected_components.get_numbers(
         cursor(
           select level   account1
                , level*2 account2
           from dual
           connect by level<=10
     )));

SQL> select *
  2  from
  3    table(
  4      xt_connected_components.get_numbers(
  5        cursor(
  6          select level   account1
  7               , level*2 account2
  8          from dual
  9          connect by level<=10
 10* )))
SQL> /

COLUMN_VALUE
------------------------
NUMBERS(1, 2, 4, 8, 16)
NUMBERS(3, 6, 12)
NUMBERS(5, 10, 20)
NUMBERS(7, 14)
NUMBERS(9, 18)
How to install:
Download all files from Github and execute “@install” in SQL*Plus or execute them in another tool in the following order:
xt_connected_components_types.sql
xt_connected_components_pkg.sql
xt_connected_components_bdy.sql
Download URL: https://github.com/xtender/xt_scripts/tree/master/extra/xt_connected_components
Ampersand instead of colon for bind variables
I was troubleshooting a query today and was very surprised to find that the bind variables in it were specified with an &ampersand instead of a :colon! I have never seen this before and I couldn’t find anything about it in the documentation…
Unfortunately SQL*Plus doesn’t support ampersand bind variables yet, even if you disable substitution variables (“set define off”), so I tested this behaviour with the following code:
set def off serverout on
exec declare s varchar2(1); begin execute immediate 'select 1 from dual where dummy=&var' into s using 'X'; dbms_output.put_line(s); end;
And it really works! //at least on 11.2.0.2 and 12.2.0.1
SQL> set def off serverout on
SQL> exec declare s varchar2(1); begin execute immediate 'select 1 from dual where dummy=&var' into s using 'X'; dbms_output.put_line(s); end;
1

PL/SQL procedure successfully completed.

SQL> select substr(sql_text,1,40) stext,sql_id,executions,rows_processed from v$sqlarea a where sql_text like '%dual%&var';

STEXT                                 SQL_ID        EXECUTIONS ROWS_PROCESSED
------------------------------------- ------------- ---------- --------------
select 1 from dual where dummy=&var   ckkw4u3atxz02          3              3

SQL> select * from table(dbms_xplan.display_cursor('ckkw4u3atxz02'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
SQL_ID  ckkw4u3atxz02, child number 0
-------------------------------------
select 1 from dual where dummy=&var

Plan hash value: 272002086

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS FULL| DUAL |     1 |     2 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("DUMMY"=:VAR)


18 rows selected.
Update: Btw, it works for SQL only, not for PL/SQL:
SQL> var v varchar2(1);
SQL> begin &v = 'Z'; end;
  2  /
begin &v = 'Z'; end;
      *
ERROR at line 1:
ORA-06550: line 1, column 7:
PLS-00103: Encountered the symbol "&" when expecting one of the following:

SQL> exec &v := 'X';
BEGIN &v := 'X'; END;
      *
ERROR at line 1:
ORA-06550: line 1, column 7:
PLS-00103: Encountered the symbol "&" when expecting one of the following:
The symbol "&" was ignored.
SQL> exec :v := 'X';

PL/SQL procedure successfully completed.

SQL> select * from dual where dummy=&v
  2  ;

D
-
X
And we can use mixed placeholders:
SQL> select * from dual where dummy=&v and &v=:v;

D
-
X
Simple regexp to check that string contains word1 and word2 and doesn’t contain word3
with tst as (
        select 'qwe word1 asd ...............' s from dual union all
        select 'qwe word1 asd word2 .........' s from dual union all
        select 'qwe word1 asd word2 zxc word3' s from dual union all
        select 'qwe word2 asd word1 zxc word4' s from dual
     )
select s
      ,regexp_replace(s, '(word1)|(word2)|(word3)|(.)', '`\3') subst
      ,case
         when regexp_like(regexp_replace(s, '(word1)|(word2)|(word3)|(.)', '`\3'), '^`+$')
           then 'matched'
       end tst2
from tst
where 1=1
  --and regexp_like(regexp_replace(s, '(word1)|(word2)|(word3)|(.)', '`\3'), '^`+$')
Book advice: ORACLE SQL & PL/SQL Golden Diary by Asim Chowdhury
I’ve reviewed this book recently, and I highly recommend it as it has almost all that is needed to become a strong Oracle developer. You can check at least the table of contents:
ORACLE SQL & PL/SQL Golden Diary: by Asim Chowdhury
New Book Demystifies Complex Cross-Version Oracle Problem Solving
Compiled by veteran computer scientist and data modeler Asim Chowdhury, ‘ORACLE SQL & PL/SQL Golden Diary: Refactoring, Interoperability of Versions & Integration of related concepts for High Performance’ is the first book on the market that comprehensively allows data architects to unravel any concepts in SQL and PL/SQL up to Oracle 12c. It’s poised to remove much confusion from the many versions of Oracle SQL now on the market; a godsend for the computer science industry.
The beginner’s guide to Oracle Table Scans
I was asked a question yesterday that reminded me there are always people completely new to the topic who need an introduction – somewhere to start before the other articles make sense. So, here’s my brief write-up of everything you need to know about the basics of Oracle table scans.
Oracle has four main ways of scanning a table: the pre-9ir2 table scan, the 9ir2 TurboScan, the 11.1.0.1 Exadata SmartScan, and the 12.1.0.1 In-Memory Scan. Before we summarize each one, the other fundamental piece of information is the Oracle dictum that all blocks must be self-describing: a table scan routine should be able to inspect a block and understand which object it belongs to, whether it needs undo applied, and how the data is laid out, without reference to any external structures or secondary storage.
The original table scan routine
Oracle uses a “dataflow” query engine, which means a query plan is built, like a sausage machine, from nodes that have three basic operations: Open, Next, Close. ‘Open’ means you ask the next node in the chain to prepare to do some work, including acquiring any resources it may need; ‘Next’ means you fetch one unit of work from your child, e.g. a row; and ‘Close’ means telling your child node to shut down and release any resources it may be holding. You build a query by connecting the right kinds of nodes together in the order you want: one node just sorts, another groups, another does hash joins. The end of the sausage machine is the node seen on query plans as “Table Access Full”.
This node asks the data layer to fetch a block from disk and then gets rows one at a time from the data layer. This is the workhorse table scan: it can scan any kind of data and do SCN manipulations like row versions, but it is not the fastest way to scan a table.
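If you want a feel for the pattern without going near the kernel, the explicit cursor loop in PL/SQL is a loose analogy of my own for the same Open / Next / Close contract:

declare
    cursor c is select dummy from dual;
    v dual.dummy%type;
begin
    open c;                 -- "Open": acquire resources, get ready to produce rows
    loop
        fetch c into v;     -- "Next": ask for one unit of work (a row)
        exit when c%notfound;
        dbms_output.put_line(v);
    end loop;
    close c;                -- "Close": release the resources
end;
/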
9ir2 TurboScan
In 9ir2 we introduced a much faster way of scanning tables called TurboScan. The data layer function which had been handing out rows one at a time was replaced by one that stays in a tight loop retrieving rows from disk and pushing them into a callback supplied by “Table Access Full”. An automation tool was used to generate several versions of this routine that optimized out common choices that have to be made: does the user need rowids to be projected? Do they need predicates applied? Is the data compressed or not? Is the data column-major or row-major? And so on. Every time a CPU reaches a branch in the code it tries to guess which side of the branch will be taken, but if it guesses wrong there can be a considerable stall during which no work gets done. By removing most of the branches, the code runs much, much more quickly.
TurboScan is used for all queries which do not use RAW datatypes and which do not need special SCN processing.
Both pre-9ir2 scan and TurboScan can use the buffer cache to get blocks (typically small to medium tables) or use Direct Read to get blocks (typically medium to large tables).
See: When bloggers get it wrong – part 1
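If you want to check which path your own scans actually took, one simple approach (my own sketch, run in the same session right after the query of interest) is to look at the relevant session statistics:

select sn.name, st.value
from   v$mystat   st
join   v$statname sn on sn.statistic# = st.statistic#
where  sn.name in ('table scans (short tables)',
                   'table scans (long tables)',
                   'table scans (direct read)',
                   'physical reads direct');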
TurboScan can be disabled for triage purposes by setting:
SQL> alter session set events='12099 trace name context forever, level 1';
or specifically you can disable it only for HCC tables by setting:
SQL> alter session set "_arch_comp_dbg_scan"=1;
Exadata SmartScan
In 11.1.0.1 we introduced Exadata SmartScan intelligent storage. This is where a thin layer of database processing is embedded in the storage cells: the table scan routine offloads simple search criteria and the list of columns it needs to storage, and the storage cells pre-process the blocks to remove rows that fail the search criteria and columns which are not needed by the table scan. If all the rows are removed, the block doesn’t have to be sent back at all.
SmartScan can drastically reduce the amount of data returned on the interconnect and put on the RDBMS memory bus, as well as the space used in the SGA by the returned data. An additional significant benefit is gained when the CPU fetches the reduced blocks into the CPU cache: since only relevant information exists in the block, there is no space wasted by unwanted columns interspersed with the wanted columns, meaning more relevant data can fit in memory and the CPU prefetch can do a better job of predicting which memory cache line to fetch next.
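A rough way to gauge how much SmartScan helped a given session (again a sketch of my own, run in the same session right after the query) is to compare the bytes eligible for offload with what actually came back over the interconnect:

select sn.name, st.value
from   v$mystat   st
join   v$statname sn on sn.statistic# = st.statistic#
where  sn.name in ('cell physical IO bytes eligible for predicate offload',
                   'cell physical IO interconnect bytes returned by smart scan');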
Only TurboScan Direct Read scans can use this offload capability. You can disable SmartScan for triage purposes by setting:
SQL> alter session set cell_offload_processing=FALSE;
or
SQL> select /*+ opt_param('cell_offload_processing','false') */ <col> from <tab> where <predicate>;
In-Memory Scans
In-Memory scans were introduced in 12.1.0.1 and brought a revolutionary increase in table scan speeds. With In-Memory scans the table or partition is loaded into an in-memory tablespace in the SGA known as the inmemory area. Data is stored in compressed columnar format, typically up to 500,000 values in each columnar compression unit. This tablespace is kept transactionally consistent with the data on disk by means of an invalidation bitmap.
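For completeness, here is a hedged sketch of marking a table for the In-Memory column store and checking that it has been populated (LINEITEM is just an example segment, not anything from the post):

alter table lineitem inmemory memcompress for query high;

-- population is done by background workers; check progress here
select segment_name, populate_status, bytes, inmemory_size, bytes_not_populated
from   v$im_segments
where  segment_name = 'LINEITEM';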
Just like with SmartScan, only TurboScan can use In-Memory scans with In-Memory objects. Instead of getting a block from disk, the specialized version of the scan routine fetches a column run from each column of interest, processes the search criteria, and then returns the column runs with the failing rows removed to the “Table Access Full” node.
If any rows have been modified and committed by other users, or the user’s own transaction has modified any rows, the scan will see these rows set in the invalidation bitmap. These rows are removed from the columnar results and the additional rows required are fetched from the buffer cache before moving on to the next set of column runs. This works well because the most recently modified blocks are the ones most likely to still be in the buffer cache.