select nls_env
,count(*) cnt
,xmlcast(xmlagg(xmlelement(job_name, job_name||',')).extract('//text()') as clob) jobs
from dba_scheduler_jobs
group by nls_env;
Laterals: is (+) documented for laterals?
I have known this syntax for a long time, since before lateral() was documented, but recently I found a bug: the following query successfully returns 1 row:
with a as (select level a from dual connect by level<10)
,b as (select 0 b from dual)
,c as (select 0 c from dual)
select
*
from a,
lateral(select * from b where a.a=b.b)(+) bb
--left outer join c on c.c=bb.b
where a=1;
A B
---------- ----------
1
But it doesn't if we uncomment the "left join":
with a as (select level a from dual connect by level<10)
,b as (select 0 b from dual)
,c as (select 0 c from dual)
select
*
from a,
lateral(select * from b where a.a=b.b)(+) bb
left outer join c on c.c=bb.b
where a=1;
no rows selected
And outer apply works fine:
with a as (select level a from dual connect by level<10)
,b as (select 0 b from dual)
,c as (select 0 c from dual)
select
*
from a
outer apply (select * from b where a.a=b.b) bb
left outer join c on c.c=bb.b
where a=1;
A B C
---------- ---------- ----------
1
v$sql_hint.target_level
Today I wanted to give a link to the description of v$sql_hint.target_level to show that no_parallel can be specified for a statement or an object, and though it's pretty obvious, surprisingly I haven't found any articles or posts about it, so this short post describes it.
v$sql_hint.target_level is a bitset, where
1st bit set to 1 means that the hint can be specified on statement level,
2nd – on query block level,
3rd – on object level,
4th – on join level (for multiple objects).
Short example:
select name,sql_feature ,class,inverse ,version,version_outline ,target_level
      ,decode(bitand(target_level,1),0,'no','yes') Statement_level
      ,decode(bitand(target_level,2),0,'no','yes') Query_block_level
      ,decode(bitand(target_level,4),0,'no','yes') Object_level
      ,decode(bitand(target_level,8),0,'no','yes') Join_level
from v$sql_hint h;
with hints as (
    select name,sql_feature ,class,inverse ,version,version_outline ,target_level
          ,decode(bitand(target_level,1),0,'no','yes') Statement_level
          ,decode(bitand(target_level,2),0,'no','yes') Query_block_level
          ,decode(bitand(target_level,4),0,'no','yes') Object_level
          ,decode(bitand(target_level,8),0,'no','yes') Join_level
    from v$sql_hint h
)
select *
from hints
where statement_level='yes'
  and to_number(regexp_substr(version,'^\d+')) >= 18
order by version;
Result:
NAME              SQL_FEATURE     CLASS                VERSION  TARGET_LEVEL STATEMENT_LEVEL QUERY_BLOCK_LEVEL OBJECT_LEVEL JOIN_LEVEL
----------------- --------------- -------------------- -------- ------------ --------------- ----------------- ------------ ----------
PDB_LOCAL_ONLY    QKSFM_DML       PDB_LOCAL_ONLY       18.1.0              1 yes             no                no           no
SUPPRESS_LOAD     QKSFM_DDL       SUPPRESS_LOAD        18.1.0              1 yes             no                no           no
SYSTEM_STATS      QKSFM_ALL       SYSTEM_STATS         18.1.0              1 yes             no                no           no
MEMOPTIMIZE_WRITE QKSFM_EXECUTION MEMOPTIMIZE_WRITE    18.1.0              1 yes             no                no           no
SKIP_PROXY        QKSFM_ALL       SKIP_PROXY           18.1.0              1 yes             no                no           no
CURRENT_INSTANCE  QKSFM_ALL       CURRENT_INSTANCE     18.1.0              1 yes             no                no           no
JSON_LENGTH       QKSFM_EXECUTION JSON_LENGTH          19.1.0              1 yes             no                no           no
QUARANTINE        QKSFM_EXECUTION QUARANTINE           19.1.0              1 yes             no                no           no
Top time-consuming predicates from ASH
Sometimes it might be useful to analyze the top time-consuming filter and access predicates from ASH, especially in cases when database load is spread evenly enough across different queries and top segments do not show anything interesting except the usual things like "some tables are requested more often than others".
Of course, we can start with an analysis of SYS.COL_USAGE$: col_usage.sql
col owner format a30
col oname format a30 heading "Object name"
col cname format a30 heading "Column name"

accept owner_mask prompt "Enter owner mask: ";
accept tab_name   prompt "Enter tab_name mask: ";
accept col_name   prompt "Enter col_name mask: ";

SELECT a.username          as owner
      ,o.name              as oname
      ,c.name              as cname
      ,u.equality_preds    as equality_preds
      ,u.equijoin_preds    as equijoin_preds
      ,u.nonequijoin_preds as nonequijoin_preds
      ,u.range_preds       as range_preds
      ,u.like_preds        as like_preds
      ,u.null_preds        as null_preds
      ,to_char(u.timestamp, 'yyyy-mm-dd hh24:mi:ss') when
FROM  sys.col_usage$ u
    , sys.obj$ o
    , sys.col$ c
    , all_users a
WHERE a.user_id = o.owner#
  AND u.obj#    = o.obj#
  AND u.obj#    = c.obj#
  AND u.intcol# = c.col#
  AND a.username like upper('&owner_mask')
  AND o.name     like upper('&tab_name')
  AND c.name     like upper('&col_name')
ORDER BY a.username, o.name, c.name
;
col owner clear;
col oname clear;
col cname clear;
undef tab_name col_name owner_mask;
But that's not enough; for example, it doesn't show predicate combinations. In this case we can use v$active_session_history and v$sql_plan:
with ash as (
    select sql_id
          ,plan_hash_value
          ,table_name
          ,alias
          ,ACCESS_PREDICATES
          ,FILTER_PREDICATES
          ,count(*) cnt
    from (
        select h.sql_id
              ,h.SQL_PLAN_HASH_VALUE plan_hash_value
              ,decode(p.OPERATION
                     ,'TABLE ACCESS',p.OBJECT_OWNER||'.'||p.OBJECT_NAME
                     ,(select i.TABLE_OWNER||'.'||i.TABLE_NAME
                       from dba_indexes i
                       where i.OWNER=p.OBJECT_OWNER
                         and i.index_name=p.OBJECT_NAME)
               ) table_name
              ,OBJECT_ALIAS ALIAS
              ,p.ACCESS_PREDICATES
              ,p.FILTER_PREDICATES
              -- columns that may be useful for analysis by other dimensions:
              -- ,h.sql_plan_operation
              -- ,h.sql_plan_options
              -- ,decode(h.session_state,'ON CPU','ON CPU',h.event) event
              -- ,h.current_obj#
        from v$active_session_history h
            ,v$sql_plan p
        where h.sql_opname='SELECT'
          and h.IN_SQL_EXECUTION='Y'
          and h.sql_plan_operation in ('INDEX','TABLE ACCESS')
          and p.SQL_ID = h.sql_id
          and p.CHILD_NUMBER = h.SQL_CHILD_NUMBER
          and p.ID = h.SQL_PLAN_LINE_ID
          -- if we want only the last 3 hours:
          -- and h.sample_time >= systimestamp - interval '3' hour
    )
    -- if we want to analyze the predicates of just one table:
    -- where table_name='&OWNER.&TABNAME'
    group by sql_id
            ,plan_hash_value
            ,table_name
            ,alias
            ,ACCESS_PREDICATES
            ,FILTER_PREDICATES
)
,agg_by_alias as (
    select table_name
          ,regexp_substr(ALIAS,'^[^@]+') ALIAS
          ,listagg(ACCESS_PREDICATES,' ') within group(order by ACCESS_PREDICATES) ACCESS_PREDICATES
          ,listagg(FILTER_PREDICATES,' ') within group(order by FILTER_PREDICATES) FILTER_PREDICATES
          ,sum(cnt) cnt
    from ash
    group by sql_id
            ,plan_hash_value
            ,table_name
            ,alias
)
,agg as (
    select table_name
          ,'ALIAS' alias
          ,replace(access_predicates,'"'||alias||'".','"ALIAS".') access_predicates
          ,replace(filter_predicates,'"'||alias||'".','"ALIAS".') filter_predicates
          ,sum(cnt) cnt
    from agg_by_alias
    group by table_name
            ,replace(access_predicates,'"'||alias||'".','"ALIAS".')
            ,replace(filter_predicates,'"'||alias||'".','"ALIAS".')
)
,cols as (
    select table_name
          ,cols
          ,access_predicates
          ,filter_predicates
          ,sum(cnt)over(partition by table_name,cols) total_by_cols
          ,cnt
    from agg
        ,xmltable(
            'string-join(for $c in /ROWSET/ROW/COL order by $c return $c,",")'
            passing
                xmltype(
                    cursor(
                        (select distinct
                            nvl(
                                regexp_substr(
                                    access_predicates||' '||filter_predicates
                                   ,'("'||alias||'"\.|[^.]|^)"([A-Z0-9#_$]+)"([^.]|$)'
                                   ,1
                                   ,level
                                   ,'i',2
                                ),' ') col
                         from dual
                         connect by level<=regexp_count(
                                        access_predicates||' '||filter_predicates
                                       ,'("'||alias||'"\.|[^.]|^)"([A-Z0-9#_$]+)"([^.]|$)'
                                    )
                        )
                    ))
            columns cols varchar2(400) path '.'
        )(+)
    order by total_by_cols desc, table_name, cnt desc
)
select table_name
      ,cols
      ,sum(cnt)over(partition by table_name,cols) total_by_cols
      ,access_predicates
      ,filter_predicates
      ,cnt
from cols
where rownum<=50
order by total_by_cols desc, table_name, cnt desc;
As you can see, it shows the top 50 predicates and their columns, optionally limited to the last 3 hours (uncomment the sample_time filter). Despite the fact that ASH stores just sampled data, its results are representative enough for high-load databases.
Just a few details:
- The column "COLS" shows the "search columns", and TOTAL_BY_COLS shows the number of their occurrences.
- I think it's obvious that this info is not an unambiguous marker of a problem, because, for example, a few full table scans can misrepresent the statistics, so sometimes you will need to analyze such queries more deeply (v$sqlstats, dba_hist_sqlstat; see the sketch below).
- We need to group data by OBJECT_ALIAS within SQL_ID and plan_hash_value, because in case of index access with a lookup to the table ("table access by rowid") some predicates are in the row with the index access and others are in the row with the table access.
Depending on your needs, you can modify this query to analyze ASH data by different dimensions, for example with additional analysis of partitioning or wait events.
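For example, a minimal follow-up sketch (not from the original scripts) that takes a suspicious SQL_ID from the report above and checks its execution statistics in v$sqlstats:

-- &sql_id is a SQL_ID taken from the predicate report above
select sql_id
      ,plan_hash_value
      ,executions
      ,round(elapsed_time/1e6,1)               elapsed_sec
      ,round(buffer_gets/nullif(executions,0)) gets_per_exec
      ,round(disk_reads /nullif(executions,0)) reads_per_exec
from v$sqlstats
where sql_id = '&sql_id'
order by plan_hash_value;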
Correct syntax for the table_stats hint
A friend contacted me to ask why they were having problems using the table_stats hint to influence optimizer decision making and also to influence the decision to use a direct read or a buffer cache scan, so this is just a quick blog post to clarify the syntax, as it is not well documented.
table_stats(<table_name> <method> {<keyword>=<value>} )
Method is one of: DEFAULT, SET, SCALE, SAMPLE
Keyword is one of: BLOCKS, ROWS, ROW_LENGTH
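For example, a minimal sketch following the syntax above (SCOTT.EMP is just a hypothetical table here; verify the effect in your own environment, e.g. with dbms_xplan or a 10053 trace):

-- hypothetical example: pretend EMP is tiny for this statement only,
-- so the optimizer costs it as a 1-block, 14-row table
select /*+ table_stats(scott.emp set rows=14 blocks=1 row_length=10) */ *
from   scott.emp e
where  e.deptno = 10;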
Top-N again: fetch first N rows only vs rownum
Three interesting myths about the row limiting clause vs rownum have recently been posted on our Russian forum:
- A TopN query with rownum<=N is always faster than "fetch first N rows only" (i.e. row_number()over(order by ...)<=N)
- “fetch first N rows only” is always faster than rownum<=N
- “SORT ORDER BY STOPKEY” stores just N top records during sorting, while “WINDOW SORT PUSHED RANK” sorts all input and stores all records sorted in memory.
Interestingly, after Vyacheslav posted the first statement as an axiom, someone posted old tests (from 2009) and a few people ran their own tests showing that "fetch first N rows" is about 2-3 times faster than the query with rownum, and the final conclusion was that "fetch first" is always faster.
First of all I want to show that statement #3 is wrong and that "WINDOW SORT PUSHED RANK" with row_number works similarly to "SORT ORDER BY STOPKEY":
It’s pretty easy to show using sort trace:
Let's create a simple small table TEST1 with 1000 rows, where A is in the range 1-1000 (just 1 block):
create table test1(a not null, b) as
  select level, level from dual connect by level<=1000;

alter session set max_dump_file_size=unlimited;
ALTER SESSION SET EVENTS '10032 trace name context forever, level 10';

ALTER SESSION SET tracefile_identifier = 'rownum';
select * from (select * from test1 order by a) where rownum<=10;

ALTER SESSION SET tracefile_identifier = 'rownumber';
select * from test1 order by a fetch first 10 rows only;
And we can see from the trace files that both queries did the same number of comparisons:
----- Current SQL Statement for this session (sql_id=bbg66rcbt76zt) -----
select * from (select * from test1 order by a) where rownum<=10

---- Sort Statistics ------------------------------
Input records                             1000
Output records                            10
Total number of comparisons performed     999
  Comparisons performed by in-memory sort 999
Total amount of memory used               2048
Uses version 1 sort
---- End of Sort Statistics -----------------------
----- Current SQL Statement for this session (sql_id=duuy4bvaz3d0q) -----
select * from test1 order by a fetch first 10 rows only

---- Sort Statistics ------------------------------
Input records                             1000
Output records                            10
Total number of comparisons performed     999
  Comparisons performed by in-memory sort 999
Total amount of memory used               2048
Uses version 1 sort
---- End of Sort Statistics -----------------------
I.e. each row (except the first one) was compared with the biggest value from the top 10, and since it was bigger than that top-10 value, Oracle didn't compare it with the other TopN values.
And if we change the order of rows in the table, both of these queries will do the same number of comparisons again:
create table test1(a not null, b) as
  select 1000-level, level from dual connect by level<=1000;

alter session set max_dump_file_size=unlimited;
ALTER SESSION SET EVENTS '10032 trace name context forever, level 10';

ALTER SESSION SET tracefile_identifier = 'rownum';
select * from (select * from test1 order by a) where rownum<=10;

ALTER SESSION SET tracefile_identifier = 'rownumber';
select * from test1 order by a fetch first 10 rows only;
----- Current SQL Statement for this session (sql_id=bbg66rcbt76zt) -----
select * from (select * from test1 order by a) where rownum<=10

---- Sort Statistics ------------------------------
Input records                             1000
Output records                            1000
Total number of comparisons performed     4976
  Comparisons performed by in-memory sort 4976
Total amount of memory used               2048
Uses version 1 sort
---- End of Sort Statistics -----------------------
----- Current SQL Statement for this session (sql_id=duuy4bvaz3d0q) -----
select * from test1 order by a fetch first 10 rows only

---- Sort Statistics ------------------------------
Input records                             1000
Output records                            1000
Total number of comparisons performed     4976
  Comparisons performed by in-memory sort 4976
Total amount of memory used               2048
Uses version 1 sort
---- End of Sort Statistics -----------------------
We can see that both queries required many more comparisons (4976) here; that's because each new value is smaller than the biggest value in the top N, and even smaller than the lowest value, so Oracle has to find the right position for it, and that requires about 5 comparisons (it compares with the 10th value, then with the 6th, 3rd, 2nd and 1st values from the top 10). Obviously it makes fewer comparisons for the first 10 rows.
Now let’s talk about statements #1 and #2:
We know that rownum forces the optimizer mode to switch to "first K rows", because of the parameter "_optimizer_rownum_pred_based_fkr":
SQL> @param_ rownum

NAME                               VALUE  DEFLT  TYPE      DESCRIPTION
---------------------------------- ------ ------ --------- ------------------------------------------------------
_optimizer_rownum_bind_default     10     TRUE   number    Default value to use for rownum bind
_optimizer_rownum_pred_based_fkr   TRUE   TRUE   boolean   enable the use of first K rows due to rownum predicate
_px_rownum_pd                      TRUE   TRUE   boolean   turn off/on parallel rownum pushdown optimization
while fetch first/row_number does not (this will change after patch #22174392), and that leads to the following consequences:
1. first_rows disables the serial direct reads optimization (or smart scan on Exadata); that's why the tests with big tables showed that "fetch first" was much faster than the query with rownum.
So if we set “_serial_direct_read”=always, we get the same performance in both tests (within the margin of error).
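Just to be explicit, a sketch of how that can be done for such a test ("_serial_direct_read" is a hidden parameter, so use it for experiments only):

-- force serial direct path reads for the current session, re-run both TopN tests,
-- then switch back to the default behaviour
alter session set "_serial_direct_read" = always;
-- ... run the rownum and fetch-first queries here ...
alter session set "_serial_direct_read" = auto;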
2. In cases when index access (index full scan/index range scan) is better, the CBO calculates the cardinality of the underlying INDEX FULL (RANGE) SCAN differently: the query with rownum is optimized for first_k_rows and the cardinality of the index access is equal to K rows, but the CBO doesn't reduce the cardinality for "fetch first", so the cost of the index access is much higher. Compare them:
SQL> explain plan for
  2  select *
  3  from (select * from test order by a,b)
  4  where rownum<=10;

--------------------------------------------------------------------------------------------
| Id  | Operation                     | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |    10 |   390 |     4   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |            |       |       |            |          |
|   2 |   VIEW                        |            |    10 |   390 |     4   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| TEST       |  1000K|    12M|     4   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN           | IX_TEST_AB |    10 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(ROWNUM<=10)
SQL> explain plan for
  2  select *
  3  from test
  4  order by a,b
  5  fetch first 10 rows only;

-----------------------------------------------------------------------------------------
| Id  | Operation                | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |      |    10 |   780 |       |  5438   (1)| 00:00:01 |
|*  1 |  VIEW                    |      |    10 |   780 |       |  5438   (1)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|      |  1000K|    12M|    22M|  5438   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | TEST |  1000K|    12M|       |   690   (1)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=10)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "TEST"."A","TEST"."B")<=10)
SQL> explain plan for
  2  select/*+ first_rows */ *
  3  from test
  4  order by a,b
  5  fetch first 10 rows only;

--------------------------------------------------------------------------------------------
| Id  | Operation                     | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |    10 |   780 | 27376   (1)| 00:00:02 |
|*  1 |  VIEW                         |            |    10 |   780 | 27376   (1)| 00:00:02 |
|*  2 |   WINDOW NOSORT STOPKEY       |            |  1000K|    12M| 27376   (1)| 00:00:02 |
|   3 |    TABLE ACCESS BY INDEX ROWID| TEST       |  1000K|    12M| 27376   (1)| 00:00:02 |
|   4 |     INDEX FULL SCAN           | IX_TEST_AB |  1000K|       |  2637   (1)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=10)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "TEST"."A","TEST"."B")<=10)
SQL> explain plan for
  2  select/*+ index(test (a,b)) */ *
  3  from test
  4  order by a,b
  5  fetch first 10 rows only;

--------------------------------------------------------------------------------------------
| Id  | Operation                     | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |    10 |   780 | 27376   (1)| 00:00:02 |
|*  1 |  VIEW                         |            |    10 |   780 | 27376   (1)| 00:00:02 |
|*  2 |   WINDOW NOSORT STOPKEY       |            |  1000K|    12M| 27376   (1)| 00:00:02 |
|   3 |    TABLE ACCESS BY INDEX ROWID| TEST       |  1000K|    12M| 27376   (1)| 00:00:02 |
|   4 |     INDEX FULL SCAN           | IX_TEST_AB |  1000K|       |  2637   (1)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=10)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY "TEST"."A","TEST"."B")<=10)
So in this case we can add the "first_rows" or "index" hints, or install patch #22174392.
PS. I had planned to post this note later, since I didn't have enough time to add other interesting details about the different TopN variants, including "with ties", rank(), etc., so I'll post another note with more details later.
Docker with Oracle database: install patches automatically
Recently I had to install the patch fixing a cross-platform PDB transport bug onto Docker images with Oracle, so here is an easy way to do it:
1. Create a directory "patches" and create "install_patches.sh" in it:
#!/bin/bash

unzip -u ./*.zip

CURDIR=`pwd`

for D in *; do
    if [ -d "${D}" ]; then
        echo =================================================
        echo " *** Processing patch # ${D}... "
        # your processing here
        cd "${D}"
        opatch apply -silent
    fi
    cd $CURDIR
done
2. Add the following commands into the Dockerfile:
USER root
# Copy patches
COPY --chown=oracle:dba patches $INSTALL_DIR/patches

# Install patches
USER oracle
RUN chmod ug+x $INSTALL_DIR/patches/*.sh && \
    sync && \
    cd $INSTALL_DIR/patches && \
    ./install_patches.sh
3. Put the downloaded patches into the "patches" directory and build the image.
For example, dockerfile for 18.3:
FROM oracle/database:18.3.0-ee
MAINTAINER Sayan Malakshinov <sayan@orasql.org>

USER root
# Copy patches:
COPY --chown=oracle:dba patches $INSTALL_DIR/patches

# Install patches:
USER oracle
RUN chmod ug+x $INSTALL_DIR/patches/*.sh && \
    sync && \
    cd $INSTALL_DIR/patches && \
    ./install_patches.sh
PS. I've also created a request for that in the official Docker GitHub repository: https://github.com/oracle/docker-images/issues/1070
Compression in a well-balanced system
Exadata is designed to present a well-balanced system between disk scan rates, cell CPU processing rates, network bandwidth, and RDBMS CPU, such that on average, across a lot of workloads and a lot of applications, there should be no one obvious bottleneck. Good performance engineering means avoiding a design like this:
because that would be nuts. If you change one component in a car, e.g. the engine, you need to rework the other components, such as the chassis and the brakes. When we rev the Exadata hardware we go through the same exercise to make sure the components match, to avoid bottlenecks.
One of the components in a well-balanced system that people often leave off this list is the compression algorithm. With compression you are balancing storage costs as well as disk scan rates against the CPUs' decompression rate. In a disk-bound system you can alleviate the bottleneck by compressing the data down to the point that it is being scanned at the same rate that the CPUs can decompress and process it. Squeeze it too far, say with HCC "Compress For Archive High", which uses BZ2, and the CPUs become a major bottleneck because the decompression costs unbalance the system (but it's wonderful if you think you're unlikely to ever need the data again).
Create Quarantine
First, if you don't know what an Exadata Quarantine is, read this.
Someone asked whether you can create your own Exadata Cell quarantine and, if you can, why you might ever want to do it.
The first step when you don't know how to do something is to try HELP in CellCLI:
CellCLI> HELP
...
ALTER QUARANTINE
...
CREATE QUARANTINE
...
DROP QUARANTINE
...
LIST QUARANTINE
So we see that we can create a quarantine, and we use HELP again:
Shining some light on Database In-Memory vs the Exadata Columnar Cache in 12.1.0.2
I posted a while back on Tracing Hybrid Columnar Compression in an offload server, so this is a quick follow-up.
- I have trouble remembering the syntax for setting a regular parameter in an offload server without bouncing it. Since I need to keep this written down somewhere, I thought it might be of use to support folks and DBAs.
- I forgot to show how to specify which offload group to set the trace event in.
So this example should do both:
CellCLI> alter cell offloadGroupEvents = "immediate cellsrv.cellsrv_setparam('my_parameter', 'TRUE')", offloadGroupName = "SYS_122110_160621"
This will, of course, set the parameter only temporarily, until the next time the offload server is bounced, but adding it to the offload group's init.ora as well will take care of that.