Oracle SQL

Tag Archives: troubleshooting

Correct syntax for the table_stats hint

Posted on April 16, 2019 by Roger MacNicol Posted in adaptive serial direct path reads, CBO, hints, oracle, SmartScan, trace, troubleshooting, undocumented 1 Comment

A friend asked me why they were having problems using the table_stats hint to influence the optimizer's decisions, including the choice between a direct read and a buffer cache scan. This quick post clarifies the syntax, which is not well documented.

table_stats(<table_name> <method> {<keyword>=<value>} )

Method is one of: DEFAULT, SET, SCALE, SAMPLE

Keyword is one of: BLOCKS, ROWS, ROW_LENGTH
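As an illustration of the syntax above, a hint using the SET method might look like the following. This is a sketch only: the hint is undocumented, scott.emp and the numbers are placeholder assumptions, and the effect should be verified on your own version (for example in a 10053 trace).

```sql
-- Hypothetical example: table name and values are placeholders.
-- Follows the pattern table_stats(<table_name> <method> {<keyword>=<value>})
select /*+ table_stats(scott.emp set rows=14 blocks=1 row_length=10) */ *
from scott.emp;
```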
oracle query optimization Roger MacNicol SmartScan troubleshooting

Lateral view decorrelation(VW_DCL) causes wrong results with rownum

Posted on February 16, 2019 by Sayan Malakshinov Posted in 12c, bug, CBO, oracle, query optimizing, rownum, troubleshooting 2 Comments

Everyone knows that rownum in inline views blocks many query transformations, for example predicate pushing and pulling, scalar subquery unnesting, and so on. Many people use it deliberately as a workaround to avoid unwanted transformations (or even CBO bugs).

The main reason is that such transformations would change how rownum is calculated:

If we pull the predicate “column_value = 3” from the following query up to a higher level

select * 
from (select * from table(odcinumberlist(1,1,1,2,2,2,3,3,3)) order by 1)
where rownum <= 2
  and column_value = 3;


COLUMN_VALUE
------------
           3
           3

we will get different results:

select * 
from (select *
      from (select * from table(odcinumberlist(1,1,1,2,2,2,3,3,3)) order by 1)
      where rownum <= 2
     )
where column_value = 3;

no rows selected

Doc ID 62340.1


But we recently encountered a bug with it: a lateral view with ROWNUM returns wrong results when the lateral view is decorrelated.
Compare the results of this query with and without the no_decorrelate hint:

with 
 t1(a) as (select * from table(odcinumberlist(1,3)))
,t2(b) as (select * from table(odcinumberlist(1,1,3,3)))
,t(id) as (select * from table(odcinumberlist(1,2,3)))
select
  *
from t,
     lateral(select/*+ no_decorrelate */ rownum rn 
             from t1,t2 
             where t1.a=t2.b and t1.a = t.id
            )(+)
order by 1,2;

        ID         RN
---------- ----------
         1          1
         1          2
         2
         3          1
         3          2

with 
 t1(a) as (select * from table(odcinumberlist(1,3)))
,t2(b) as (select * from table(odcinumberlist(1,1,3,3)))
,t(id) as (select * from table(odcinumberlist(1,2,3)))
select
  *
from t,
     lateral(select rownum rn 
             from t1,t2 
             where t1.a=t2.b and t1.a = t.id
            )(+)
order by 1,2;

        ID         RN
---------- ----------
         1          1
         1          2
         2
         3          3
         3          4

We can already draw a conclusion from these results: when the lateral view is decorrelated (the query without the hint), rownum is calculated before the join with t. To be sure, we can check the optimizer trace (event 10053):

Final query after transformations:

******* UNPARSED QUERY IS *******
SELECT VALUE(KOKBF$2) "ID", "VW_DCL_76980902"."RN" "RN"
  FROM TABLE("ODCINUMBERLIST"(1, 2, 3)) "KOKBF$2",
       (SELECT ROWNUM "RN_0", VALUE(KOKBF$0) "ITEM_3"
          FROM TABLE("ODCINUMBERLIST"(1, 3)) "KOKBF$0",
               TABLE("ODCINUMBERLIST"(1, 1, 3, 3)) "KOKBF$1"
         WHERE VALUE(KOKBF$0) = VALUE(KOKBF$1)
        ) "VW_DCL_76980902"
 WHERE "VW_DCL_76980902"."ITEM_3"(+) = VALUE(KOKBF$2)
 ORDER BY VALUE(KOKBF$2), "VW_DCL_76980902"."RN"

*************************
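A trace like the one above can be produced with the usual event 10053 commands. This is a minimal sketch, assuming you have the privileges to set events in your session; the tracefile identifier text is an arbitrary choice, and the statement must be hard parsed for the trace to be written.

```sql
-- Tag the trace file so it is easy to find (the identifier text is arbitrary)
alter session set tracefile_identifier = 'vw_dcl_test';

-- Enable the optimizer trace; it is written when the statement is hard parsed
alter session set events '10053 trace name context forever, level 1';

-- ... run the query under investigation here ...

-- Switch the trace off again
alter session set events '10053 trace name context off';

-- Show the trace file name for the current session
select value from v$diag_info where name = 'Default Trace File';
```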


I’ll simplify the final query a bit to make it more readable. We can see that

select
  *
from t,
     lateral(select rownum rn 
             from t1,t2 
             where t1.a=t2.b and t1.a = t.id)(+)
order by 1,2;

was transformed to

select
  t.id, dcl.rn
from t,
     (select rownum rn, t1.a
      from t1,t2 
      where t1.a=t2.b) dcl
where dcl.a(+) = t.id
order by 1,2;

This confirms that rownum was calculated over a different dataset (the t1-t2 join), without the join filter on table t.
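To see the effect directly, the decorrelated form can be run as a standalone statement with the same three CTEs. This is a manual rewrite for illustration only; the alias dcl is a made-up name standing in for VW_DCL_76980902, and t1.a is projected from the inline view so that the outer join predicate can reference it.

```sql
with 
 t1(a) as (select * from table(odcinumberlist(1,3)))
,t2(b) as (select * from table(odcinumberlist(1,1,3,3)))
,t(id) as (select * from table(odcinumberlist(1,2,3)))
select
  t.id, dcl.rn
from t,
     -- rownum is assigned over the whole t1-t2 join, before t is joined
     (select rownum rn, t1.a
      from t1,t2
      where t1.a=t2.b) dcl
where dcl.a(+) = t.id
order by 1,2;
```

It should show the same pattern as the unhinted query, although the exact rn values depend on the order in which the t1-t2 join produces its rows.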
I opened a Severity 1 SR (SR #3-19117219271) more than a month ago, but unfortunately Oracle development doesn’t want to fix this and, moreover, says that it is not a bug. I think this is a dangerous precedent: soon we may no longer be able to rely on how rownum is calculated, or on old fixes that depend on it…

bug cbo lateral query optimization troubleshooting

Create Quarantine

Posted on August 16, 2018 by Roger MacNicol Posted in cell_offload, oracle, SmartScan Leave a comment

First, if you don’t know what an Exadata Quarantine is, read this.

Someone asked whether you can create your own Exadata Cell quarantine and, if you can, why you might ever want to do it.

The first step when you don’t know how to do something is to try HELP in cellcli:

CellCLI> HELP
...
ALTER QUARANTINE
...
CREATE QUARANTINE
...
DROP QUARANTINE
...
LIST QUARANTINE

So we can create a quarantine. Let's use HELP again:

Cell Offloading cellcli Offload Quarantine oracle Roger MacNicol SmartScan troubleshooting

Shining some light on Database In-Memory vs the Exadata Columnar Cache in 12.1.0.2

Posted on August 3, 2018 by Roger MacNicol Posted in cell_offload, inmemory, oracle, SmartScan, trace Leave a comment

I posted a while back on how to use Tracing Hybrid Columnar Compression in an offload server so this is a quick follow up.

  1. I have trouble remembering the syntax for setting a regular parameter in an offload server without bouncing it. Since I need to keep this written down somewhere, I thought it might be of use to support folks and DBAs.
  2. I forgot to show you how to specify which offload group to set the trace event on.

So this example should do both: 

CellCLI> alter cell offloadGroupEvents = "immediate cellsrv.cellsrv_setparam('my_parameter', 'TRUE')", offloadGroupName = "SYS_122110_160621"

This will, of course, set the parameter only until the next time the offload server is bounced; adding it to the offload group’s init.ora as well will take care of that.

Cell Offloading Cellmemory inmemory oracle Roger MacNicol SmartScan traces troubleshooting

Oracle issues after upgrade to 12.2

Posted on November 24, 2017 by Sayan Malakshinov Posted in 12.2, bug, oracle 2 Comments

Sometimes it’s really hard even to create a reproducible test case to send to Oracle Support, especially for intermittent errors.
In such cases, I think it would be really useful to have access to similar service requests or bugs raised by other Oracle customers.
So while my poll about knowledge sharing is still active, I want to share a couple of bugs we have faced after upgrading to 12.2 (and one bug from Eric van Roon). I’m going to remove bugs from this list when they become “public” or “fixed”.
If you want to add your own findings to this list, you can post them in the comments. To make this easier, you can provide just the symptoms, a short description and a link to your own post with the details, and I’ll add it as a link.

12.2.0.1 bind variable bug deterministic functions oracle troubleshooting undocumented oracle

Easy(lazy) way to check which programs have properly configured FetchSize

Posted on November 16, 2017 by Sayan Malakshinov Posted in oracle, SQL, statistics Leave a comment

select 
   s.module
  ,ceil(max(s.rows_processed/s.fetches)) rows_per_fetch
from v$sql s
where 
    s.rows_processed>100
and s.executions    >1
and s.fetches       >1
and s.module is not null
and s.command_type  = 3    -- SELECTs only
and s.program_id    = 0    -- do not account recursive queries from stored procs
and s.parsing_schema_id!=0 -- <> SYS
group by s.module
order by rows_per_fetch desc nulls last
/
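Once a module with a low rows_per_fetch stands out, a per-statement breakdown can show which of its cursors fetch in small batches. This is a minimal sketch that reuses the filters of the query above; the module name 'MY_APP' is a placeholder assumption, not a value from the original query.

```sql
select 
   s.sql_id
  ,s.executions
  ,s.fetches
  ,s.rows_processed
  ,ceil(s.rows_processed/s.fetches) rows_per_fetch
from v$sql s
where 
    s.module        = 'MY_APP' -- placeholder: the module to inspect
and s.rows_processed>100
and s.fetches       >1
and s.command_type  = 3        -- SELECTs only
and s.program_id    = 0        -- skip recursive queries from stored procs
order by rows_per_fetch
/
```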
arraysize fetchsize troubleshooting

Revisiting buffer cache Very Large Object Threshold

Posted on November 15, 2017 by Roger MacNicol Posted in adaptive serial direct path reads, oracle, SmartScan Leave a comment

If you turn on NSMTIO tracing you will see references to VLOT:

qertbFetch:[MTT < OBJECT_SIZE < VLOT]: Checking cost to read from caches (local/remote) and checking storage reduction factors (OLTP/EHCC Comp)
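For reference, NSMTIO tracing is usually switched on per session through the unified tracing event syntax. The commands below are a sketch based on commonly used syntax; NSMTIO is an internal trace component, so treat the exact component name and level as assumptions to verify on your version.

```sql
-- Enable NSMTIO tracing for the current session
alter session set events 'trace[nsmtio] disk=highest';

-- ... run a serial full table scan of the table you are interested in ...

-- Switch the tracing off again
alter session set events 'trace[nsmtio] off';
```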

I had said you could ignore VLOT and Frits Hoogland pointed out that tracing showed it had some impact, so let me clarify:

VLOT is the absolute upper bound on object size at which cached reads can even be considered.

This defaults to 500% of the number of buffers in the cache, i.e.

_very_large_object_threshold = 500
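The current values of the hidden parameters involved can be checked with the well-known query against the x$ parameter tables. This is a sketch that assumes you are connected as SYS, since the x$ tables are not visible to ordinary users.

```sql
select i.ksppinm  as name,
       v.ksppstvl as value
from   x$ksppi  i,
       x$ksppcv v
where  i.indx = v.indx
and    i.ksppinm in ('_very_large_object_threshold',
                     '_small_table_threshold',
                     '_serial_direct_read');
```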

While this number is not used in any calculations, it is used in two places as a cutoff that decides whether those calculations are considered at all:

1) Can we consider using Automatic Big Table Caching (a.k.a. DWSCAN) for this object?

2) Should we do a cost analysis for Buffer Cache scan vs Direct Read scan on tables larger than the MTT? 

The logic for tables that are above the calculated medium table threshold (MTT), are NOT part of searched DMLs, and are NOT on Exadata with the statistics-based storage reduction factor enabled (_statistics_based_srf_enabled) is:

  • If _serial_direct_read == ALWAYS, use Direct Read
  • If _serial_direct_read == NEVER, use Buffer Cache
  • If _serial_direct_read == AUTO and #blocks in table < VLOT, use cost model
  • Else use Direct Read “qertbFetch:DirectRead:[OBJECT_SIZE>VLOT]”
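The decision list above can be sketched as a SQL CASE expression. The values in the params subquery are made-up inputs for illustration, and the column names are introduced here; the real decision is made inside the database kernel, not by a query.

```sql
with params as (
  select 'AUTO' as serial_direct_read,  -- setting of _serial_direct_read
         100000 as cache_buffers,       -- buffers in the cache (made-up)
         500    as vlot_pct,            -- _very_large_object_threshold
         150000 as table_blocks         -- blocks in the table (made-up)
  from dual
)
select case
         when serial_direct_read = 'ALWAYS' then 'Direct Read'
         when serial_direct_read = 'NEVER'  then 'Buffer Cache'
         when serial_direct_read = 'AUTO'
          and table_blocks < cache_buffers * vlot_pct / 100
                                            then 'Cost model'
         else 'Direct Read'                 -- OBJECT_SIZE > VLOT
       end as decision
from params;
```

With these inputs the table is smaller than 5X the cache (150000 < 500000), so the expression returns 'Cost model'.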

In practice 5X buffer cache is so large the cost based decision will come to the same conclusion anyway – the default VLOT simply saves time spent doing the analysis.

For example, I got a quick count of the number of blocks in the non-partitioned TPC-H Scale 1 lineitem table:

select segment_name,sum(blocks),sum(bytes) from user_extents where segment_name='LINEITEM' group by segment_name

and created my buffer cache to be exactly the same size. With this setup, setting _very_large_object_threshold=100 bypassed the cost model and went straight to DR scan, while setting it to 200 forced the use of the cost model. 
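To relate a segment to the cache in the same units as VLOT, its size can be expressed as a percentage of the number of buffers. This is a rough sketch: it assumes the table lives in the DEFAULT buffer pool and that you can query v$buffer_pool, and the object size the kernel actually uses for its decision may differ from the allocated blocks reported here.

```sql
select s.table_blocks,
       p.cache_buffers,
       round(100 * s.table_blocks / p.cache_buffers) as pct_of_cache
from  (select sum(blocks) as table_blocks
       from   user_segments
       where  segment_name = 'LINEITEM') s,
      (select sum(buffers) as cache_buffers
       from   v$buffer_pool
       where  name = 'DEFAULT') p;
```

A pct_of_cache value at or above the VLOT setting would put the table in the "straight to Direct Read" branch described above.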

The moral is that the default value of VLOT rarely changes the decisions made. Only if you reduce VLOT to a much smaller multiple of the cache size will you start to see a few more of your larger buffer cache scans move to direct read, because they are no longer eligible for cost analysis. If you wish to stop some of the largest buffer cache scans from happening, you would need to set _very_large_object_threshold to less than 200.

direct path reads oracle Roger MacNicol SmartScan troubleshooting

Tracing Hybrid Columnar Compression in an offload server

Posted on May 4, 2017 by Roger MacNicol Posted in cell_offload, oracle, SmartScan, trace 3 Comments

I had previously commented on how to use the FPLIB facility in a trace event but the question came up today of how to trace HCC in an offload server.  The facility name in this case is ADVCMP (Advanced Compression) and the hierarchy is:

ADVCMP_MAIN
    ADVCMP_COMP
    ADVCMP_DECOMP

No compression occurs on the cell so we are only interested in the decompression tracing in an offload server.

In this case the correct syntax is [facility.event], so:

cellcli -e 'alter cell offloadgroupEvents = "trace[advcmp.advcmp_decomp.*] disk=lowest"'
oracle Roger MacNicol SmartScan traces troubleshooting

Using trace events with an offload server

Posted on May 4, 2017 by Roger MacNicol Posted in cell_offload, oracle, SmartScan, trace 2 Comments

I’ve noticed that several people who were familiar with using trace events with cellsrv were uncertain about how to use tracing with the new offload server architecture. Whereas in the past you could have added tracing to SmartScan processing with:

> cellcli -e 'alter cell events = "trace[fplib.sage_data] disk=lowest, memory=lowest"'

the new syntax to turn it on is:

> cellcli -e 'alter cell offloadgroupEvents = "trace[fplib.sage_data] disk=lowest, memory=lowest"'

and to turn it off:

> cellcli -e 'alter cell offloadgroupEvents = "trace[fplib.sage_data] off"'

Do not use anything higher than “disk=medium,  memory=medium” since the higher levels of tracing will cause the heartbeat monitor to fail with timeout issues.

The SmartScan trace events available under the fplib (Filter Processing Library) facility are:

        fplib.fplib_main,  fplib.sage_cache,  fplib.sage_txn,  fplib.sage_data,  fplib.sage_index

In the cell trace directory hierarchy, each offload server has its own trace directory:

diag/asm/cell/SYS_122010_150220/trace
diag/asm/cell/SYS_112331_141117/trace

In a future post I will cover how to use named offload groups to isolate the traces for a single session.

Roger MacNicol,

Data Storage Technology Group

oracle Roger MacNicol SmartScan traces troubleshooting

A couple of well-known but often forgotten things for PL/SQL developers

Posted on May 28, 2013 by Sayan Malakshinov Posted in oracle, PL/SQL, troubleshooting 2 Comments

1. Don’t forget to always add NO_DATA_FOUND exception handling when you do a “select into” in code that can be called both from PL/SQL and from SQL.

A little example:
Suppose we need to create a function that calls some procedure:

create or replace procedure p_nested as
  a int;
begin
  select 1 into a from dual where 1=0;
end;
/
create or replace function f_no_data_found return varchar2 as
begin
  p_nested;
  return 'ok';
end;
/

When we call this function in PL/SQL, it will raise NO_DATA_FOUND and we will see it:

SQL> exec dbms_output.put_line(f_no_data_found);
BEGIN dbms_output.put_line(f_no_data_found); END;

*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at "XTENDER.P_NESTED", line 4
ORA-06512: at "XTENDER.F_NO_DATA_FOUND", line 3
ORA-06512: at line 1

But it doesn’t when we call it in SQL, because for SQL this is normal: it is treated just like a scalar subquery that returns no rows, so the result is NULL:

SQL> set null "NUL"
SQL> col ndf format a10
SQL> select f_no_data_found ndf from dual;

NDF
----------
NUL

1 row selected.

So if you want the function to behave the same way in PL/SQL and SQL, just add exception handling that either re-raises a different exception or returns null.
This must become a reflex: “select into” → “exception when no_data_found”.
Otherwise, later, when the code becomes big and complex, you can get intermittent hidden errors.
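A minimal sketch of that reflex applied to the function from the example. The error number -20001 and the message text are arbitrary choices for illustration.

```sql
create or replace function f_no_data_found return varchar2 as
begin
  p_nested;
  return 'ok';
exception
  when no_data_found then
    -- convert NO_DATA_FOUND into an error that SQL will not turn into NULL
    raise_application_error(-20001, 'f_no_data_found: p_nested found no data');
end;
/
```

With this handler, the call from SQL should fail with ORA-20001 instead of silently returning NULL, which matches the behavior of the call from PL/SQL.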

2. Exceptions raised in a declaration section or during default parameter assignment will never be handled by the exception section of the same block

Let’s take a look at a very simple example:

An example of an exception in default parameter assignment

SQL> create or replace function f_value_error return int is
  2  begin
  3    raise value_error;
  4    return 1;
  5  end;
  6  /

Function created.

SQL> create or replace function f(i int:=f_value_error) return varchar2 is
  2  begin
  3    return 'ok';
  4  exception when others then
  5    return dbms_utility.format_error_backtrace;
  6  end;
  7  /

Function created.

SQL> set serverout on;
SQL> begin
  2    dbms_output.put_line('From f: '||chr(10)||f);
  3    dbms_output.put_line('****************************');
  4  exception when others then
  5    dbms_output.put_line('****************************');
  6    dbms_output.put_line('From higher level:'||chr(10)||dbms_utility.format_error_backtrace);
  7    dbms_output.put_line('****************************');
  8  end;
  9  /
****************************
From higher level:
ORA-06512: at "XTENDER.F_VALUE_ERROR", line 3
ORA-06512: at line 2

****************************

PL/SQL procedure successfully completed.


As you can see, there are two problems:
1. the exception was handled at the higher level
2. the error backtrace does not show the call of the function “F”.

If the exception had been raised in the declaration section, we would see the “correct” backtrace, but the exception would still be handled only at the higher level:

In the declaration

SQL> create or replace function f(i int:=null) return varchar2 is
  2    l_i int:=nvl(i,f_value_error);
  3  begin
  4    return 'ok';
  5  exception when others then
  6    return dbms_utility.format_error_backtrace;
  7  end;
  8  /

Function created.

SQL> set serverout on;
SQL> begin
  2    dbms_output.put_line('From f: '||chr(10)||f);
  3    dbms_output.put_line('****************************');
  4  exception when others then
  5    dbms_output.put_line('****************************');
  6    dbms_output.put_line('From higher level:'||chr(10)||dbms_utility.format_error_backtrace);
  7    dbms_output.put_line('****************************');
  8  end;
  9  /
****************************
From higher level:
ORA-06512: at "XTENDER.F_VALUE_ERROR", line 3
ORA-06512: at "XTENDER.F", line 2
ORA-06512: at line 2

****************************

PL/SQL procedure successfully completed.
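One way to keep such an exception inside the function is to move the initialization out of the declaration section and into the executable section, where the function's own handler applies. This is a sketch based on the example above, not part of the original test case.

```sql
create or replace function f(i int:=null) return varchar2 is
  l_i int;  -- declared only; no expression that can raise here
begin
  -- the call now runs in the executable section,
  -- so the handler below can catch its exception
  l_i := nvl(i, f_value_error);
  return 'ok';
exception when others then
  return dbms_utility.format_error_backtrace;
end;
/
```

Called with no argument, this version should return the backtrace from its own handler instead of propagating the error to the caller.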


Sometimes this is not so dangerous, but last week I investigated a complex case caused by it: one function threw a strange exception when called from SQL, but worked fine in PL/SQL.
The exception was:

SQL> select PKG1.F(1,0,0,1275) from dual; 
select PKG1.F(1,0,0,1275) from dual 
       * 
ERROR at line 1: 
ORA-06553: PLS-801: internal error [1401]

The function has many function calls in its default parameter initialization, so I couldn’t even work out which one contained the root problem.

oracle pl/sql troubleshooting
©Sayan Malakshinov. Oracle SQL