Oracle SQL

    External Tables Part 1 – Project Columns All vs Referenced

    Posted on October 25, 2017 by Roger MacNicol Posted in External tables, oracle, SmartScan

    I normally blog about table scans on Oracle native data but Oracle also supports a wide variety of features for scanning external tables and I need to cover these too. One of the ways I learn new things is being sent a bug report and saying to myself “Oh! I didn’t know you could do that”. So today I’m going to start with the grammar:

    Alter Table <xt> Project Columns [ All | Referenced ]  
    

    This DDL changes the value in the Property column displayed in user_external_tables:

    SQL> select property
    2    from user_external_tables
    3    where table_name='C_ET';
    
    PROPERTY
    ----------
    ALL       
    

    Here we have an external table defined using the legacy driver ORACLE_LOADER. This driver defaults to projecting all the columns in the base table rather than just those needed to satisfy the query (i.e. the referenced columns) and discarding rows that have data conversion issues up to the reject limit. 

    So for example we have a DATE column in our external table that contains dirty data that won’t convert cleanly to Oracle internal dates using the supplied mask we can either import it as a VARCHAR2 to not lose values or import it as a date but lose rows even on queries that don’t need that date column. We can change the behaviour to only project the referenced columns by 

    SQL> alter table c_et project column referenced;
     
    Table altered.
     
    SQL> 
    SQL> select property 
    2    from user_external_tables
    3    where table_name='C_ET';
     
    PROPERTY   
    ----------
    REFERENCED 
    

    The driver will now ignore unreferenced columns and if the date column is not used we will get all the rows in the external data and the query will run faster since datatype conversion is expensive but we may get an inconsistent number of rows depending on which columns are used in any given query. This is OK if the DBA knows a priori that there are no conversion errors or if they are willing to live with inconsistency in the number of rows returned.

    The big data drivers such as ORACLE_HIVE have a different default behaviour which is to only project referenced columns and to replace data with conversion errors with NULL values and i.e. they default to returning a consistent number of rows with best performance.

    Take away: In order to get consistency and fast table scan performance with ORACLE_LOADER,  the trick is to define the external table with the ‘convert_error store_null’ option and switch to ‘Project Column Referenced’. For example:

    CREATE TABLE "T_XT"
    (
      c0 varchar(10),
      c1 varchar(10)
    )
    ORGANIZATION external
    (
      TYPE oracle_loader
      DEFAULT DIRECTORY DMPDIR
      ACCESS PARAMETERS
      (
        RECORDS DELIMITED BY NEWLINE
        CHARACTERSET  al32utf8
        ESCAPE
        FIELDS CSV WITH EMBEDDED terminated by ',' enclosed by '|'
        REJECT ROWS WITH ALL NULL FIELDS
        nullif equal blanks
        truncate_columns
        convert_error store_null
        (
          "C0" CHAR(10),
          "C1" CHAR(10)
        )
      )
      location
      (
        'simple.dat'
      )
    ) REJECT LIMIT UNLIMITED
    parallel 4;
    

    If you want more information on handling conversion errors when scanning an external table please check the Utilities Guide section on each driver type

    Print PDF eBook
    external tables oracle Roger MacNicol SmartScan
    « PL/SQL functions: Iterate and keys for associative arrays
    Random thoughts on block sizes »
    photo Sayan Malakshinov

    Oracle Database Developer Choice Award winner
    Oracle ACE Oracle ACE
    Oracle performance tuning expert
    UK / Aylesbury / Transmedia Dynamics
    sayan@orasql.org

    Simple Android Oracle client

    Get it on Google Play

    Recent Posts

    • Where does the commit or rollback happen in PL/SQL code?
    • :1 and SP2-0553: Illegal variable name “1”.
    • ORA exceptions that can’t be caught by exception handler
    • Another interesting troubleshooting case
    • SQL*Plus tips #9: Reading traces and incident files

    Recent Comments

    • Oracle SQL | Shining some light on Database In-Memory vs the Exadata Columnar Cache in 12.1.0.2 on Tracing Hybrid Columnar Compression in an offload server
    • Oracle SQL | More on tracing the offload server on Tracing Hybrid Columnar Compression in an offload server
    • Oracle SQL | DMLs and the Columnar Cache on ADW on How to tell if the Exadata column cache is fully loaded
    • Oracle SQL | DMLs and the Columnar Cache on ADW on What’s new in 12.2 CELLMEMORY Part 3
    • Oracle SQL | Serial Scans failing to offload on When bloggers get it wrong – part 2
    • Oracle SQL | Serial Scans failing to offload on When bloggers get it wrong – part 1
    • Oracle SQL | DMLs and the Columnar Cache on ADW on What’s new in 12.2 CELLMEMORY Part 2
    • Oracle SQL | DMLs and the Columnar Cache on ADW on What’s new in 12.2 CELLMEMORY Part 1
    • Oracle SQL | The beginners guide to Oracle Table Scans on When bloggers get it wrong – part 1
    • Case Study | Oracle Scratchpad on Lateral view decorrelation(VW_DCL) causes wrong results with rownum

    Blogroll

    • Alex Fatkulin
    • Alexander Anokhin
    • Andrey Nikolaev
    • Charles Hooper
    • Christian Antognini
    • Coskan Gundogar
    • David Fitzjarrell
    • Igor Usoltsev
    • Jonathan Lewis
    • Karl Arao
    • Mark Bobak
    • Martin Bach
    • Martin Berger
    • Neil Chandler
    • Randolf Geist
    • Richard Foote
    • Riyaj Shamsudeen
    • Tanel Poder
    • Timur Akhmadeev
    • Valentin Nikotin

    Categories

    Aggregated by OraNA Aggregated by OraFAQ

    Meta

    • Log in
    • Entries feed
    • Comments feed
    • WordPress.org
    ©Sayan Malakshinov. Oracle SQL