Forums, mailing lists, and StackOverflow are all great resources for Oracle performance discussions, but I’ve long thought it would be useful to have a dedicated online chat/group specifically for Oracle performance specialists. A place to share news, articles, and discuss performance issues. To test the waters, I’ve created a group: https://t.me/ora_perf. If you’re interested, feel free to join! Let’s build a central hub for Oracle performance discussions.
Before diving into the main topic, let me address a frequent question I’ve received regarding the earlier parts of this series:
“You’re focusing on the rare case of date-only indexes (begin_date, end_date), but most real-world scenarios involve composite indexes with an ID field, like (id, begin_date, end_date).”
Yes, it’s true that composite indexes with an ID field are more common in practice; in fact, exactly such scenarios motivated this series. However, I intentionally started with a simplified case to focus on the mechanics of date filtering. All the issues, observations, conclusions, and solutions discussed so far apply equally to composite indexes.
For example, many production databases have identifiers that reference tens or even hundreds of thousands of intervals. Adding an ID-based access predicate may reduce the scanned volume for a single query, but the underlying date-range filtering problems remain. These inefficiencies often go unnoticed because people don’t realize their simple queries are doing tens of LIOs when they could be doing just 3-5, with response times dropping from 100 microseconds to 2 microseconds.
Even if your queries always use an equality predicate on the ID field, you’ll still run into challenges with large joins, such as:
select *
  from IDs
  join tab_with_history h
    on IDs.id = h.id
   and :dt between h.beg_date and h.end_date
Here, lookups for each ID against the composite index can become inefficient at scale compared to retrieving a pre-filtered slice for the target date.
To clarify, everything discussed in this series applies to composite indexes as well. The solutions can easily be extended to include ID fields by modifying just a few lines of code. Let’s now move to the main topic.
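For illustration only, here is a minimal sketch of that extension (the table name drs_index_tab and the :id bind are hypothetical, not from the series code): the index storage table gains an id column, and the generated query simply adds an equality predicate on it.

-- Hypothetical sketch: the DRS storage table extended with an ID column,
--   create table drs_index_tab (id, beg_date, end_date, rid, ...);
-- each branch of the generated query then filters on it first:
select rid
from drs_index_tab
where id = :id
  and :cmpval between beg_date and end_date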
Dynamic Range Segmentation – Interval Quantization
In the earlier parts, you may have noticed a skew in my test data, with many intervals of 30 days generated for every hour. This naturally leads to the idea of reducing scan volume by splitting long intervals into smaller sub-intervals.
What is Interval Quantization?
Interval quantization is a well-known solution to this problem, but it often comes with drawbacks: traditional quantization requires selecting a single fixed unit (e.g., 1 minute), which may not suit all scenarios, and choosing a unit small enough to cover all cases can lead to an explosion in the number of rows.
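To see the scale of the problem, quantizing a single 30-day interval at a 1-minute unit turns it into 43,200 rows, whereas daily quantization of the same interval yields only 30:

-- One 30-day interval quantized at a 1-minute unit:
select 30*24*60 as rows_per_interval from dual;  -- 43,200 rows
-- The same interval quantized by day: just 30 rows.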
However, since Dynamic Range Segmentation (DRS) already handles short intervals efficiently, we can focus on quantizing only long intervals. For this example, we’ll:
- Keep intervals of up to 1 hour unsplit, storing them in two partitions: up to 15 minutes and up to 1 hour.
- Split longer intervals into sub-intervals of at most 1 day.
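In other words, each interval is routed by its duration. A sketch of the routing logic (the intervals table and the output labels are purely illustrative):

-- Illustrative only: which category an interval falls into by its
-- duration in minutes (the "intervals" table is hypothetical):
select beg_date, end_date
      ,case
          when (end_date-beg_date)*24*60 < 15 then 'part_15_min'
          when (end_date-beg_date)*24*60 < 60 then 'part_1_hour'
          else 'part_1_day (split into pieces of at most 1 day first)'
       end as category
from intervals;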
Creating a SQL Macro for Interval Splitting
To simplify the splitting of long intervals, we’ll write a SQL Macro:
create or replace function split_interval_by_days(beg_date date, end_date date)
  return varchar2 sql_macro
is
begin
  return q'{
    select/*+ no_decorrelate */
           -- the first sub-interval starts at the original beg_date;
           -- every following one starts at midnight of its own day:
           case
              when n = 1
                 then beg_date
              else trunc(beg_date)+n-1
           end as sub_beg_date
           -- every sub-interval except the last ends one second before
           -- the next midnight; the last one ends at the original end_date:
          ,case
              when n<=trunc(end_date)-trunc(beg_date)
                 then trunc(beg_date)+n -1/24/60/60
              else end_date
           end as sub_end_date
    from (select/*+ no_merge */ level n
          from dual
          connect by level<=trunc(end_date)-trunc(beg_date)+1
         )
  }';
end;
/
Source on github: https://github.com/xtender/xt_scripts/blob/master/blog/1.interval_search/drs.v2/split_interval_by_days.sql
This macro returns sub-intervals for any given range:
SQL> select * from split_interval_by_days(sysdate-3, sysdate);
SUB_BEG_DATE SUB_END_DATE
------------------- -------------------
2024-12-17 02:30:34 2024-12-17 23:59:59
2024-12-18 00:00:00 2024-12-18 23:59:59
2024-12-19 00:00:00 2024-12-19 23:59:59
2024-12-20 00:00:00 2024-12-20 02:30:34
Updating the Domain Index
1. Partitioning Changes in ODCIIndexCreate_pr
We’ll modify the partitioning structure:
partition by range(DURATION_MINUTES)
(
  partition part_15_min values less than (15)
 ,partition part_1_hour values less than (60)
 ,partition part_1_day  values less than (1440) -- 24*60 minutes in one day
)
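For context, DURATION_MINUTES lives in the storage table behind the index; a sketch of what the full DDL might look like (the table name and the virtual column definition are assumptions based on the earlier parts; the real DDL is generated in ODCIIndexCreate_pr):

-- Assumed shape of the index storage table (a sketch):
create table drs_index_tab (
  beg_date         date,
  end_date         date,
  rid              rowid,
  duration_minutes number as ((end_date - beg_date) * 24 * 60) virtual
)
partition by range (duration_minutes)
(
  partition part_15_min values less than (15)
 ,partition part_1_hour values less than (60)
 ,partition part_1_day  values less than (1440) -- split pieces never exceed one day
);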
2. Initial Data Population
We’ll use the SQL Macro to populate the index table with split intervals:
-- Now populate the table.
stmt2 := q'[INSERT INTO {index_tab_name} ( beg_date, end_date, rid )
            SELECT SUB_BEG_DATE as beg_date
                  ,SUB_END_DATE as end_date
                  ,P.rowid
            FROM "{owner}"."{tab_name}" P
                ,split_interval_by_days(
                   to_date(substr(P.{col_name}, 1,19),'YYYY-MM-DD HH24:MI:SS')
                  ,to_date(substr(P.{col_name},21,19),'YYYY-MM-DD HH24:MI:SS')
                 )
           ]';
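The substr calls assume the indexed column stores both dates as a single fixed-width string (positions 1-19 and 21-39), in the format used throughout this series; a quick illustration:

-- Assuming the stored format 'YYYY-MM-DD HH24:MI:SS YYYY-MM-DD HH24:MI:SS':
select to_date(substr(v, 1,19),'YYYY-MM-DD HH24:MI:SS') as beg_date
      ,to_date(substr(v,21,19),'YYYY-MM-DD HH24:MI:SS') as end_date
from (select '2012-02-01 10:30:00 2012-03-02 10:30:00' v from dual);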
3. Updating ODCIIndexInsert_pr
procedure ODCIIndexInsert_pr(
   ia     sys.ODCIIndexInfo,
   rid    VARCHAR2,
   newval VARCHAR2,
   env    sys.ODCIEnv
)
IS
BEGIN
   -- Insert the split sub-intervals into the auxiliary index table
   execute immediate
      'INSERT INTO '||get_index_tab_name(ia)||' (rid, beg_date, end_date) '
      ||'select :rid, sub_beg_date, sub_end_date
         from split_interval_by_days(:beg_date, :end_date)'
      using rid, get_beg_date(newval), get_end_date(newval);
END;
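For completeness, get_beg_date and get_end_date are the parsing helpers from the earlier parts; a sketch consistent with the substr offsets above (an assumption, not the verbatim series code):

-- Sketch of the parsing helpers, matching the assumed fixed-width format:
function get_beg_date(val varchar2) return date is
begin
   return to_date(substr(val, 1, 19), 'YYYY-MM-DD HH24:MI:SS');
end;

function get_end_date(val varchar2) return date is
begin
   return to_date(substr(val, 21, 19), 'YYYY-MM-DD HH24:MI:SS');
end;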
4. Query Generation in ODCIIndexStart_Pr
Update the SQL statement to account for the new partitions:
stmt := q'{
select rid from {tab_name} partition (part_15_min) p1
where :cmpval between beg_date and end_date
and end_date < :cmpval+interval'15'minute
union all
select rid from {tab_name} partition (part_1_hour) p2
where :cmpval between beg_date and end_date
and end_date < :cmpval+1/24
union all
select rid from {tab_name} partition (part_1_day ) p3
where :cmpval between beg_date and end_date
and end_date < :cmpval+1
}';
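Note that :cmpval appears six times across the three branches, so with positional binding the value must be supplied once per occurrence when the cursor is opened. A sketch, assuming a ref-cursor open as in the earlier parts:

-- Hypothetical sketch: one positional bind per occurrence of :cmpval
open l_cur for stmt
   using cmpval, cmpval, cmpval, cmpval, cmpval, cmpval;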
Results
SQL> select count(*) from test_table where DATE_IN_RANGE(virt_date_range,date'2012-02-01')=1;
COUNT(*)
----------
943
SQL> @last
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------
SQL_ID 17wncu9ftfzf6, child number 0
-------------------------------------
select count(*) from test_table where
DATE_IN_RANGE(virt_date_range,date'2012-02-01')=1
Plan hash value: 2131856123
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 9218 | 1 |00:00:00.01 | 15 |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 15 |
|* 2 | DOMAIN INDEX | TEST_RANGE_INDEX | 1 | | | 943 |00:00:00.01 | 15 |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("XTENDER"."DATE_IN_RANGE"("VIRT_DATE_RANGE",TO_DATE(' 2012-02-01 00:00:00', 'syyyy-mm-dd
hh24:mi:ss'))=1)
So, by applying quantization on top of Dynamic Range Segmentation, we reduced the number of logical reads from 30 (in the simpler version) to 15, a twofold improvement.
Conclusion
In this example, we used partitions for 15 minutes, 1 hour, and 1 day for simplicity; in practice, the optimal boundaries depend on the actual data. Although splitting increases the number of rows in the index, capping the maximum interval length keeps lookups consistently efficient.
All posts in this series: