Areas of Interest, as counted by my cat

Category: SQL Syntax

Detecting overlapping date ranges

Consider an Event table, with columns A and B defining the start and end of the period of time over which the event occurred.
Consider also a random period of time defined from S to E.
Now imagine that you want to select all event rows that overlap that specific period of time.

By overlap, we mean that case 1,2,4, and 6 should be included, but 3 and 5 should not:

image

The first solution that might occur to us is this one:

select * from Event
where B between S and E or A between S and E

At first glance it looks clear and elegant, but unfortunately it doesn’t handle case #6.
Somewhat less intuitive but still elegant and also complete is:

select * from Event
where B >= S and A <= E

Using recursive Common Table Expressions to generate lists

Say we want create a generated list of values – sequential dates, for example – as a set of rows. We can do this using recursive common table expressions.

Here’s an example of a common table expression to generate a set of dates:

with dates as
(
select date '2012-01-01' as DATE_VALUE from dual
union all
select date '2012-01-02' as DATE_VALUE from dual
union all
select date '2012-01-03' as DATE_VALUE from dual
)
select * from dates;

Using this technique, we could create a set of rows with sequential date values… but it is not very flexible. Ideally, we want a technique that expresses the intent algorithmically.

In 11g R2 and later, we can use a reference to the outer table recursively from inside the definition! This sounds crazy but it works:

with dates as
(
select date '2012-01-01' as DATE_VALUE from dual
union all
select d.DATE_VALUE + interval '1' day as DATE_VALUE
from dates d
where d.DATE_VALUE < date '2013-01-01'
)
select * from dates;

In this example, we are UNIONING the first row with a sub-select against itself, adding a new row that increments the date value by 1.

The first time I tried this technique, I got the following error:

ORA-32039: recursive WITH clause must have column alias list

Some research on the web gave me a page that explained what the problem was. The error message is actually very clear: we need to add a list of columns at the top of the WITH <alias> clause:

with dates ( DATE_VALUE ) as
(
select date '2012-01-01' as DATE_VALUE from dual
union all
select d.DATE_VALUE + interval '1' day as DATE_VALUE
from dates d
where d.DATE_VALUE < date '2013-01-01'
)
select * from dates;

And now, it just works:

SQL>
DATE_VALUE
-----------
2012-01-01
2012-01-02
2012-01-03
2012-01-04
2012-01-05
2012-01-06
2012-01-07
2012-01-08
2012-01-09
2012-01-10
2012-01-11
:

Aside: Here’s how to do the same query in PostgreSQL:

with recursive dates ( DATE_VALUE ) as
(
select date '2012-01-01' as DATE_VALUE
union all
select cast(d.DATE_VALUE + interval '1' day as date) as DATE_VALUE
from dates d
where d.DATE_VALUE < date '2013-01-01'
)
select * from dates;

The nice thing about this technique is that we can control the end date, and the increment size, by variables:

define _increment = '5';
define _start = date '2012-01-01';
define _end = date '2012-02-01';

with dates ( DATE_VALUE ) as
(
select &_start as DATE_VALUE from dual
union all
select d.DATE_VALUE + interval '&_increment' day as DATE_VALUE
from dates d
where d.DATE_VALUE < &_end
)
select * from dates;

This gives us a list of dates 5 days apart.

Using this technique in a PL/SQL block

If you’re in a PL/SQL block, trying to generate a list dynamically based on parameter values, the code will need to look slightly different:

declare
p_start_time date;
p_interval_days integer;
p_end_date date;
--
l_interval interval day to second;
begin
l_interval := numtodsinterval( p_interval_days, 'day' );
insert into My_List_Of_Dates
with
dates ( DATE_VALUE ) as
(
select
p_start_time as DATE_VALUE
from dual
union all
select
DATE_VALUE + l_interval as DATE_VALUE
from dates d
where d.END_TIME < p_end_date
)
select DATE_VALUE from dates;
end;

Testing using SQL and common table expressions

Sometimes I forget how to do simple things, like splitting a string into sections based on a delimiter.

Here’s a nice way to test your code on a range of input values, and see the interim results, in one hit:

with 
myobject as (
-- put in a range of test values:
select 'dbo.MyFunction' as NAME
union
select 'setup.ins_Contact'
union
select 'MyOtherFunction'
),
more2 as (
-- calculate interesting attributes:
select
NAME,
len( NAME ) as LENGTH_,
charindex( '.', NAME ) as POSITION
from myobject
)
-- perform the test:
select
NAME, POSITION, LENGTH_,
case when POSITION=0 then 'dbo'
else substring( NAME, 1, POSITION-1 )
end as SCHEMA_,
substring( NAME, POSITION+1 , (LENGTH_ - POSITION ) ) as OBJECT_
from more2;
;

Implementing a Proper() case function in PL/SQL

I found that I needed to convert some raw column names (i.e. THIS_IS_MY_COLUMN) into a human-readable “Proper” or “Title” case form, e.g. This Is My Column.

Scouring the web, I didn’t find out I could steal, so here’s my implementation:

   function Proper(
p_string varchar2
) return varchar2
is
l_string varchar2(200);
l_proper varchar2(200);
begin
l_proper := '';
-- change any underscores to spaces:
l_string := replace(lower(p_string), '_',' ' );

for i in 1..length(l_string)
loop
-- obviously the first character is UPPER:
if i =1 then
l_proper := l_proper || upper( substr( l_string, i, 1 ) );
else
-- if the character is preceded by a space, UPPER it:
if substr( l_string, i-1, 1 ) = ' ' then
l_proper := l_proper || upper( substr( l_string, i, 1 ) );
else
l_proper := l_proper || substr( l_string, i, 1 );
end if;
end if;
end loop;
return l_proper;
end;
 

Avoiding artificial inflation in SUM on a denormalized column value

Say you have a data model that looks like this:

In other words, time entries can be tagged against multiple tasks. (I know this seems wrong, but trust me, it made sense in the context of the system.)

The problem we had was that the complicated query behind the report of “total hours by task” was counting the hours twice or more, for any entry tagged to more than one task.

[code:c#]select T.NAME, sum(L.HOURS)
from time_x_task X
join timelog L on L.ID=X.TIME_ID
join task T on T.ID=X.TASK_ID
group by T.NAME; [/code]

After thinking about how I could alter the query so that each timelog.HOURS was included only once in the SUM(), it occurred to me that it might be better to just distribute the hours across the tasks:

[code:c#]select L.ID, T.NAME, L.HOURS, L2.cnt, (L.HOURS/L2.CNT) as ACTUAL_HRS
from time_x_task X
join task T on T.ID=X.TASK_ID
join timelog L on L.ID=X.TIME_ID
join (select count(1) as CNT, TIME_ID
from time_x_task
group by TIME_ID) L2 on L2.TIME_ID=L.ID;[/code]

Now we SUM(ACTUAL_HRS) instead of sum(HOURS).

OK, this is a contrived example, but the point I’m trying to make is that sometimes you have to throw away the original question and think about what you’re really trying to achieve.

© 2024 More Than Four

Theme by Anders NorenUp ↑