Every dimensional data warehouse needs some sort of time
component. Typically, the time component
includes at least a date dimension where days roll up to months and months roll
up to years. Database functions like YEAR()
and DATEPART() might help segment data over some time continuum, but leveraging
those functions on reports impacts query performance, user complexity and data interpretation.
Why not handle all of those days of weeks, months of quarters and weeks
of years in a centralized date dimension where query performance is optimized
and everyone uses standardized data hierarchy rules? Consider a DimDate table with date rollups.
Back in the day, I used an Excel spreadsheet with macros laying out
weekends, holidays, pay periods and such.
I used a SQL Server Integration Services (SSIS) package to load the
Excel data to my date dimension table. Each
year I would add new dates to my spreadsheet and make sure all my date rules
were properly assigned. Then it occurred
to me … there has to be something better.
Beginning with SQL Server 2005, Common Table Expression (CTE) offered that
better solution. A CTE is a query which
returns a temporary result set that can reference itself. The CTE is then used as a source for a
subsequent query. Rather than delve into
the details of CTE syntaxes, I want to explain how a recursive temporary result
set enables me to simplify my date dimension maintenance.
The following annotated query uses a recursive CTE to generate the value
of each date within a prescribed range … in my case, January 1, 2000 through
December 31, 2012.
|
-- Use variables to
facilitate parameterization DECLARE
@BeginDate DATE,
@EndDate Date
SET @BeginDate
= '01/01/2000' SET @EndDate
= '12/31/2012'
-- Notice the need
of a semicolon between the sets and the CTE ;
-- Build a CTE for
every day between 2000 and 2012 WITH cteDate
AS ( -- Start with
BeginDate
SELECT @BeginDate
AS CalendarDate
UNION ALL -- Add 1 day to each
prior date
SELECT DATEADD(DAY,1,CalendarDate)
FROM
cteDate -- Stop recursion at
EndDate
WHERE
CalendarDate <
@EndDate)
-- Now select dates
from cteDate.CalendarDate and derive new attributes SELECT -- Common attributes
use date functions to generate a smart date key -- and various date
parts (DATEPART(YEAR,CalendarDate) * 10000) + (DATEPART(MONTH,CalendarDate) * 100) + DATEPART(DAY,CalendarDate) AS DateKey,
CalendarDate,
DATEPART(YEAR,CalendarDate) AS YearID,
DATENAME(MONTH,CalendarDate) AS MonthName,
DATENAME(WEEKDAY,CalendarDate) AS DayWeekName,
DATEPART(Week,CalendarDate) AS WeekYearID, -- Custom attributes
apply business rules specific to the business
CASE WHEN DATEPART(MONTH,CalendarDate) IN(3,4,5) THEN 'Spring' WHEN DATEPART(MONTH,CalendarDate) IN(6,7,8) THEN 'Summer' WHEN DATEPART(MONTH,CalendarDate) IN(9,10,11) THEN 'Fall' WHEN DATEPART(MONTH,CalendarDate) = 12 OR DATEPART(MONTH,CalendarDate) IN(1,2) THEN 'Winter'
END AS
SeasonName
FROM cteDate -- Allow an
indefinite number of recursions as long as some -- limit is set
somewhere (e.g., EndDate)
OPTION (MAXRECURSION 0)
|
The top 10 rows returned from the query above appear below.
|
DateKey
|
CalendarDate
|
YearID
|
MonthName
|
DayWeekName
|
WeekYearID
|
SeasonName
|
|
20000101
|
2000-01-01
|
2000
|
January
|
Saturday
|
1
|
Winter
|
|
20000102
|
2000-01-02
|
2000
|
January
|
Sunday
|
2
|
Winter
|
|
20000103
|
2000-01-03
|
2000
|
January
|
Monday
|
2
|
Winter
|
|
20000104
|
2000-01-04
|
2000
|
January
|
Tuesday
|
2
|
Winter
|
|
20000105
|
2000-01-05
|
2000
|
January
|
Wednesday
|
2
|
Winter
|
|
20000106
|
2000-01-06
|
2000
|
January
|
Thursday
|
2
|
Winter
|
|
20000107
|
2000-01-07
|
2000
|
January
|
Friday
|
2
|
Winter
|
|
20000108
|
2000-01-08
|
2000
|
January
|
Saturday
|
2
|
Winter
|
|
20000109
|
2000-01-09
|
2000
|
January
|
Sunday
|
3
|
Winter
|
|
20000110
|
2000-01-10
|
2000
|
January
|
Monday
|
3
|
Winter
|
The query above should be executable in any SQL Server 2005 or later
version. Use that query as a PowerPivot
data source and you have an instant ad hoc date dimension. Use that source as a parameterized data
source in SSIS and you have a package that can maintain your enterprise data
warehouse indefinitely.
Need to update your attributes with new business rules? Simply alter the query logic and derive the
attributes using the new logic. Because
the logic is coded, you can easily manage changes using source controllers such
as Team Foundation Server (TFS).
Of course this same sort of logic can also create an intra-day time
dimension. An example of that time
dimension logic should follow.