Chapter 1: What’s in a Data Warehouse?
Figure 1-2: Because you bulk-load data into a data warehouse, the time delay gives you a historical perspective.
[Figure 1-2 diagram: the Central, East, and West stores feed a data
warehouse that is used to monitor the business. A January calendar
accompanies four numbered steps:]
1. Orders are placed during store hours (9-5).
2. Data for each transaction (order) is stored in the application database.
3. All transactions (orders) for the day are sent to the data warehouse at
   midnight, and the application database purges transactions older than 30 days.
4. The data warehouse retains historical data (it doesn’t delete it).
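The nightly cycle in Figure 1-2 can be sketched in a few lines of code. The following is only an illustrative sketch, not how any particular product does it: it assumes two SQLite databases, and the table names (orders, order_history), the column names, and the nightly_load function are all hypothetical names invented for this example. Only the midnight load and the 30-day purge come from the figure.

```python
import sqlite3
from datetime import date, timedelta

RETENTION_DAYS = 30  # from the figure: the application keeps only 30 days of orders

# Hypothetical schema, shared by both databases for simplicity.
COLUMNS = "(order_id INTEGER PRIMARY KEY, store TEXT, order_date TEXT, amount REAL)"


def make_databases():
    """Create an in-memory application database and an in-memory warehouse."""
    app_db = sqlite3.connect(":memory:")
    warehouse_db = sqlite3.connect(":memory:")
    app_db.execute("CREATE TABLE orders " + COLUMNS)
    warehouse_db.execute("CREATE TABLE order_history " + COLUMNS)
    return app_db, warehouse_db


def nightly_load(app_db, warehouse_db, business_day):
    """Copy one day's orders to the warehouse, then purge old orders from the app.

    Returns the number of orders read from the application for that day.
    """
    day = business_day.isoformat()
    rows = app_db.execute(
        "SELECT order_id, store, order_date, amount FROM orders WHERE order_date = ?",
        (day,),
    ).fetchall()

    # INSERT OR IGNORE keeps a rerun of the same night from duplicating rows.
    warehouse_db.executemany(
        "INSERT OR IGNORE INTO order_history VALUES (?, ?, ?, ?)", rows
    )
    warehouse_db.commit()

    # Only the application database is purged; the warehouse never deletes.
    # ISO dates (YYYY-MM-DD) sort correctly as plain text.
    cutoff = (business_day - timedelta(days=RETENTION_DAYS)).isoformat()
    app_db.execute("DELETE FROM orders WHERE order_date < ?", (cutoff,))
    app_db.commit()
    return len(rows)
```

Run the function once per night for each store's application database. Because the purge touches only the application side, the warehouse keeps accumulating history, which is where the historical perspective in the figure comes from.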
It’s Data Warehouse, Not Data Dump
An often-heard argument about what should be stored in a data warehouse
goes something like this: “If I have to take the trouble to pull out data from all
these different applications, why not just get as much as I possibly can? If I
don’t get everything, or as much as possible, I won’t be able to ask all the
business questions I might want to.”
In a commonly related story about knowledge gained from a successful data
warehouse implementation, a grocery-store chain discovered an unusually
high correlation between sales of disposable baby diapers and beer during
a two- or three-hour period early every Friday evening. It found that a
significant number of people on their way home from work were buying both
items. The store then began stocking beer and disposable diapers next to
one another on display shelves, and sales increased significantly.
Although I don’t know whether this story is true (it certainly has been told
often enough), I believe that it confuses the issue when you have to figure out
what should — and should not — be in your data warehouse. The moral of