Datasheet
8
Part I ✦ Laying the Foundation
Reference data stores are primarily read-only and store generic data that is required by the
organization but seldom changes — similar to the reference section of the library. Examples
of reference data might be unit of measure conversion factors or ISO country codes. A refer-
ence data store is tuned for high-performance data retrieval.
A data warehouse collects large amounts of data from multiple master data stores across the
entire enterprise using an Extract-Transform-Load (ETL) process to convert the data from the
various formats and schema into a common format, designed for ease of data retrieval. Data
warehouses also serve as the archival location, storing historical data and releasing some of
the data load from the operational data stores. The data is also pre-aggregated, making
research and reporting easier, thereby reducing errors.
A common data warehouse is essential for ensuring that the entire organization researches
the same data set and achieves the same result for the same query — a critical aspect of
Sarbanes-Oxley and other regulatory requirements.
Data marts are subsets of the data warehouse with pre-aggregated data organized specifically
to serve the needs of one organizational group or one data domain.
The analysis process usually involves more than just SQL queries, and uses data cubes that
consolidate gigabytes of data into dynamic pivot tables. Business intelligence (BI) is the com-
bination of the ETL process, the data warehouse data store, and the acts of creating and
browsing cubes.
Within Bill Gates’ digital nervous system, the data warehouse serves as the memory of the
organization. It stores history and is used for data mining such as trend analysis, such as find-
ing out where (and why) an organization is doing well or is failing. The portion of the digital
nervous system that is used by an organization for thoughtful musings — slowly turning over
a problem and gaining wisdom— is the data warehouse and a BI cube.
Because the primary task of a data warehouse is data retrieval and analysis, the data-integrity
concerns present with a master data store don’t apply. Data warehouses are designed for fast
retrieval and aren’t normalized like master data stores. They are designed using a basic star
schema or snowflake design. Locks generally don’t apply, and the indexing is applied without
adversely affecting inserts or updates.
Master Data Stores Design Styles
Database designers are not limited to the relational model. Several database design styles
exist from which to choose depending on the requirements of the project.
Relational DBMSs
Relational databases are traditional databases that organize similar or related data into a sin-
gle table. Relational databases are excellent with stable data schema requirements that
include a certain minimum of is-a relationships (e.g., a customer is a contact).
Object-Oriented DBMSs
Object-oriented databases align the data structure with object-oriented application design
(OOA/D) for the purpose of persisting application objects. OOA/D is based on the concept
that an object is an instance of an object class. The class defines the properties and methods
of the object, and contains all the code for the object. Each instance of the object can have its
own internal variables and can interact with the software world outside the object on its own.
Although the terms are different, the basic concept of organizing data is similar, as shown in
Table 1-2.
05_542567 ch01.qxp 9/27/06 9:58 PM Page 8