Datasheet
This is the sort of thing you'd see in a spreadsheet, but there are a couple of big problems with this. For a
start, we have repeated information. John, for example, has his address shown three times. What
happens if he moves house? You'd have to change the address everywhere it occurs. Dave has two
addresses, but notice they are slightly different. Which one is correct? Are neither correct?
To get around these problems, we use a process called Normalization.
Normalization
This is the process of separating repeated information into separate tables. There are whole books
dedicated to database design, but we only need to look at the simplest case. A good beginner book on
database design is Database Design for Mere Mortals: A Hands On Guide to Relational Database Design, by
Michael J. Hernandez
What we need to do is split the previous table into three tables, one for each unique piece of information
–
Customers, Orders, and OrderDetails. To link the three new tables together, we create ID columns
that uniquely identify each row. For example, we could create a column called
CustomerID in the
Customers table. To link the Customers table to the Orders table, we also add this CustomerID to the
Orders table. Let's look at our tables now.
The
Customers table is as follows:
The
Orders table is as follows:
CustomerID Customer Address
1 John 15 High Street
Brumingham
England UK
2 Chris 25 Easterly Way
Cradiff
Wales UK
3 Dave 2 Middle Lane
Oxborough
England UK
Customer Address Order Date Order Item Quantity Item Cost
Dave 3 Middle
Lane
Oxborough
England
UK
01/09/2003 Thingamajig 1 8.50
249
Reading from Databases
57084_08.qxp 30/01/2004 8:02 PM Page 249