Datasheet

This is the sort of thing you'd see in a spreadsheet, but there are a couple of big problems with this. For a

start, we have repeated information. John, for example, has his address shown three times. What

happens if he moves house? You'd have to change the address everywhere it occurs. Dave has two

addresses, but notice they are slightly different. Which one is correct? Are neither correct?

To get around these problems, we use a process called Normalization.

Normalization

This is the process of separating repeated information into separate tables. There are whole books

dedicated to database design, but we only need to look at the simplest case. A good beginner book on

database design is Database Design for Mere Mortals: A Hands On Guide to Relational Database Design, by

Michael J. Hernandez

What we need to do is split the previous table into three tables, one for each unique piece of information

–

Customers, Orders, and OrderDetails. To link the three new tables together, we create ID columns

that uniquely identify each row. For example, we could create a column called

CustomerID in the

Customers table. To link the Customers table to the Orders table, we also add this CustomerID to the

Orders table. Let's look at our tables now.

The

Customers table is as follows:

The

Orders table is as follows:

CustomerID Customer Address

1 John 15 High Street

Brumingham

England UK

2 Chris 25 Easterly Way

Cradiff

Wales UK

3 Dave 2 Middle Lane

Oxborough

England UK

Customer Address Order Date Order Item Quantity Item Cost

Dave 3 Middle

Lane

Oxborough

England

01/09/2003 Thingamajig 1 8.50

249

Reading from Databases

57084_08.qxp 30/01/2004 8:02 PM Page 249