Datasheet
SQL and Relational Database Management Systems 1
The process of grouping the relevant data together, eliminating redundancies along
the way, is called
normalization
and is discussed in Chapter 2, ‘‘Fundamental SQL
Concepts and Principles.’’ It is not part of SQL
per se
, but it does impose limits on the SQL query
efficiency.
There is no theoretical limit to the number of rows a table could have, although some imple-
mentations impose restrictions. Also, there are (or at least ought to be) practical considerations
to the limits: data retrieval speed, amount of storage, and so on.
Relationships
Tables in RDBMS might or might not be related. As mentioned before, RDBMS is built upon
relationships between tables, but unlike in legacy databases (hierarchical and network), these
relations are based solely on the values in the table columns — these relationships are meaningful
in logical terms, not in low-level computer specific pointers. Let’s take the example of our fictitious
order entry database (the one that you will design, build, and use throughout the book). The
ORDER HEADER table is related to the CUSTOMER table since both of these tables have a common set of
values. The field
ORDHDR CUSTID FN (customer ID) in ORDER HEADER (and its values) corresponds
to
CUST ID N in CUSTOMER. The field CUST ID N is declared to be a primary key for the CUSTOMER
table and also is the target for a foreign key from the ORDER HEADER table (in the form of the field
ORDHDR CUSTID FN).
Primary key
The primary key holds more than one job in an RDBMS. As mentioned previously, this is an important
component of relationships. It also can carry ordinary data, such as a department number, part code,
or employee ID; but its primary role is to identify each record in a table uniquely.
In the days of legacy databases, the records were always stored in some predefined order — if
such an order had to be broken (because somebody had inserted records in a wrong order or
business rule was changed), then the whole table (and, most likely, the whole database) had to
be rebuilt. The RDBMS abolishes this fixed-order rule for the records, but it still needs some
mechanism of identifying the records uniquely, and the primary key, based on the idea of a field
(or fields) that contains set unique values, serves exactly this purpose.
In an RDBMS, the primary key is just a particular case of
candidate key
—asetof
one or more columns that has unique value for every row in this table. Each table
can have any number of candidate keys, but only one of them can be declared a primary key.
By its very nature, the primary key cannot be empty. This means that in a table with a defined
primary key, the primary key fields must contain non-null data for each record.
Although it is not a requirement to have a primary key on each and every table, it is
considered good practice to have one. In fact, many RDBMS implementations would
17