MA TE RI AL SQL and Relational Database Management Systems I PY R IG HT ED nformation may be the most valuable commodity in the modern world. It can take many different forms — such as accounting and payroll information, information about customers and orders, scientific and statistical data, graphics, or multimedia. We are virtually swamped with data, and we cannot — or at least we’d like to think about it this way — afford to lose it.
Part I SQL Basic Concepts and Principles Sufficient capacity A database’s primary function is to store large amounts of information. For example, an order management system for a medium-sized company can easily grow into gigabytes or even terabytes of data; the bigger the company, the more data it needs to store and rely upon. A company that wants to keep historical (archival) data will require even more storage space.
SQL and Relational Database Management Systems Scalability Databases must be flexible and easily adaptable to changing business needs. That primarily means the internal structure of database objects should be easily modifiable with minimum impact on other objects and processes. For example, to add a field in a pre-SQL database, you would have to bring the whole dataset offline — that is, make it inaccessible to users, modify it, change and recompile related programs, and so on.
Part I SQL Basic Concepts and Principles Selecting Your Database Software Every single DBMS on the market follows essentially the same basic principles. There is a wide variety of database products on the market, and it is very difficult for a person without a solid database background to make a decision on what would be the right product to learn or use.
SQL and Relational Database Management Systems Skills are a different story. Database expertise is a costly thing and usually is in short supply. On average, Oracle expertise is valued a little higher than comparable expertise for Microsoft SQL Server or DB2. The total cost of ownership (TCO) analysis released by vendors themselves tends to be biased, so use your best judgment and do your homework before committing your company to a particular vendor.
Part I SQL Basic Concepts and Principles the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO). All three are able to run on the Windows operating system. Oracle also is available on virtually any UNIX flavor, Linux, Apple Mac OS X Server, IBM z/OS. OpenVMS; DB2 for Linux, UNIX, and Windows runs on AIX, HP-UX, Solaris, Linux, and Microsoft Windows. ANSI is a private, nonprofit organization that administers and coordinates the U.S.
SQL and Relational Database Management Systems Real-Life Database Examples To say that databases are everywhere would be an understatement. They virtually permeate our lives. Online stores, health care providers, clubs, libraries, video stores, beauty salons, travel agencies, phone companies, and government agencies such as FBI, INS, IRS, and NASA all use databases.
Part I SQL Basic Concepts and Principles See Appendix B and Appendix F for more detailed descriptions of the ACME sample database and how to install it. Health care provider database A health care provider company has multiple offices in many different states. Many doctors work for the company, and each doctor takes care of multiple patients. Some doctors just work in one office, and others work in different offices on different days.
SQL and Relational Database Management Systems Scientific database A database for genome research and related research in molecular and cellular biology can be a good example of a scientific database. It contains gene catalogs for completely sequenced genomes and some partial genomes, genome maps and organism information, and data about sequence similarities among all known genes in all organisms in the database.
Part I SQL Basic Concepts and Principles them into tables (think of a file cabinet) alleviates the problem somewhat but does not remove the major obstacles: data redundancy (the same information, or even worse, supposedly the same information, might be stored more than once in different files), slow processing speed (‘‘I know it was there somewhere . . .’’), error-prone storage, and retrieval.
SQL and Relational Database Management Systems FIGURE 1-1 Hierarchical structure ROOT CHILD 1/LEVEL 1 CHILD 1/LEVEL 2 CHILD 2/LEVEL 1 CHILD 2/LEVEL 2 CHILD 3/LEVEL 1 CHILD 1/LEVEL 2 CHILD 1/LEVEL 2 It is based on ‘‘parent/child’’ paradigm in which each parent could have many children but each child has one and only one parent. You can visualize this structure as an upside down tree, starting at the root (trunk) and branching out at many levels (see Figure 1-1).
Part I SQL Basic Concepts and Principles FIGURE 1-2 Hierarchical database example ORDER_HEADER CUSTOMER PRODUCT Everything works great as long as you are willing to put up with a somewhat nonintuitive way of retrieving information. (No matter what information is requested, you always have to start with the root, which in this example is the ORDER HEADER table.) Should you need only customers’ names, the hierarchical database would be blazingly fast — going straight from a parent table to the child one.
SQL and Relational Database Management Systems FIGURE 1-3 Network database example ORDER HEADER CUSTOMER SALESMAN PRODUCT In addition to the ability to handle a one-to-many relationship, the network database can handle many-to-many relationships. One-to-one, one-to-many, and many-to-many relationships are explained in Chapter 2, ‘‘Fundamental SQL Concepts and Principles.’’ Also, data access does not have to begin with the root.
Part I SQL Basic Concepts and Principles The common misconception is that the term ‘‘relational’’ comes from the relationships between tables. In fact, the word ‘‘relation’’ is a mathematical term that can be conditionally interpreted as ‘‘table.’’ Tables A table is a basic building unit of the relational database. It is a fairly intuitive way of organizing data and has been around for centuries. A table consists of rows and columns (called records and fields in nonrelational database jargon).
SQL and Relational Database Management Systems The process of grouping the relevant data together, eliminating redundancies along the way, is called normalization and is discussed in Chapter 2, ‘‘Fundamental SQL Concepts and Principles.’’ It is not part of SQL per se, but it does impose limits on the SQL query efficiency. There is no theoretical limit to the number of rows a table could have, although some implementations impose restrictions.
Part I SQL Basic Concepts and Principles warn you if you create a table without defining a primary key. Some purists go even further, specifying that the primary key should be meaningless in the sense that they would use some generated unique value (such as EMPLOYEE ID) instead of, say, Social Security numbers (despite that these are unique as well).
SQL and Relational Database Management Systems The development of relational databases was driven by the needs of big businesses for a medium to gather, preserve, and analyze data. In 1965, Gordon Moore, the cofounder of Intel, made his famous observation that the number of transistors per square inch on integrated circuits (IC) doubles every year. Surprisingly, this rule still holds true.
Part I SQL Basic Concepts and Principles XML is discussed in Chapter 15, ‘‘XML and SQL.’’ XML first became a part of SQL standards in 2003. SQL:2003 contains a separate document (Part 14: ‘‘XML-Related Specifications’’) describing XML standards for SQL. The document was superseded by a major revision in 2006 (mostly for XQuery support), and another revision is pending and is expected to be released sometime in 2008.
SQL and Relational Database Management Systems add 10 identical records for Wile Electronics Inc., completely redundant except for the shipping address (see Figure 1-5). The programs would still have to be changed because otherwise they may return incorrect results. FIGURE 1-4 Multiple columns to resolve multiple addresses for CUSTOMER NAME BILLADDR SHIPADDR_1 SHIPADDR_2 SHIPADDR_N SHIPADDR_10 MAGNETICS USA INC. 123 LAVACA ST. 444 PINE ST. WILE ELECTRONICS INC. 411 LONDON AVE. 232 EEL ST. 454 OAK ST. ...
Part I SQL Basic Concepts and Principles FIGURE 1-6 Primary/foreign key relationship between tables CUSTOMER ADDRESS CUST_ID_N CUST_NAME_S ADDR_ID_N ADDR_CUSTID_FN ADDR_ADDRESS_S ADDR_TYPE_S ADDR_CITY_S ADDR_STATE_S ADDR_ZIP_S ADDR_COUNTRY_S FIGURE 1-7 Resolving the problem of multiple customer addresses within a relational model CUST_ID_N 7 CUST_NAME_S WILE ELECTRONICS INC. ADDR_CUSTID_FN 7 7 7 7 7 7 7 7 7 7 7 ADDR_ADDRESS_S 411 S LONDON AVE. 454 OAK ST. 678 MAPLE AVE. 999 ELK AVE. 777 SITKA AVE.
SQL and Relational Database Management Systems computer language to access the data. The problem was that IBM already had declared its own product, called IMS, as its sole strategic database product — the company management was not convinced at all that developing new commercial software based on a relational schema was worth the money and the effort. A new database product could also potentially hurt the sales of IMS.
Part I SQL Basic Concepts and Principles developments. In 1984, the standard was redesigned to be more generic, to allow for more diversity among database products vendors. After passing through all the bureaucratic loops, it was endorsed as an American National Standards in 1986. The International Organization for Standardization ISO adopted the standard in 1987. A revised standard, commonly known as SQL-89, was published two years later. SQL-89 (SQL1.
SQL and Relational Database Management Systems SQL:1999 (SQL3) SQL:1999 represented the next step in SQL standards development. The efforts to define this standard began over a year before its predecessor — SQL-92 (SQL2) — was adopted. The new standard was developed under the guidance of both the ANSI and ISO committees, and the change introduced into the database world by SQL3 a dramatic shift from a nonrelational to a relational database model.
Part I SQL Basic Concepts and Principles As of this writing, all major database vendors (Oracle, DB2, and Microsoft SQL Server) have at least partial compliance with SQL:2003, but there is no RDBMS vendor on the market who meets the new standards in full. Each of the vendors has individual features that venture into higher levels of conformance.