
When dealing with validity, keep in mind that an XML document can exist in one of three states:
As a free-form, well-formed XML document that does not have a DTD or schema associated with it
As a well-formed and valid XML document, adhering to a DTD or schema
As a well-formed document that is not valid because it does not conform to the constraints defined by the associated DTD or schema
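Well-formedness, the property all three cases share, can be checked with any XML parser. The following is a minimal Python sketch, assuming the standard library's xml.etree.ElementTree module; the helper name is_well_formed and the sample documents are hypothetical. This parser does not validate against a DTD or schema, so telling the second case from the third requires a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, otherwise False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents for illustration.
good = "<book><title>XML Basics</title></book>"
bad = "<book><title>XML Basics</book>"  # <title> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

The first document parses cleanly; the second is rejected because its title element is never closed.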
Now that you have a general understanding of the XML concepts, the next section examines the constituents
of an XML document.
Components of an XML Document
As mentioned earlier in this chapter, XML is a language for describing data and the structure of data. XML
data is contained in a document, which can be a file, a stream, or any other storage medium, real or virtual,
that’s capable of holding text. A proper XML document begins with the following XML declaration, which
identifies the document as an XML document and specifies the version of XML that the document’s
contents conform to:
<?xml version="1.0"?>
The XML declaration can also include an encoding attribute that identifies the character encoding used
in the document. For example, the following declaration specifies that the document contains characters
from the Latin-1 (ISO-8859-1) character set, which is close to the default Windows-1252 code page used by
Western-language versions of Windows 95, 98, and Windows Me:
<?xml version="1.0" encoding="ISO-8859-1"?>
The next example identifies the encoding as UTF-16, which represents each Unicode character as one or two 16-bit code units:
<?xml version="1.0" encoding="UTF-16"?>
The encoding attribute is optional if the document is encoded in UTF-8 or UTF-16 because an XML
parser can infer the encoding from the document’s first bytes: a byte order mark, if one is present, or the
encoded form of the first five characters, ‘<?xml’. Documents that use other encodings must identify the
encodings that they use to ensure that an XML parser can read them.
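The effect of the encoding attribute can be seen by handing raw bytes to a parser. The following is a minimal Python sketch, assuming the standard library's xml.etree.ElementTree module; the helper name read_word and the one-element sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# The same hypothetical one-element document, serialized three ways.
# "\u00e9" is the letter e with an acute accent.
latin1_declared = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><word>caf\u00e9</word>'
).encode("iso-8859-1")
utf8_undeclared = '<?xml version="1.0"?><word>caf\u00e9</word>'.encode("utf-8")
latin1_undeclared = '<?xml version="1.0"?><word>caf\u00e9</word>'.encode("iso-8859-1")


def read_word(data):
    """Parse raw bytes and return the root element's text, or None on a parse error."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


print(read_word(latin1_declared) == "caf\u00e9")    # declared Latin-1 is read correctly
print(read_word(utf8_undeclared) == "caf\u00e9")    # UTF-8 needs no declaration
print(read_word(latin1_undeclared) is None)         # undeclared Latin-1 fails to parse
```

In the third case the parser assumes UTF-8, and the Latin-1 byte for the accented letter is not a valid UTF-8 sequence, so parsing fails.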
XML declarations are actually specialized forms of XML processing instructions that contain commands for
XML processors. Processing instructions are always enclosed in <? and ?> symbols. Some browsers, such
as Internet Explorer, interpret the following processing instruction to mean that the XML document should
be formatted using a style sheet named Books.xsl before it’s displayed:
<?xml-stylesheet type="text/xsl" href="Books.xsl"?>
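Applications other than browsers can read processing instructions as well. The following is a minimal Python sketch, assuming the standard library's xml.dom.minidom module; the sample document and its Books root element are hypothetical. It lists the target and data of each processing instruction that appears at the top level of the document.

```python
from xml.dom import minidom

# Hypothetical document that carries the style sheet instruction from the text.
source = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/xsl" href="Books.xsl"?>'
    '<Books/>'
)

document = minidom.parseString(source)

# Collect (target, data) pairs for every processing instruction at document level.
instructions = [
    (node.target, node.data)
    for node in document.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

print(instructions)
```

The XML declaration itself is not reported as a processing instruction node, so only the xml-stylesheet instruction appears in the list.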
A valid document does not ensure semantic perfection. Although XML Schema
defines stricter constraints on element and attribute content than XML DTDs do, it
cannot catch all errors. For example, you might define a price datatype that requires
two decimal places; however, you might enter 1600.00 when you meant to enter
16.00, and the schema document wouldn’t catch the error.
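The price example can be sketched in a few lines of Python. This is only an illustration: a regular expression stands in for the schema constraint, and the names PRICE_PATTERN and is_valid_price are hypothetical.

```python
import re

# Stand-in for a schema rule: digits, a decimal point, then exactly two digits.
PRICE_PATTERN = re.compile(r"\d+\.\d{2}")


def is_valid_price(text):
    """Return True if text matches the two-decimal-place price format."""
    return PRICE_PATTERN.fullmatch(text) is not None


print(is_valid_price("16.00"))    # the intended value passes
print(is_valid_price("1600.00"))  # the mistyped value passes too
print(is_valid_price("16.5"))     # only the format violation is rejected
```

Both 16.00 and 1600.00 satisfy the format rule, so a check of this kind can reject malformed values but cannot tell which well-formed value was intended.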