Specifications
About XML
<td>Mouse</td>
</tr>
</table>
Representing empty elements
Empty elements cannot be represented in XML in the same way they are in
HTML. An empty element has no content to mark up, so in HTML it is written
with a start tag and no end tag. XML requires every element to be closed, so
there are two ways to handle empty elements:
• Place an end tag immediately after the start tag. For example:
<img href="picture.jpg"></img>
• Use a slash character at the end of the initial tag:
<img href="picture.jpg"/>
This tells a parser that the element consists only of one tag.
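As a rough illustration, the two forms can be compared with a parser. The sketch below uses Python's standard xml.etree.ElementTree module, which is an assumption made for the example and not something InfoMaker itself uses; the describe helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# The two XML spellings of the same empty element.
with_end_tag = ET.fromstring('<img href="picture.jpg"></img>')
self_closing = ET.fromstring('<img href="picture.jpg"/>')


def describe(element):
    """Collect the tag, attributes, text, and child count of an element."""
    return (element.tag, dict(element.attrib), element.text, len(element))


# Both spellings produce the same parsed element.
same = describe(with_end_tag) == describe(self_closing)

# The HTML habit of leaving the element unclosed is rejected by an XML parser.
try:
    ET.fromstring('<p><img href="picture.jpg"></p>')
    html_style_ok = True
except ET.ParseError:
    html_style_ok = False

print(same, html_style_ok)
```

Either form is acceptable to the parser; the unclosed HTML form is not.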
XML is case sensitive
XML is case sensitive, which allows it to be used with non-Latin alphabets.
You must ensure that letter case matches in start and end tags: <MyTag> and
</Mytag> belong to two different elements.
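A small check of this rule, sketched with Python's standard xml.etree.ElementTree module (chosen here only for illustration). The is_well_formed helper is a hypothetical name introduced for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Start and end tags agree in letter case.
matching = is_well_formed("<MyTag>data</MyTag>")

# The end tag differs only in the case of one letter, so it does not
# close the element that <MyTag> opened.
mismatched = is_well_formed("<MyTag>data</Mytag>")

print(matching, mismatched)
```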
White space
White space in the content between an element's start and end tags is passed
through unchanged by XML parsers. It is not collapsed as it is in HTML.
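The following sketch assumes the statement refers to character data between an element's start and end tags. It uses Python's standard xml.etree.ElementTree module, and the note element is a made-up example.

```python
import xml.etree.ElementTree as ET

# Element content with leading, trailing, and repeated white space.
source = "<note>  two   spaces\n  and a line break  </note>"
note = ET.fromstring(source)

# The parser hands the character data to the application as written.
text = note.text
print(repr(text))
```

Any trimming or collapsing of white space is left to the application that consumes the parsed data.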
All elements must be nested
All XML elements must be properly nested. All child elements must be closed
before their parent elements close.
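The nesting rule can be checked in the same way. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the is_well_formed helper and the b and i element names are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The child <i> closes before its parent <b> closes.
nested = is_well_formed("<b><i>text</i></b>")

# The parent <b> closes while the child <i> is still open.
overlapping = is_well_formed("<b><i>text</b></i>")

print(nested, overlapping)
```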
XML parsing
There are two major types of application programming interfaces (APIs) that
can be used to parse XML:
• Tree-based APIs map the XML document to a tree structure. The major
tree-based API is the Document Object Model (DOM) maintained by
W3C. A DOM parser is particularly useful if you are working with a
deeply-nested document that must be traversed multiple times.
For more information about the DOM parser, see the W3C Document
Object Model page at http://www.w3c.org/DOM.
• Event-based APIs use callbacks to report events, such as the start and end
of elements, to the calling application, and the application handles those
events. These APIs provide faster, lower-level access to the XML and are
most efficient when extracting data from an XML document in a single
traversal.
For more information about the best-known event-driven parser, SAX
(Simple API for XML), see the SAX page at http://sax.sourceforge.net/.
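To make the contrast concrete, the sketch below reads the same small document with both styles of API. It assumes Python's standard xml.dom.minidom and xml.sax modules as stand-ins for a DOM parser and a SAX parser; the pets document and the PetHandler class are invented for the example.

```python
import xml.dom.minidom
import xml.sax

DOCUMENT = "<pets><pet>Cat</pet><pet>Mouse</pet></pets>"

# Tree-based: the whole document is loaded into a tree that can be
# queried and traversed as many times as needed.
dom = xml.dom.minidom.parseString(DOCUMENT)
dom_names = [node.firstChild.data for node in dom.getElementsByTagName("pet")]


class PetHandler(xml.sax.ContentHandler):
    """Event-based: collect the text of each <pet> as callbacks arrive."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_pet = False
        self._buffer = []

    def startElement(self, name, attrs):
        # Called when the parser reaches a start tag.
        if name == "pet":
            self._in_pet = True
            self._buffer = []

    def characters(self, content):
        # Character data may be delivered in several pieces.
        if self._in_pet:
            self._buffer.append(content)

    def endElement(self, name):
        # Called when the parser reaches an end tag.
        if name == "pet":
            self.names.append("".join(self._buffer))
            self._in_pet = False


handler = PetHandler()
xml.sax.parseString(DOCUMENT.encode("utf-8"), handler)
sax_names = handler.names

print(dom_names, sax_names)
```

The tree-based version keeps the full document in memory, while the event-based version retains only what the handler chooses to store during its single pass.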