Chapter 1
Error-Reporting Features
The next set of features controls the kinds of errors that Xerces reports. The feature
http://apache.org/xml/features/warn-on-duplicate-entitydef
generates a warning if an entity definition is duplicated. When validation is turned on,
http://apache.org/xml/features/validation/warn-on-duplicate-attdef
causes Xerces to generate a warning if an attribute declaration is repeated. Similarly,
http://apache.org/xml/features/validation/warn-on-undeclared-elemdef
causes Xerces to generate a warning if a content model references an element that has not been
declared. All three of these features are provided to help generate more user-friendly error
messages when validation fails.
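To make the effect of these features concrete, here is a minimal sketch that turns all three on through the JAXP SAXParserFactory and counts the warnings the parser reports. It assumes the JAXP implementation on your classpath is Xerces-based and recognizes these feature URIs; the class name, the helper method, and the tiny sample document (which declares the version attribute twice) are hypothetical and exist only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; not part of Xerces.
final class WarningFeaturesSketch {
    static final String[] WARNING_FEATURES = {
        "http://apache.org/xml/features/warn-on-duplicate-entitydef",
        "http://apache.org/xml/features/validation/warn-on-duplicate-attdef",
        "http://apache.org/xml/features/validation/warn-on-undeclared-elemdef"
    };

    // Illustrative document: the version attribute is declared twice.
    static final String SAMPLE =
        "<!DOCTYPE book [\n"
      + "  <!ELEMENT book (#PCDATA)>\n"
      + "  <!ATTLIST book version CDATA #IMPLIED>\n"
      + "  <!ATTLIST book version CDATA #IMPLIED>\n"
      + "]>\n"
      + "<book version=\"1.0\">text</book>";

    // Parses xml with validation on and returns the number of warnings reported.
    static int countWarnings(String xml, boolean enableWarnings) {
        final int[] warnings = {0};
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // two of the features only apply when validating
            for (String feature : WARNING_FEATURES) {
                // Assumption: the underlying parser recognizes these Xerces URIs.
                factory.setFeature(feature, enableWarnings);
            }
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void warning(SAXParseException e) {
                    warnings[0]++; // warnings arrive through the ErrorHandler callback
                }

                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // treat validity errors as failures in this sketch
                }
            });
        } catch (Exception e) {
            // Covers unrecognized features as well as parse failures.
            throw new IllegalStateException("Parser rejected the configuration or input", e);
        }
        return warnings[0];
    }

    public static void main(String[] args) {
        System.out.println("features off: " + countWarnings(SAMPLE, false) + " warning(s)");
        System.out.println("features on:  " + countWarnings(SAMPLE, true) + " warning(s)");
    }
}
```

With the features off, the duplicate declaration passes silently; with them on, a Xerces-based parser should hand the warning to the ErrorHandler, where you can log it or present it to the user.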
DOM-Related Features and Properties
Three features or properties affect Xerces when you’re using the DOM API. To understand the first one,
we have to make a slight digression into the topic of ignorable whitespace.
Ignorable whitespace consists of the whitespace characters that occur between the end of one element
and the start of another. This whitespace is used to format XML documents to make them more readable.
Here is the book example, with a ¶ marking the line ends where ignorable whitespace appears:
<?xml version="1.0" encoding="UTF-8"?>¶
<book xmlns="http://sauria.com/schemas/apache-xml-book/book"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation=
    "http://sauria.com/schemas/apache-xml-book/book
     http://www.sauria.com/schemas/apache-xml-book/book.xsd"
  version="1.0">¶
  <title>XML Development with Apache Tools</title>¶
  <author>Theodore W. Leung</author>¶
  <isbn>0-7645-4355-5</isbn>¶
  <month>December</month>¶
  <year>2003</year>¶
  <publisher>Wrox</publisher>¶
  <address>Indianapolis, Indiana</address>¶
</book>
An XML parser can only determine that whitespace is ignorable when it’s validating, because it needs
the content models from a DTD or schema to know which elements may contain only child elements. The
SAX API makes the notion of ignorable whitespace explicit by providing different callbacks for
characters and ignorableWhitespace. The DOM API doesn’t have any notion of this concept. A DOM parser
must create a DOM tree that represents the document that was parsed. The Xerces feature
http://apache.org/xml/features/dom/include-ignorable-whitespace
allows you to control whether Xerces creates text nodes for ignorable whitespace. If the feature is
false, then Xerces won’t create text nodes for ignorable whitespace. This can save a sizable amount of
memory for XML documents that have been pretty-printed or highly indented.
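The following sketch shows one way to observe the feature at work: it parses a small, pretty-printed document twice with validation on and counts the child nodes of the root element. It assumes a Xerces-based JAXP implementation that accepts the feature URI through DocumentBuilderFactory.setFeature; the class name, method name, and the cut-down sample document with its inline DTD are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; not part of Xerces.
final class IgnorableWhitespaceSketch {
    static final String INCLUDE_IGNORABLE_WHITESPACE =
        "http://apache.org/xml/features/dom/include-ignorable-whitespace";

    // Illustrative document: book allows only child elements, so the
    // line breaks and indentation inside it are ignorable whitespace.
    static final String SAMPLE =
        "<!DOCTYPE book [\n"
      + "  <!ELEMENT book (title, author)>\n"
      + "  <!ELEMENT title (#PCDATA)>\n"
      + "  <!ELEMENT author (#PCDATA)>\n"
      + "]>\n"
      + "<book>\n"
      + "  <title>XML Development with Apache Tools</title>\n"
      + "  <author>Theodore W. Leung</author>\n"
      + "</book>";

    // Returns the number of child nodes the DOM tree holds under <book>.
    static int countBookChildren(boolean includeIgnorableWhitespace) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // the parser needs the DTD to classify whitespace
            // Assumption: the underlying parser recognizes this Xerces feature URI.
            factory.setFeature(INCLUDE_IGNORABLE_WHITESPACE, includeIgnorableWhitespace);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // fail fast on validity errors in this sketch
                }
            });
            Document doc = builder.parse(new InputSource(new StringReader(SAMPLE)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Parser rejected the configuration or input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("include = true:  " + countBookChildren(true) + " child nodes");
        System.out.println("include = false: " + countBookChildren(false) + " child nodes");
    }
}
```

With the feature set to false, only the two element children should remain under book; with it set to true, the whitespace text nodes between them are kept as well. JAXP also offers DocumentBuilderFactory.setIgnoringElementContentWhitespace as a parser-neutral way to request the same behavior.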
Frequently we’re asked if it’s possible to supply a custom DOM implementation instead of the one
provided with Xerces. Doing so takes a fairly large amount of work. The starting point is the property
http://apache.org/xml/properties/dom/document-class-name, which allows you to set the name of
the class to be used as the factory class for all DOM objects. If you replace the built-in Xerces DOM with
your own DOM, then any Xerces-specific DOM features, such as deferred node expansion, are disabled,
because they are all implemented within the Xerces DOM.
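As a rough sketch of that starting point, the property can be handed to a Xerces-based parser through the JAXP DocumentBuilderFactory.setAttribute method. This assumes the JAXP implementation forwards unrecognized attributes to the underlying parser as properties, as the Xerces one does. The helper class and the class name com.example.dom.MyDocument are hypothetical; a real replacement has to implement org.w3c.dom.Document, which is where the large amount of work lies.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper class; not part of Xerces.
final class DocumentClassSketch {
    static final String DOCUMENT_CLASS_NAME =
        "http://apache.org/xml/properties/dom/document-class-name";

    // Asks the factory to use className as the document (factory) class.
    // Returns false if the JAXP implementation rejects the attribute.
    static boolean requestDocumentClass(DocumentBuilderFactory factory, String className) {
        try {
            // Assumption: the implementation passes this attribute through
            // to the parser as a property.
            factory.setAttribute(DOCUMENT_CLASS_NAME, className);
            return true;
        } catch (IllegalArgumentException e) {
            // JAXP signals unrecognized or unsupported attributes this way.
            return false;
        }
    }

    public static void main(String[] args) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // com.example.dom.MyDocument is a placeholder for your own class.
        boolean accepted = requestDocumentClass(factory, "com.example.dom.MyDocument");
        System.out.println("property accepted: " + accepted);
    }
}
```

Accepting the property is not the same as being able to use the class: the parser may not try to load it until a document is actually parsed, so a missing or unsuitable class can still cause a failure at parse time.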