Xerces provides two features that cause startEntity and endEntity to report the beginning and end of
these two classes of entity references. The feature
http://apache.org/xml/features/scanner/notify-builtin-refs causes startEntity and endEntity to report
the start and end of a reference to one of the built-in entities, and the feature
http://apache.org/xml/features/scanner/notify-char-refs makes startEntity and endEntity report the
start and end of a character reference.
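As a minimal sketch of how these two features might be switched on, the following Java program sets both feature URIs on a SAX XMLReader and records the names passed to startEntity and endEntity. The class name EntityNotifyDemo, the Recorder handler, and the sample input are illustrative assumptions, not part of Xerces. The sketch obtains its parser through JAXP, so whether the features are honored depends on which parser is on the classpath; the enable helper reports a failure instead of aborting.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

public class EntityNotifyDemo {
    // Feature URIs as given in the text above.
    public static final String NOTIFY_BUILTIN_REFS =
        "http://apache.org/xml/features/scanner/notify-builtin-refs";
    public static final String NOTIFY_CHAR_REFS =
        "http://apache.org/xml/features/scanner/notify-char-refs";

    /** Hypothetical handler: records entity callbacks and character data. */
    public static class Recorder extends DefaultHandler2 {
        public final List<String> started = new ArrayList<>();
        public final List<String> ended = new ArrayList<>();
        public final StringBuilder text = new StringBuilder();
        public boolean featuresEnabled;

        @Override public void startEntity(String name) { started.add(name); }
        @Override public void endEntity(String name) { ended.add(name); }
        @Override public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Tries to turn a feature on; returns false if the parser rejects it. */
    static boolean enable(XMLReader reader, String feature) {
        try {
            reader.setFeature(feature, true);
            return true;
        } catch (SAXException e) { // not recognized or not supported
            return false;
        }
    }

    public static Recorder parse(String xml) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        Recorder recorder = new Recorder();
        boolean builtin = enable(reader, NOTIFY_BUILTIN_REFS);
        boolean chars = enable(reader, NOTIFY_CHAR_REFS);
        recorder.featuresEnabled = builtin && chars;
        reader.setContentHandler(recorder);
        // startEntity and endEntity belong to LexicalHandler, which is
        // registered through a SAX property rather than a setter.
        reader.setProperty(
            "http://xml.org/sax/properties/lexical-handler", recorder);
        reader.parse(new InputSource(new StringReader(xml)));
        return recorder;
    }

    public static void main(String[] args) throws Exception {
        // One built-in entity reference and one character reference.
        Recorder r = parse("<r>&lt;&#65;</r>");
        System.out.println("features enabled: " + r.featuresEnabled);
        System.out.println("startEntity names: " + r.started);
        System.out.println("character data: " + r.text);
    }
}
```

The character data is delivered through characters either way; the features only add the surrounding startEntity and endEntity callbacks.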
The DOM has its own challenges when dealing with entities. Consider this XML file:
<?xml version="1.0" ?>
<!DOCTYPE a [
<!ENTITY boilerplate "insert this here">
]>
<a>
<b>in b</b>
<c>
text in c but &boilerplate;
<d/>
</c>
</a>
When a DOM parser constructs a DOM tree for this file, it creates an Entity node under the DocumentType
node. The resulting DOM tree looks like this, with the DocumentType, Entity, and Text nodes shaded in
gray. The Entity node has a child, which is a Text node containing the expansion text for the entity. So far,
so good.
[Figure: DOM tree for the document above; [lf] marks a Text node holding a line feed]
Document
  DocumentType
    Entity
      Text "insert this here"
  Element a
    Text [lf]
    Element b
      Text "in b"
    Text [lf]
    Element c
    Text [lf]
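The Entity node described above can be inspected with the standard DOM API. The following is a minimal sketch, assuming a JAXP DocumentBuilder backed by Xerces or a compatible parser; the class name EntityNodeDemo is illustrative. In the DOM API, Entity nodes are reached through DocumentType.getEntities() rather than through getChildNodes(). Whether the Entity node actually carries a Text child holding the expansion depends on the parser, so the sketch prints whatever children it finds rather than assuming one.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class EntityNodeDemo {
    // The sample document from the text, inlined so the sketch is self-contained.
    public static final String XML =
        "<?xml version=\"1.0\" ?>\n"
        + "<!DOCTYPE a [\n"
        + "<!ENTITY boilerplate \"insert this here\">\n"
        + "]>\n"
        + "<a>\n"
        + "<b>in b</b>\n"
        + "<c>\n"
        + "text in c but &boilerplate;\n"
        + "<d/>\n"
        + "</c>\n"
        + "</a>\n";

    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        DocumentType doctype = doc.getDoctype();

        // Entities declared in the DTD are exposed as a NamedNodeMap.
        NamedNodeMap entities = doctype.getEntities();
        for (int i = 0; i < entities.getLength(); i++) {
            Node entity = entities.item(i);
            System.out.println("Entity: " + entity.getNodeName());

            // Print any children the parser attached to the Entity node.
            for (Node child = entity.getFirstChild();
                 child != null;
                 child = child.getNextSibling()) {
                System.out.println("  child node type " + child.getNodeType()
                    + ", value: " + child.getNodeValue());
            }
        }
    }
}
```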