Before you can write an XmlPyxReader, you first need to
understand PYX syntax. PYX is a line-oriented XML syntax, developed
by Sean McGrath, which reflects XML’s SGML heritage.
PYX is based on Element Structure Information Set (ESIS), a popular
alternative syntax for SGML.
Unlike many of the terms in this book, PYX is not an acronym for anything. A pyx is a container used in certain religious rites, and the PYX notation was developed mostly using the Python programming language.
In a line-oriented format, each XML node occurs on a new line. The XML nodes that PYX can represent include start element, end element, attribute, character data, and processing instruction. The first character of each line indicates what sort of node the line represents. Table 4-1 shows the prefix characters and what node type each represents.
PYX prefix character | XmlNodeType value
( | Element
A | Attribute
- | Text
) | EndElement
? | ProcessingInstruction
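The prefix-to-node-type mapping can be written down as a simple lookup. The following Python sketch is only an illustration, not part of XmlPyxReader itself; the name classify is hypothetical, the node type names are plain strings standing in for XmlNodeType values, and it assumes that ? marks a processing instruction.

```python
# Prefix characters mapped to the names of the matching XmlNodeType values.
# The names are plain strings here; this is an illustrative sketch only.
PYX_NODE_TYPES = {
    "(": "Element",
    "A": "Attribute",
    "-": "Text",
    ")": "EndElement",
    "?": "ProcessingInstruction",
}


def classify(line):
    """Return the node type name for one PYX line (hypothetical helper)."""
    if not line:
        raise ValueError("empty PYX line")
    prefix = line[0]
    if prefix not in PYX_NODE_TYPES:
        raise ValueError(f"unknown PYX prefix: {prefix!r}")
    return PYX_NODE_TYPES[prefix]
```

Because only the first character is inspected, a line such as -(not a tag) is still classified as text.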
As you can see by the limited number of node types it contains, PYX
represents only the logical structure of an XML
document, not the physical structure. There are no DocumentType, EntityReference, Comment, or CDATA XmlNodeTypes in a PYX
document. This lack of certain nodes is consistent with
PYX’s ESIS ancestry; in SGML, the separation between
document structure and document content is enforced more rigidly than
in XML.
None of this should stop you from using PYX to represent basic XML
documents. In fact, PYX’s structure makes it very
easy to parse using the XmlReader model.
To test your XmlPyxReader, you’ll need a file
in PYX format. Example 4-1 shows the same purchase
order we dealt with in Chapter 2, reformatted in
PYX. A few lines are highlighted; I’ll discuss these
after the example.
(po
Aid PO1456
(date
Ayear 2002
Amonth 6
Aday 14
)date
(address
Atype shipping
(name
-Frits Mendels
)name
(street
-152 Cherry St
)street
(city
-San Francisco
)city
(state
-CA
)state
(zip
-94045
)zip
)address
(address
Atype billing
(name
-Frits Mendels
)name
(street
-PO Box 6789
)street
(city
-San Francisco
)city
(state
-CA
)state
(zip
-94123-6798
)zip
)address
(items
(item
Aquantity 1
AproductCode R-273
Adescription 14.4 Volt Cordless Drill
AunitCost 189.95
)item
(item
Aquantity 1
AproductCode 1632S
Adescription 12 Piece Drill Bit Set
AunitCost 14.95
)item
)items
)po
Notice that all the data matches the data from Example 2-1, although the format is clearly very different.
Each line that begins with ( is a start element, as in the first highlighted line:
(po
This is equivalent to the <po> element start tag. The next highlighted line is an attribute:
Ayear 2002
This is equivalent to year="2002" in standard XML syntax. After the A, the next whitespace-delimited word is the name of the attribute, and the rest of the line contains the attribute value. Multiple attributes on the same element are just listed in order, on separate lines.
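To make the attribute rule concrete, here is a minimal Python sketch that splits an attribute line into its name and value. The function name parse_attribute is hypothetical, and the sketch assumes a single space separates the name from the value.

```python
def parse_attribute(line):
    """Split a PYX attribute line into (name, value).

    Assumes the name and value are separated by one space; everything
    after that space, including further spaces, belongs to the value.
    """
    if not line.startswith("A"):
        raise ValueError(f"not an attribute line: {line!r}")
    name, _, value = line[1:].partition(" ")
    if not name:
        raise ValueError(f"attribute line has no name: {line!r}")
    return name, value
```

Splitting only at the first space is what lets a value such as 14.4 Volt Cordless Drill survive intact.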
Although PYX doesn’t really support XML namespaces, there’s no reason you can’t recognize them yourself. The following PYX fragment shows a way to represent namespaces in PYX:
(myElement
Axmlns http://www.mynamespaceuri.com/
Axmlns:foo http://www.anothernamespaceuri.com/
)myElement
That PYX fragment is equivalent to the following XML fragment:
<myElement xmlns="http://www.mynamespaceuri.com/" xmlns:foo="http://www.anothernamespaceuri.com/" />
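One way to recognize namespaces yourself is to treat xmlns attributes specially as you read them. The Python sketch below is an assumption about how that might look, not something PYX defines; the function name namespace_declarations is hypothetical, and the default namespace is stored under an empty-string key by choice.

```python
def namespace_declarations(lines):
    """Collect namespace declarations from PYX attribute lines.

    Returns a dict mapping each prefix to its URI. The default
    namespace (a plain xmlns attribute) is stored under the key "".
    """
    declared = {}
    for line in lines:
        if not line.startswith("A"):
            continue  # only attribute lines can declare namespaces
        name, _, value = line[1:].partition(" ")
        if name == "xmlns":
            declared[""] = value
        elif name.startswith("xmlns:"):
            declared[name[len("xmlns:"):]] = value
    return declared
```

Given the four lines of the fragment above, this returns a dictionary with two entries, one for the default namespace and one for the foo prefix.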
The next highlighted line in Example 4-1 is an EndElement node:
)date
The name of the element is given after the ) prefix character. This is equivalent to the </date> end tag. Note that there is no PYX shorthand for an empty element, like <item />.
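Because every start element needs its own explicit end element line, a PYX file is easy to check for proper nesting with a stack. The following Python sketch shows one way to do it; check_nesting is a hypothetical helper, not part of any PYX tool.

```python
def check_nesting(lines):
    """Raise ValueError if start and end element lines do not nest properly."""
    open_elements = []  # names of elements started but not yet ended
    for number, line in enumerate(lines, start=1):
        if line.startswith("("):
            open_elements.append(line[1:])
        elif line.startswith(")"):
            name = line[1:]
            if not open_elements or open_elements[-1] != name:
                raise ValueError(f"line {number}: unexpected end of {name!r}")
            open_elements.pop()
    if open_elements:
        raise ValueError(f"element {open_elements[-1]!r} is never ended")
```

A check like this is worth running on hand-written PYX, where it is easy to end the wrong element by mistake.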
The last highlighted line is text:
-Frits Mendels
After the -, the rest of the line contains the element’s text value. Because only the prefix character on any line is significant, the rest of the line can contain any characters, including the PYX prefix characters (, A, -, ), and ?, and XML reserved characters <, >, and &. CDATA sections are thus irrelevant in PYX.
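The reserved characters only matter again when PYX is turned back into standard XML syntax, at which point they must be escaped. The Python sketch below illustrates this with a small converter; pyx_to_xml is a hypothetical name, the sketch uses escape and quoteattr from Python’s standard xml.sax.saxutils module, and it assumes one node per line with no escape sequences for embedded newlines.

```python
from xml.sax.saxutils import escape, quoteattr


def pyx_to_xml(lines):
    """Convert an iterable of PYX lines to an XML string (illustrative only)."""
    out = []
    tag_open = False  # True while a start tag still awaits its closing '>'

    def close_tag():
        nonlocal tag_open
        if tag_open:
            out.append(">")
            tag_open = False

    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue
        prefix, rest = line[0], line[1:]
        if prefix == "(":
            close_tag()
            out.append("<" + rest)
            tag_open = True
        elif prefix == "A":
            if not tag_open:
                raise ValueError(f"attribute outside a start element: {line!r}")
            name, _, value = rest.partition(" ")
            out.append(" " + name + "=" + quoteattr(value))
        elif prefix == "-":
            close_tag()
            out.append(escape(rest))  # escapes &, < and >
        elif prefix == ")":
            close_tag()
            out.append("</" + rest + ">")
        elif prefix == "?":
            close_tag()
            out.append("<?" + rest + "?>")
        else:
            raise ValueError(f"unknown PYX prefix: {prefix!r}")
    return "".join(out)


sample = ["(note", "Akind a<b", "-Tom & Jerry", ")note"]
print(pyx_to_xml(sample))
# prints: <note kind="a&lt;b">Tom &amp; Jerry</note>
```

Note that the converter never emits the empty-element shorthand; an element with no content comes out as a start tag followed immediately by an end tag.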
PYX is a fairly simple format, and XmlPyxReader will be correspondingly simple. Writing a more complex XmlReader is certainly possible, but it would take several chapters’ worth of examples to show all the details. If, after reading this chapter, you’re interested in a considerably more complex model for writing XmlReader subclasses, I urge you to read Ralf Westphal’s article, “Implementing XmlReader Classes for Non-XML Data Structures and Formats.” You can view the article online at http://msdn.microsoft.com/library/en-us/dndotnet/html/Custxmlread.asp.