Some Rules of XML

XML follows the same kind of hierarchical data structuring rules that apply throughout most programming languages, so XML can represent the same kind of data that we are used to dealing with in our programs. As we'll see later in the chapter, you can always build a tree data structure out of a well-formed XML file, and you can always write a tree out to an XML file. When you want to send XML data to someone, the XML file form is handy. When you want to process the data, the in-memory tree form is handy. The purpose of the Java XML API is to provide classes that make it easy to go from one form to the other, and to grab data on the way.
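As a rough sketch of the two forms, the following shows one way to go from XML text to an in-memory tree and back using the standard javax.xml.parsers and javax.xml.transform packages. The class name XmlRoundTrip and its two method names are made up for this illustration; they are not part of the Java XML API.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named only for this example.
class XmlRoundTrip {
    // Parse XML text into an in-memory DOM tree.
    static Document toTree(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Write a DOM tree back out as XML text.
    static String toText(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Leave out the <?xml ...?> declaration to keep the output short.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

Calling XmlRoundTrip.toTree on a string of XML gives you a Document whose getDocumentElement() method returns the root element, and XmlRoundTrip.toText turns that tree back into text you could save or send.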

The XML element

Notice that all XML tags come in matched pairs of a begin tag and an end tag that surround the data they are describing, like this:

<someTagName>    some data appears here   </someTagName>

The whole thing—start tag, data, and end tag—is called an element.

You can nest elements inside elements, and the end tag for a nested element must come before the end tag of the element that contains it. Here is an example of some XML that is not properly nested:

<cd>   <title>White Christmas  </cd>  </title>

It's wrong because the title field (or “element” to use the proper term) is nested inside the cd element, but its end tag does not appear before we reach the cd end tag. This nesting requirement makes it really easy to check whether a file has properly nested XML. You can just push start tags onto a stack as they come in. When you reach an end tag, it should match the tag on the top of the stack. If it does, pop the opening tag from the stack. If the tag doesn't match, the file has badly nested XML. When you reach the end of the file, the stack should be empty; any tag still on it was never closed.
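The stack check just described can be sketched in a few lines of Java. This is only an illustration, not a real parser: the class name NestingChecker is invented here, and the regular expression is a simplifying assumption that ignores comments and will be fooled by a ">" inside an attribute value.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class for illustration; a real program would use an XML parser.
class NestingChecker {
    // Group 1: "/" for an end tag. Group 2: the tag name.
    // Group 3: "/" for an empty element such as <br/>.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z_][\\w.-]*)[^>]*?(/?)>");

    static boolean isProperlyNested(String xml) {
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(xml);
        while (m.find()) {
            String name = m.group(2);
            boolean isEndTag = !m.group(1).isEmpty();
            boolean isEmptyElement = !m.group(3).isEmpty();
            if (isEndTag) {
                // An end tag must match the most recently opened tag.
                if (open.isEmpty() || !open.pop().equals(name)) {
                    return false;
                }
            } else if (!isEmptyElement) {
                // A start tag waits on the stack until its end tag arrives.
                open.push(name);
            }
        }
        // Anything left on the stack was opened but never closed.
        return open.isEmpty();
    }
}
```

Given the badly nested cd example above, isProperlyNested finds title on top of the stack when the cd end tag arrives, so it returns false.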

XML attributes

Just as some HTML tags can have several extra arguments or “attributes,” so can XML tags. The HTML <img> tag is an example of an HTML tag with several attributes. The <img> tag has attributes that specify the name of an image file, the kind of alignment on the page, and even the width and height in pixels of the image. It might look like this:

<img  src="cover.jpg"  height="150"   width="100"  align="right">

In HTML, we can leave off the quotes around attribute values unless the values contain spaces. In XML, attribute values are always placed in quotation marks (either single or double), and you must not put commas in between attributes. We could equally describe our CD inventory using attributes like this:

<cd  title="The Tubes"  artist="The Tubes"  price="22"  qty="3"> </cd>

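As frequently happens in programming, a software designer can express an idea in several different ways. Some experts recommend avoiding the use of attributes in XML where possible, for technical reasons having to do with expressiveness: an attribute can hold only a single text value, it cannot contain nested elements, and the same attribute name cannot appear twice on one tag.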
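To show how attribute values look from the program's side, here is a small sketch that parses the cd element above and reads its attributes with the DOM method Element.getAttribute. The class name CdAttributes and the stockValue calculation are invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical class for illustration only.
class CdAttributes {
    // Parse XML text and return its root element.
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // Attribute values always arrive as strings, so numbers must be converted.
    static int stockValue(Element cd) {
        int price = Integer.parseInt(cd.getAttribute("price"));
        int qty = Integer.parseInt(cd.getAttribute("qty"));
        return price * qty;
    }
}
```

Note that getAttribute returns an empty string, not null, when the attribute is absent, so a missing price would make Integer.parseInt throw a NumberFormatException in this sketch.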

XML comments

Comments have the same appearance as in HTML, and can be put in a file using this tag (everything between the two pairs of dashes is a comment):

<!-- comments  -->

Well-formed XML documents

XML tags are case-sensitive, so <cd> and <CD> are different tags and one cannot close the other.

XML is generally much stricter about what constitutes a good document than is HTML. This strictness makes it easier for a program to read in an XML file and understand its structure. It doesn't have to check for 50 different ways of doing something. An XML document that keeps all the rules about always having a matching closing tag, all tags being properly nested, and so on is called a “well-formed” document. There is a complete list of all the rules in the XML FAQ at www.ucc.ie/xml/.
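In practice you rarely check these rules by hand, because a parser refuses any document that breaks them. The following sketch, which assumes the standard SAX parser from javax.xml.parsers, treats a SAXException as the signal that the text is not well-formed. The class name WellFormedCheck is made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class for illustration only.
class WellFormedCheck {
    static boolean isWellFormed(String xml) {
        try {
            // DefaultHandler ignores the content; we only care whether parsing succeeds.
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // The parser reports a broken rule by throwing a SAXException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not a problem with the document itself, so pass it along.
            throw new IllegalStateException(e);
        }
    }
}
```

With this sketch, a document whose start and end tags differ only in case is rejected just like one with badly nested tags.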
