Tags and Elements

You give structure to an XML document using markup, most of which takes the form of elements. An XML element consists of a start tag and an end tag, along with any content between them. The exception is an empty element, which consists of only one tag.

A start tag (also called an opening tag) starts with < and ends with >. End tags (also called closing tags) begin with </ and end with >.

Tag Names

The XML specification is very specific about tag names; you can start a tag name with a letter, an underscore, or a colon. The next characters may be letters, digits, underscores, hyphens, periods, and colons (but no whitespace).
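
As a rough illustration, the naming rule can be approximated with a regular expression. The following Python sketch is not part of the XML specification: the helper name is_valid_tag_name is hypothetical, and the pattern deliberately covers only ASCII letters and digits, whereas the specification also allows many non-ASCII characters.

```python
import re

# Assumption: ASCII-only approximation of the XML name rule.
# First character: a letter, an underscore, or a colon.
# Later characters: letters, digits, underscores, hyphens, periods, colons.
_NAME_PATTERN = re.compile(r"[A-Za-z_:][A-Za-z0-9_.:\-]*")


def is_valid_tag_name(name: str) -> bool:
    """Return True if name matches the ASCII subset of the naming rule."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

With this sketch, is_valid_tag_name("_Record") returns True, while a name that starts with a digit or contains a space is rejected.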

Avoid Colons in Tag Names

Although XML processors must accept colons in names, you should definitely avoid using colons in tag names except as namespace separators, because you use a colon when specifying namespaces in XML, as I'll discuss later in this chapter.

Here are some allowable XML tags:

<DOCUMENT>
<document>
<_Record>
<customer>
<PRODUCT>

Note that because XML processors are case-sensitive, the <DOCUMENT> tag is not the same as a <document> tag. (In fact, you can even have <DOCUMENT> and <document>—and even <DoCuMeNt>—as different tags in the same document, but I really recommend against it.)
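
To see the case sensitivity in practice, here is a minimal sketch that uses the ElementTree parser from Python's standard library. The helper is_well_formed and the wrapper element named root are assumptions made for this illustration only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# An end tag must match its start tag exactly, including case.
matching = "<DOCUMENT></DOCUMENT>"
mismatched = "<DOCUMENT></document>"

# Names that differ only in case are treated as distinct element names.
root = ET.fromstring("<root><DOCUMENT/><document/><DoCuMeNt/></root>")
distinct_names = {child.tag for child in root}
```

Here is_well_formed(matching) is True and is_well_formed(mismatched) is False, and distinct_names holds three separate names.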

Here are the corresponding closing tags:

</DOCUMENT>
</document>
</_Record>
</customer>
</PRODUCT>

These are some tags that XML considers illegal:

<2001DOCUMENT>
<.document>
<Record Number>
<customer*name>
<PRODUCT(ID)>
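
Each of these names breaks the naming rule in a different way. The following Python sketch reports the first problem it finds in a name. The function explain_tag_name is hypothetical, and it assumes an ASCII-only reading of the rule, so it is stricter than the XML specification for non-ASCII letters.

```python
def explain_tag_name(name: str) -> str:
    """Return "ok" or a short reason why name breaks the naming rule.

    Assumption: only ASCII letters and digits are considered here; the XML
    specification also permits many non-ASCII characters.
    """
    if not name:
        return "name is empty"
    first = name[0]
    # A name must start with a letter, an underscore, or a colon.
    if not ((first.isascii() and first.isalpha()) or first in "_:"):
        return f"cannot start with {first!r}"
    # Later characters may also be digits, hyphens, and periods.
    for ch in name[1:]:
        if not ((ch.isascii() and ch.isalnum()) or ch in "_-.:"):
            return f"character {ch!r} is not allowed"
    return "ok"


illegal_names = ["2001DOCUMENT", ".document", "Record Number",
                 "customer*name", "PRODUCT(ID)"]
reasons = {name: explain_tag_name(name) for name in illegal_names}
```

The first two names fail on their first character (a digit and a period), and the other three fail on a later character (a space, an asterisk, and a parenthesis).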

Using start and end tags, you can create elements. The following example has three elements: <DOCUMENT>, <GREETING>, and <MESSAGE>. The <DOCUMENT> element contains the <GREETING> and <MESSAGE> elements:

<?xml version = "1.0" standalone="yes"?>
<DOCUMENT>
    <GREETING>
        Hello From XML
    </GREETING>
    <MESSAGE>
        Welcome to the wild and woolly world of XML.
    </MESSAGE>
</DOCUMENT>
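
One way to confirm the nesting in this example is to load it with a parser and inspect the result. The sketch below uses Python's standard-library ElementTree module; the variable names are chosen for this illustration and are not part of the example document.

```python
import xml.etree.ElementTree as ET

# The example document, reproduced as a Python string.
xml_text = (
    '<?xml version = "1.0" standalone="yes"?>\n'
    "<DOCUMENT>\n"
    "    <GREETING>\n"
    "        Hello From XML\n"
    "    </GREETING>\n"
    "    <MESSAGE>\n"
    "        Welcome to the wild and woolly world of XML.\n"
    "    </MESSAGE>\n"
    "</DOCUMENT>\n"
)

root = ET.fromstring(xml_text)

# The elements directly contained in <DOCUMENT>, in document order.
child_names = [child.tag for child in root]

# Every element in the document, including the root itself.
all_names = [element.tag for element in root.iter()]

# Character data keeps its surrounding whitespace, so strip it for display.
greeting_text = root.find("GREETING").text.strip()
```

In this sketch, root.tag is the containing element's name, and child_names lists the two elements nested inside it.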

You can also create an element without using an end tag if the element is empty, that is, if it has no content.

Empty Elements

Empty elements have only one tag instead of a start tag and an end tag. You may be familiar with empty elements from HTML; for example, the HTML <IMG>, <HR>, and <BR> elements are empty, which is to say that they do not enclose any content (either character data or markup).

Empty elements are represented with only one tag (in HTML, there are no closing </IMG>, </HR>, or </BR> tags). In XML, you can declare elements to be empty in the document's DTD, as we'll see in Chapter 3.

In XML, you close an empty element with />. For example, if the <GREETING> element is empty, it might look like this in an XML document:

<?xml version = "1.0" standalone="yes"?>
<DOCUMENT>
    <GREETING TEXT = "Hello From XML" />
</DOCUMENT>
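
As a quick check of how a parser sees an empty element, the following sketch loads this document with Python's standard-library ElementTree module. The variable names are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<?xml version = "1.0" standalone="yes"?>\n'
    "<DOCUMENT>\n"
    '    <GREETING TEXT = "Hello From XML" />\n'
    "</DOCUMENT>"
)
greeting = root.find("GREETING")

# An empty element has no child elements and no character data,
# but it can still carry attributes.
has_no_children = len(greeting) == 0
has_no_text = greeting.text is None
greeting_attr = greeting.get("TEXT")

# An empty-element tag and a start tag followed immediately by an end tag
# produce the same parsed element.
short_form = ET.fromstring("<GREETING/>")
long_form = ET.fromstring("<GREETING></GREETING>")
same_serialization = ET.tostring(short_form) == ET.tostring(long_form)
```

The attribute value is still available through greeting.get("TEXT") even though the element itself encloses nothing.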

This usage might seem a little strange at first, but this is XML's way of making sure that an XML processor isn't left searching for a nonexistent closing tag. In fact, in XHTML, which is a reformulation of HTML in XML, the <IMG>, <HR>, and <BR> tags are actually used as <IMG />, <HR />, and <BR /> (except that XHTML tags use lowercase letters). The additional / doesn't seem to give the major browsers any trouble. We'll see how to declare empty elements in Chapter 3.
