Summary of XML Document Markup

XML documents are composed of markup and content. Six kinds of markup can occur in an XML document: elements, attributes, comments, marked (CDATA) sections, entity references, and document type declarations. The following sections introduce each of these markup concepts. All markup is case-sensitive; for example, the tag <FirstName> is different from <firstname>.

Building Elements

Elements are the building blocks of an XML document. The content of an element is surrounded by a start tag and an end tag: an element that is not empty begins with the start tag, <element>, and ends with the end tag, </element>. An element with no content is empty and can be written in a one-tag format, such as <element/>. The <name>. . .</name> tags, in the earlier listing, are an example of an element that contains subelements.
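As a quick way to see these rules in action, here is a minimal sketch that uses Python's standard xml.etree.ElementTree module. The student document and the is_well_formed helper are made up for illustration; they are not part of the earlier listing.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two tags that differ only in case, plus an empty element.
doc = "<student><FirstName>Ann</FirstName><firstname>ann</firstname><nickname/></student>"
root = ET.fromstring(doc)

first_upper = root.find("FirstName").text  # matches only <FirstName>
first_lower = root.find("firstname").text  # matches only <firstname>
nickname = root.find("nickname")           # empty element: no text, no children


def is_well_formed(text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A start tag and an end tag that differ in case do not match each other.
mismatched_ok = is_well_formed("<FirstName>Ann</firstname>")
```

The two find calls return different elements, which is the practical effect of case sensitivity.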

Adding Attributes

Attributes are name-value pairs that occur inside the start tag of an element and supply additional information about that element. For example, in the start tag <student gender="m">, the attribute gender with the value "m" is added to the student element. In XML, all attribute values must be quoted, with either single or double quotation marks.
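The following sketch, again using Python's standard xml.etree.ElementTree module as an assumed stand-in for whatever parser you use, reads the gender attribute and shows that an unquoted value is rejected.

```python
import xml.etree.ElementTree as ET

# The start tag from the text, written here as an empty element.
student = ET.fromstring('<student gender="m"/>')
gender = student.get("gender")     # look up one attribute by name
attributes = dict(student.attrib)  # all attributes as a plain dictionary

# An unquoted attribute value is not well-formed XML.
try:
    ET.fromstring("<student gender=m/>")
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False
```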

Adding Comments to Your Documents

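Comments begin with "<!--" and end with "-->". Comments can contain any data except the literal string "--". You can place comments anywhere in your document outside other markup (a comment cannot appear inside a tag, for example). When an XML document is processed by a parser, comments are not part of the document's content, and most applications ignore them. Here's an example of a comment: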

<!-- This is just a comment -->
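To illustrate, here is a small sketch with Python's standard xml.etree.ElementTree module, whose default parser drops comments. The note element is a hypothetical wrapper added so that the comment sits inside a complete document.

```python
import xml.etree.ElementTree as ET

# The comment is skipped; only the character data reaches the application.
note = ET.fromstring("<note><!-- This is just a comment -->Call home</note>")
text = note.text
child_count = len(note)

# The string "--" is not allowed inside a comment.
try:
    ET.fromstring("<note><!-- bad -- comment --></note>")
    double_hyphen_accepted = True
except ET.ParseError:
    double_hyphen_accepted = False
```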

What's in the CDATA Section

In a document, a CDATA section instructs the parser to ignore markup characters within the section; in effect, everything within a CDATA section is treated as literal text rather than as markup. CDATA stands for character data (that is, unparsed character data).

Consider a mathematical equation, or source code, as content in an XML document. It might contain characters that have a special meaning in markup (such as <, >, and &), and the XML parser would ordinarily try to interpret < and & as the start of markup. To prevent this, a CDATA section can be used. Consider the following example:

<![CDATA[
   if (a>5) (b = 3);
]]>

All character data between the start of the section, <![CDATA[, and the end of the section, ]]>, is passed directly to the application without parsing.
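The effect can be checked with a short sketch in Python, using the standard xml.etree.ElementTree module and the escape function from xml.sax.saxutils. The source string and the code element are hypothetical; they are chosen so that the snippet contains < and &, which a parser cannot accept unprotected.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical source code that contains markup characters.
source = "if (a<5 && b>3) run();"

# Option 1: wrap the text in a CDATA section.
cdata_doc = "<code><![CDATA[" + source + "]]></code>"
# Option 2: escape the markup characters with predefined entities.
escaped_doc = "<code>" + escape(source) + "</code>"

cdata_text = ET.fromstring(cdata_doc).text
escaped_text = ET.fromstring(escaped_doc).text

# Without either protection, the parser rejects the document.
try:
    ET.fromstring("<code>" + source + "</code>")
    raw_accepted = True
except ET.ParseError:
    raw_accepted = False
```

Both protected forms hand the application the same text; the CDATA form is simply easier to read when the content has many markup characters.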

Understanding Document Type Declaration

As mentioned earlier, the DTD defines the rules of the XML document. A DTD has a different syntax from that of an XML document.

A DTD can be declared internally, within the document type declaration block of the XML document itself, or externally, in a separate file. The following example shows the contents of an external DTD file (here we're using a file named student.dtd):

<!-- The classlist content model is assumed: one or more students -->
<!ELEMENT classlist (student+)>
<!ELEMENT student (name, address?, phone?, email?)>
<!ELEMENT name (firstname, middleinitial?, lastname)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT middleinitial (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT address (street, city, state, zipcode)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT zipcode (#PCDATA)>
<!ATTLIST student gender CDATA #REQUIRED>
<!ATTLIST phone areacode CDATA #REQUIRED>

Elements are declared using the <!ELEMENT> declaration, and attributes using the <!ATTLIST> declaration. Special characters play an important role in DTD syntax. For example, parentheses ( ) are used to group names, commas require the names to appear in the listed order, and the ? character indicates that an element is optional: middleinitial, for example, can appear once or not at all. PCDATA stands for parsed character data. Here's how to reference this file in the document type declaration of an XML document:

<!DOCTYPE classlist SYSTEM "student.dtd">
<classlist>
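Python's standard xml.etree.ElementTree parser does not validate against a DTD, so the following is only a rough, hand-written imitation of one rule, the student content model (name, address?, phone?, email?). It is a sketch of what the commas and the ? indicator mean, not a substitute for a validating parser. The names STUDENT_MODEL and matches_student_model are invented for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated from: <!ELEMENT student (name, address?, phone?, email?)>
# Each child tag name is followed by a comma so the sequence can be matched as text.
STUDENT_MODEL = re.compile(r"name,(address,)?(phone,)?(email,)?")


def matches_student_model(xml_text):
    """Check only the order and presence of a student's direct children."""
    student = ET.fromstring(xml_text)
    sequence = "".join(child.tag + "," for child in student)
    return STUDENT_MODEL.fullmatch(sequence) is not None


# Optional children may be left out, but name is required and order matters.
with_optional = matches_student_model("<student><name/><phone/></student>")
missing_name = matches_student_model("<student><phone/></student>")
wrong_order = matches_student_model("<student><name/><email/><phone/></student>")
```

The check looks only at the children of student; it does not examine attributes or the content of name.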

Including Entity References

In XML, entities are used to represent special characters, to refer to repeated or varying text, and to include the content of external files.

Every entity must have a unique name. You can define your own entities in the document type declaration. Here's an example of an entity definition:

<!DOCTYPE state[
<!ENTITY ca "California">
]>

To use an entity, you simply reference it by name. Entity references begin with an ampersand and end with a semicolon. The following is an example of using the defined entity ca:

<state> &ca; </state>

The preceding line is equivalent to writing

<state> California </state>

XML also provides five predefined entities: &lt; (for the less than character, <), &gt; (for the greater than character, >), &amp; (for the ampersand, &), &apos; (for the apostrophe, '), and &quot; (for the double quotation mark, ").
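Here is a brief sketch of entity expansion using Python's standard xml.etree.ElementTree module, which expands entities declared in a document's internal DTD subset. The expr element and the &ny; reference are hypothetical additions for illustration.

```python
import xml.etree.ElementTree as ET

# The entity definition and the reference from the text, combined in one document.
doc = """<!DOCTYPE state [
<!ENTITY ca "California">
]>
<state> &ca; </state>"""

# The parser replaces the reference with the entity's value.
expanded = ET.fromstring(doc).text

# Predefined entities need no declaration.
predefined = ET.fromstring("<expr>a &lt; b &amp;&amp; c &gt; d</expr>").text

# Referencing an entity that was never declared is an error.
try:
    ET.fromstring("<state>&ny;</state>")
    undefined_accepted = True
except ET.ParseError:
    undefined_accepted = False
```

Note that the spaces around &ca; in the document are kept in the expanded text.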

XML Schema

A DTD is not a typed language: the content of every element is specified simply as text, and its syntax is not XML-based. An XML schema, on the other hand, is strongly typed: each element can be given a type, such as string or integer. It also enables users to define their own types as structures of other types.

An XML schema is itself written in XML syntax. This lets it express richer data semantics without the burden of learning a new language's syntax. The trend in the industry today is to use XML schemas instead of DTDs.
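Because a schema is an ordinary XML document, any XML parser can read it. The sketch below uses Python's standard xml.etree.ElementTree module to list the declared types in a small, hypothetical schema fragment; the element names city and enrollmentyear are invented for illustration. It only reads the schema and does not validate any document against it.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C XML Schema language, in ElementTree's {uri}name notation.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment: each element is declared with a type.
schema_text = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="city" type="xs:string"/>
  <xs:element name="enrollmentyear" type="xs:integer"/>
</xs:schema>"""

schema = ET.fromstring(schema_text)

# Collect each declared element name together with its declared type.
declared_types = {
    element.get("name"): element.get("type")
    for element in schema.findall(XS + "element")
}
```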
