Characteristics of XML Documents

An XML document must be well formed, and it may also be valid. For a document to be well formed, it must contain exactly one root element that encloses the whole document, and all other elements must be nested within the root element without overlapping. No tag may be left unclosed: every start tag must have a matching end tag, unless the element is written as an empty-element tag such as <entry/>. The address book example earlier is a well-formed document.
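The well-formedness rules above can be tried out with any XML parser. The following is a minimal sketch that assumes Python's standard xml.etree.ElementTree module as the parser; the sample documents and the helper name is_well_formed are hypothetical and are not the address book example from earlier.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per rule described in the text.
good = "<addressbook><entry><name>Ann</name></entry></addressbook>"
overlapping = "<addressbook><entry><name>Ann</entry></name></addressbook>"
unclosed = "<addressbook><entry><name>Ann</name></entry>"
two_roots = "<entry/><entry/>"

for label, doc in [("good", good), ("overlapping", overlapping),
                   ("unclosed", unclosed), ("two roots", two_roots)]:
    print(label, is_well_formed(doc))
```

Only the first document passes. The others break, in turn, the no-overlap rule, the matching-end-tag rule, and the single-root rule.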

A valid document, on the other hand, must not only be well formed but must also have a DTD to which it conforms. This means that the XML document may use only the elements that have been declared in the DTD, arranged in the way the DTD allows.
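To make the "only declared elements" requirement concrete, here is a rough sketch that compares the element names used in a document with the names declared in a DTD. It is not a validating parser: it ignores content models, attributes, and entities, and the DTD, the sample documents, and the helper names are all hypothetical. Python's standard library parser is non-validating, so the check is done by hand.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical DTD for an address book; the names are illustrative only.
DTD = """
<!ELEMENT addressbook (entry*)>
<!ELEMENT entry (name, phone?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
"""


def declared_elements(dtd_text):
    """Collect the element names that appear in <!ELEMENT ...> declarations."""
    return set(re.findall(r"<!ELEMENT\s+([^\s>]+)", dtd_text))


def undeclared_elements(xml_text, dtd_text):
    """Return element names used in the document but missing from the DTD."""
    used = {element.tag for element in ET.fromstring(xml_text).iter()}
    return used - declared_elements(dtd_text)


conforming = "<addressbook><entry><name>Ann</name></entry></addressbook>"
stray = ("<addressbook><entry><name>Ann</name>"
         "<email>ann@example.com</email></entry></addressbook>")

print(undeclared_elements(conforming, DTD))  # empty set: every element is declared
print(undeclared_elements(stray, DTD))       # contains 'email', which is undeclared
```

Both sample documents are well formed, but the second one cannot be valid against this DTD because it uses an element the DTD never declares.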

XML technology has several advantages: it is platform and system independent; it lets you define your own tags; it allows the same XML document to be displayed in multiple ways; it supports the Unicode standard; and it is readable even by people with no prior knowledge of it. The main disadvantages are the time and expense required to convert existing information to XML, and the fact that software written before XML cannot read or understand it.

Processing XML Documents with XML Parsers

A parser is a piece of software that processes an XML document and checks whether it is well formed; a validating parser also checks whether it is valid. Several parsers are available today, including Microsoft's MSXML and IBM's XML4J (closely related to Apache's Xerces).
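As a small illustration of what processing a document with a parser looks like, the sketch below parses a document and pulls data out of it. It assumes Python's standard xml.etree.ElementTree module as a stand-in for the parsers named above, and the document and the function read_entries are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical address book document used only for illustration.
DOCUMENT = """<?xml version="1.0"?>
<addressbook>
  <entry><name>Ann</name><phone>555-0100</phone></entry>
  <entry><name>Bob</name><phone>555-0101</phone></entry>
</addressbook>"""


def read_entries(xml_text):
    """Parse the document and return (name, phone) pairs for each entry."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well formed
    return [(entry.findtext("name"), entry.findtext("phone"))
            for entry in root.findall("entry")]


for name, phone in read_entries(DOCUMENT):
    print(name, phone)
```

The parser rejects a document that is not well formed before any application code sees its content, so the code after the parse can rely on a properly nested element tree.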

Supporting the Unicode Standard

XML supports documents written in languages that aren't Latin-based. Like Java, XML supports Unicode, whose character set is aligned with the ISO/IEC 10646 standard. Unicode is designed to cover most of the world's languages, some of which have very large character sets. This means you can include structured content in an XML document whether its language is normally stored in a single-byte character set (such as English, French, or Hebrew) or in a double-byte character set (such as Japanese, Chinese, or Korean). The following declaration is used for documents encoded in the Latin-1 character set (Western European languages):

<?xml version = "1.0" encoding="ISO-8859-1"?>
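The effect of the encoding declaration can be seen by parsing raw bytes. The sketch below assumes Python's standard xml.etree.ElementTree module; the two tiny documents are hypothetical. The parser reads the declaration, decodes the bytes accordingly, and returns ordinary Unicode text in both cases.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: one stored as Latin-1 bytes, one as UTF-8 bytes.
latin1_bytes = ('<?xml version="1.0" encoding="ISO-8859-1"?>'
                '<name>François</name>').encode("iso-8859-1")
utf8_bytes = ('<?xml version="1.0" encoding="UTF-8"?>'
              '<name>日本語</name>').encode("utf-8")

# The parser uses the declared encoding to turn bytes back into characters.
latin1_name = ET.fromstring(latin1_bytes).text
utf8_name = ET.fromstring(utf8_bytes).text

print(latin1_name, utf8_name)

# The same character can take a different number of bytes per encoding.
print(len("ç".encode("iso-8859-1")), len("ç".encode("utf-8")))
```

The declaration matters because the bytes alone are ambiguous: the single byte that stands for "ç" in Latin-1 is not a complete character in UTF-8.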

For the full XML 1.0 specification, see http://www.w3.org/XML/.
