Reserved attributes

There are some universal characteristics that elements in many different applications may share. To avoid conflict with user-defined attribute names, the prefix 'xml:' is reserved by the standard for these and other purposes.

There are only two reserved attributes in the XML core standard. They are used to identify the human languages used for the text in the document, and to indicate whether whitespace characters are used to format the XML markup, or to format the text of the document itself.

Languages

There are any number of reasons why it may be useful to identify the language used for the text contained in a particular element. The 'xml:lang' attribute name is reserved for recording the language and, optionally, the country (as the same language may differ slightly between countries). The value of this attribute is a single token, or code, which conforms to one of three possible schemes, as outlined below and defined in RFC 1766.

The content may comprise a simple two-character language code, conforming to ISO 639 (Codes for the representation of names of languages). For example, 'en' represents English (a list of these codes appears in Chapter 33):

<para xml:lang="en">This is English text.</para>

Alternatively, the content may be a user-defined code, in which case it must begin with 'x-' (or 'X-'), for example 'x-cardassian'. Finally, the code may be one that is registered with IANA (the Internet Assigned Numbers Authority), in which case it begins with 'i-' (or 'I-'), for example 'i-yi' (Yiddish).
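The three schemes can be told apart by the shape of the main code alone. The following Python sketch illustrates this; the function name and the returned labels are hypothetical, and it checks only the form of the code, not whether a two-letter code actually appears in ISO 639 or an 'i-' code is actually registered with IANA.

```python
import re

# Hypothetical helper: the name and the returned labels are illustrative only.
def classify_language_code(code: str) -> str:
    """Report which of the three schemes the main code appears to follow."""
    primary = code.split("-", 1)[0]   # main code, before any sub-codes
    lowered = code.lower()            # prefixes may be upper- or lower-case
    if lowered.startswith("x-") and len(code) > 2:
        return "user-defined"
    if lowered.startswith("i-") and len(code) > 2:
        return "iana"
    if re.fullmatch(r"[A-Za-z]{2}", primary):
        return "iso639"               # two letters: shaped like an ISO 639 code
    return "unknown"

for example in ("en", "x-cardassian", "I-yi"):
    print(example, classify_language_code(example))
```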

It is possible for sub-codes to exist, separated from each other and from the main code by a hyphen, '-'. If the first sub-code is two letters (and is not part of a user-defined code) then the sub-code must be a country code, as defined in ISO 3166, such as 'GB' for Great Britain (see Chapter 33 for a list of country codes):

<instruction xml:lang="en-GB">Take the lift to
floor 3.</instruction>

<instruction xml:lang="en-US">Take the elevator to
floor 3.</instruction>

Note that although attribute values are case-sensitive, interpretation of these codes is not, so any combination of upper- and lower-case letters may be entered. Convention dictates that lower-case be used for language codes and upper-case for country codes, giving 'en-GB'.
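An application reading 'xml:lang' therefore needs to compare codes without regard to case. The sketch below uses Python's standard xml.etree.ElementTree module, which reports the attribute under the fixed namespace URI bound to the 'xml' prefix. The helper names normalize_lang and same_language are hypothetical, and the sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the namespace bound to the 'xml' prefix.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def normalize_lang(code: str) -> str:
    """Apply the convention: lower-case language code, upper-case country code."""
    parts = code.split("-")
    main = parts[0].lower()
    subs = [sub.lower() for sub in parts[1:]]
    # Only a two-letter first sub-code after a two-letter language code
    # is treated as a country code here.
    if len(main) == 2 and subs and len(subs[0]) == 2:
        subs[0] = subs[0].upper()
    return "-".join([main] + subs)

def same_language(a: str, b: str) -> bool:
    """Compare two codes without regard to case."""
    return a.lower() == b.lower()

doc = ET.fromstring(
    "<doc>"
    '<instruction xml:lang="EN-gb">Take the lift to floor 3.</instruction>'
    '<instruction xml:lang="en-US">Take the elevator to floor 3.</instruction>'
    "</doc>"
)
langs = [normalize_lang(el.get(XML_LANG)) for el in doc]
print(langs)
```

Normalizing on input keeps later comparisons simple, but a plain case-insensitive comparison, as in same_language, is enough when the original spelling must be retained.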

Significant spaces

Some space characters, line-end codes and tabs may be inserted into an XML document to make the markup more presentable, but without affecting the actual content of the document. The following two examples should normally be considered equivalent, in the sense that published output should be identical:

<book><chapter><section><p>The first paragraph.</p>...


<book>
  <chapter>
    <section>
      <p>The first paragraph.</p>

Some XML-sensitive software is able (in certain circumstances) to distinguish space characters in elements that contain other elements (as in the Book, Chapter and Section elements in the example above) from spaces in elements that contain text (as in the Paragraph element). Spaces of the second kind are termed significant whitespace. It is normally assumed that spaces in elements of the first kind are not part of the document, so they can be considered insignificant whitespace.
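The distinction can be seen with a parser that reports all whitespace, such as Python's xml.etree.ElementTree. The sketch below completes the two book examples with closing tags and uses a hypothetical helper, drop_insignificant, which treats any element that has child elements as the first type. That is a simplifying assumption: without a DTD the parser cannot know which elements may only contain other elements, and a whitespace-only gap between two inline elements in mixed content would also be discarded.

```python
import xml.etree.ElementTree as ET

compact = ("<book><chapter><section><p>The first paragraph.</p>"
           "</section></chapter></book>")
indented = """<book>
  <chapter>
    <section>
      <p>The first paragraph.</p>
    </section>
  </chapter>
</book>"""

def drop_insignificant(elem):
    """Remove whitespace-only text from elements that have child elements."""
    if len(elem):  # at least one child element
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        for child in elem:
            # A child's tail is text that sits inside its parent element.
            if child.tail is not None and not child.tail.strip():
                child.tail = None
            drop_insignificant(child)
    return elem

raw_differs = ET.tostring(ET.fromstring(indented)) != ET.tostring(ET.fromstring(compact))
cleaned_equal = (ET.tostring(drop_insignificant(ET.fromstring(indented)))
                 == ET.tostring(ET.fromstring(compact)))
print(raw_differs, cleaned_equal)
```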

Yet in some circumstances the document author may wish this space to be considered significant, in which case the xml:space attribute may be used to override the default handling. The 'xml:space' attribute has two possible values, 'default' (the assumed value when this attribute is not present) and 'preserve' (do not discard). All whitespace in an element can be explicitly made significant, even though the element may only contain child elements.
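How an application might honor this attribute can be sketched in Python with xml.etree.ElementTree. The helper strip_unless_preserved is hypothetical. It discards whitespace-only text in elements that have child elements unless 'preserve' is in effect, and it assumes the setting also applies to descendant elements until overridden, which is how the XML recommendation describes it.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the namespace bound to the 'xml' prefix.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

def strip_unless_preserved(elem, preserve=False):
    """Drop whitespace-only text in element content unless xml:space says 'preserve'."""
    value = elem.get(XML_SPACE)
    if value == "preserve":
        preserve = True       # explicit override on this element
    elif value == "default":
        preserve = False      # explicit return to default handling
    if len(elem) and not preserve:
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        for child in elem:
            if child.tail is not None and not child.tail.strip():
                child.tail = None
    for child in elem:
        strip_unless_preserved(child, preserve)
    return elem

doc = ET.fromstring(
    "<book>\n"
    '  <chapter xml:space="preserve">\n'
    "    <p>Kept.</p>\n"
    "  </chapter>\n"
    "</book>"
)
strip_unless_preserved(doc)
print(repr(doc.text), repr(doc[0].text))
```

In this sample the whitespace directly inside the Book element is discarded, while the whitespace inside the Chapter element survives because of the 'preserve' value.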

Most publishing applications are liable to reduce multiple spaces back to a single space, and replace line-end codes with spaces. The 'preserve' value may also be interpreted as an override to these actions, but this is not made explicit in the standard (Chapter 8 covers this topic in more detail).
