XSL Transformations

Although you can use existing stylesheet technologies to present XML, as we saw with CSS in Chapter 7, “Using XML with Existing Stylesheet Technologies (CSS),” there are better solutions for working with XML, specifically the Extensible Stylesheet Language (XSL).

As we learned in Chapter 8, “The New Wave of Stylesheets: XSL,” there are a number of things you can do with XSL beyond formatting XML for display in print or on the Web. Although XSL Formatting Objects deal with formatting XML for display, there is another Recommendation closely tied to XSL: XSL Transformations, or XSLT.

XSLT was created in 1999 as a spinoff Recommendation from the XSL Recommendation. The editors of the XSL draft recognized the potential of being able to transform XML dynamically, and therefore XSLT was born. XSLT is an evolving standard; in fact, the W3C has already released a working draft for XSLT 2.0.

XSLT enables you to create a stylesheet which, when applied to an XML document by an XSLT processor, can actually manipulate the contents of the document. The potential ranges from inserting HTML elements into your documents, to turning XML elements into HTML elements, to changing an element from one type of XML element to another.

The result is that you can use XSLT to create HTML documents from your XML, or you can use it to convert XML documents from one XML vocabulary to another. This gives you a great deal of power over your XML documents. So, now let's take a closer look at the components of XSLT, how they function, and how we can apply them to some sample XML documents.

The XML File

XSLT by itself doesn't necessarily mean much, because the goal of authoring stylesheets is to apply them to an XML document. Therefore, for the sake of discussion and illustration as we explore XSLT, Listing 9.1 shows a well-formed XML document that we will use as the basis for our examples.

Listing 9.1. A Sample Well-Formed XML Document That Will Be Used to Clarify Examples in This Chapter
<?xml version="1.0" ?>
<address_book>
 <contact>
  <name>
   <first>Jane</first>
   <last>Doe</last>
  </name>
  <address location="office">
   <street>123 Fake Street</street>
   <city>Springfield</city>
   <state>IL</state>
   <zip>49201</zip>
  </address>
  <phone>
   <number type="office">708-555-1212</number>
   <number type="mobile">708-855-4848</number>
   <number type="fax">800-555-1212</number>
  </phone>
  <email>[email protected]</email>
 </contact>
 <contact>
  <name>
   <first>John</first>
   <last>Smith</last>
  </name>
  <address location="home">
   <street>205 Peaceful Lane</street>
   <city>Bloomington</city>
   <state>IN</state>
   <zip>47401</zip>
  </address>
  <address location="office">
   <street>8192 Busy Street</street>
   <city>Bloomington</city>
   <state>IN</state>
   <zip>47408</zip>
  </address>
  <phone>
   <number type="office">812-555-1212</number>
   <number type="mobile">812-855-4848</number>
   <number type="fax">800-333-0999</number>
  </phone>
  <email>[email protected]</email>
  <email>[email protected]</email>
 </contact>
</address_book>
						

The XSLT Namespace

XSLT stylesheets, like all XSL stylesheets, are simply well-formed XML documents. They are text files, and must conform to all the rules of XML. Because a stylesheet is itself XML, it's necessary to let processors know that the elements contained in the document are part of the XSLT Namespace; that is, that they are not just any generic XML elements and attributes, but specifically stylesheet elements and attributes.

XSLT has the following Namespace, as defined in the Recommendation:

http://www.w3.org/1999/XSL/Transform

In practice, it is a very good idea to use this Namespace with a prefix when you are authoring your stylesheets. That's because you will be mixing a number of different types of elements in your sheet, such as HTML, and using the XSLT Namespace ensures that your documents will be parsed correctly.

Every XSLT stylesheet has a root element, called stylesheet, on which you declare the Namespace:

<xsl:stylesheet version="1.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >

Now let's examine the elements that make up the stylesheet itself.

xsl:stylesheet and xsl:transform

We mentioned that every stylesheet will have a root element called stylesheet. In fact, there are two root elements to choose from, either stylesheet or transform. Both of these elements are synonymous and therefore can be used interchangeably (although your start/end tags must still match). Some authors use the stylesheet element with XSL-FO sheets, and the transform element with XSLT, to provide some differentiation; however, there are no hard and fast rules, and it is perfectly acceptable to always use stylesheet.
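To make the equivalence concrete, here is a minimal sketch of a complete stylesheet that uses transform as its root element. It assumes an input document shaped like Listing 9.1; the choice of an HTML list for the output is ours, purely for illustration. Replacing both xsl:transform tags with xsl:stylesheet would give exactly the same behavior.

```xml
<?xml version="1.0"?>
<!-- xsl:transform is a synonym for xsl:stylesheet; only the root name differs -->
<xsl:transform version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Match the document root and list each contact's name -->
  <xsl:template match="/">
    <html>
      <body>
        <ul>
          <xsl:for-each select="address_book/contact">
            <li>
              <xsl:value-of select="name/first"/>
              <xsl:text> </xsl:text>
              <xsl:value-of select="name/last"/>
            </li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>

</xsl:transform>
```

Note how the unprefixed HTML elements and the prefixed xsl: elements sit side by side; the Namespace prefix is what lets the processor tell instructions apart from output.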

The stylesheet element also has a number of attributes that provide additional information, such as which extensions are in use. The most common attributes you will use, however, are

  • id

  • version

id is an attribute that allows you to specify an ID for the stylesheet. This can be useful later if you are importing stylesheets or working with multiple stylesheets.

The second is the version attribute, which specifies the version of XSLT that the stylesheet uses for processing. This attribute is required, so always be sure to specify the version to avoid errors.

The stylesheet element can also have a number of elements as children. The more common of these child elements include

  • import

  • include

  • output

  • decimal-format

  • namespace-alias

  • attribute-set

  • variable

  • template

In the following sections, we will take a look at each of these child elements and their attributes and see how they are used within the stylesheet.
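Before looking at each child element in turn, it may help to see several of them together. The following is only a sketch, assuming input shaped like Listing 9.1; the variable name title, the attribute set name table-style, and the table layout are hypothetical choices made for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Top-level element: controls how the result tree is written -->
  <xsl:output method="html"/>

  <!-- Top-level variable (hypothetical name), available to all templates -->
  <xsl:variable name="title" select="'Address Book'"/>

  <!-- Named group of attributes (hypothetical name) reused on output elements -->
  <xsl:attribute-set name="table-style">
    <xsl:attribute name="border">1</xsl:attribute>
    <xsl:attribute name="cellpadding">4</xsl:attribute>
  </xsl:attribute-set>

  <!-- Template for the document root: builds the page skeleton -->
  <xsl:template match="/">
    <html>
      <head>
        <title><xsl:value-of select="$title"/></title>
      </head>
      <body>
        <table xsl:use-attribute-sets="table-style">
          <xsl:apply-templates select="address_book/contact"/>
        </table>
      </body>
    </html>
  </xsl:template>

  <!-- Template for each contact: one table row per contact -->
  <xsl:template match="contact">
    <tr>
      <td>
        <xsl:value-of select="name/last"/>
        <xsl:text>, </xsl:text>
        <xsl:value-of select="name/first"/>
      </td>
      <td><xsl:value-of select="phone/number[@type='office']"/></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>
```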

xsl:include and xsl:import

Cascading stylesheets are designed to “cascade,” or be linked together. Although XSL stylesheets aren't specifically designed to be linked together, there are a couple of mechanisms that can be used to link different sheets together, either by including them or importing them.

Both xsl:include and xsl:import have the same attribute:

  • href

This is simply the reference to the sheet that is being included or imported, in the form of a URI. It can be either a relative path (such as just the filename if the stylesheets are in the same directory) or a URL that points to the stylesheet on a server.

The only practical difference between include and import is how rules are processed. If a stylesheet is included, the rules of the included stylesheet are processed just as if they were part of the including stylesheet. If a stylesheet is imported, the rules in the importing stylesheet take precedence over those in the stylesheet being imported.
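The difference can be sketched as follows. The filenames base.xsl and phone-templates.xsl are hypothetical and assumed to sit in the same directory as this stylesheet. One detail worth knowing is that XSLT 1.0 requires any xsl:import elements to come before all other children of the root element.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Imports must appear first; rules in base.xsl (hypothetical file)
       have lower precedence than rules in this stylesheet -->
  <xsl:import href="base.xsl"/>

  <!-- Rules in phone-templates.xsl (hypothetical file) are treated
       as if they had been written directly in this stylesheet -->
  <xsl:include href="phone-templates.xsl"/>

  <!-- This rule wins over any template matching contact in base.xsl -->
  <xsl:template match="contact">
    <div class="contact">
      <!-- Hand the current node back to the imported rules, if any -->
      <xsl:apply-imports/>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

Here xsl:apply-imports lets the overriding template wrap, rather than replace, whatever the imported stylesheet would have produced for the same node.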

xsl:output

As we learned in Chapter 8, when an XSL stylesheet is processed, the XSL engine reads the document tree, or source tree, and applies the stylesheet rules to produce a result tree. But what happens to that result tree next is up to the processor. XSL processors are not required to output the result tree, although most do. To help you specify how that result tree is output, you can use the xsl:output element to define some characteristics of how the final document is written.

The way in which outputting is performed is defined by a number of attributes:

  • method— This attribute allows you to select which type of document is to be output, either “xml,” “html,” or “text.” XSLT 2.0 will add “xhtml” as an option. The default is “xml,” although processors select “html” if they detect that the first element in your result tree is an <html> element.

  • version— The version attribute allows you to specify the version of the output method. For example, you could use this attribute to choose between HTML 3.2 and 4.0.

  • encoding— This attribute allows you to specify a character encoding type for the document.

  • omit-xml-declaration— This attribute accepts a value of either “yes” or “no” and specifies whether or not the processor should output the XML declaration.

  • standalone— This attribute also accepts a value of “yes” or “no” and is used to specify whether the processor should output a “standalone” declaration in the XML declaration.

  • doctype-public— This attribute is used to specify the value of the “PUBLIC” identifier if you are working with a DOCTYPE declaration.

  • doctype-system— This attribute is used to specify the value of the “SYSTEM” identifier if you are working with a DOCTYPE declaration.

  • cdata-section-elements— This attribute allows you to specify any elements in your document which should have their content output within a CDATA section. This is useful only with xml output.

  • indent— This accepts a value of “yes” or “no” to specify whether or not the output should be indented.

  • media-type— This attribute allows you to specify a MIME content type for the document.

As you can see, there is a great deal of flexibility for working with output from XSLT. We'll look at how to use output later on in the chapter when we present some stylesheet examples.
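In the meantime, here is a brief sketch of several of these attributes working together. It assumes input shaped like Listing 9.1 and simply copies the document through unchanged, so that the only visible effect comes from xsl:output; wrapping street content in a CDATA section is an arbitrary choice for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Write the result as indented UTF-8 XML, keep the XML declaration,
       and emit the text of every street element inside a CDATA section -->
  <xsl:output method="xml"
    encoding="UTF-8"
    indent="yes"
    omit-xml-declaration="no"
    cdata-section-elements="street"/>

  <!-- Identity template: copy every node and attribute as-is -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because a processor is not required to serialize the result tree at all, these settings are best thought of as requests; how indentation in particular looks can vary from one processor to another.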

xsl:decimal-format

In the United States, it is customary to represent decimal places with a period (.) and to separate groups of numbers using a comma (,). For example, we would write one thousand dollars and 23 cents as “$1,000.23.” However, if we were dealing with the same amounts in many European countries, we might write the number as “1.000,23.” Both are examples of different decimal formats.

The decimal-format element allows you to set how decimal numbers are formatted by the format-number() function within a stylesheet. There are a number of attributes that you can use to define the decimal-format:

  • name— This attribute allows you to give a name to the decimal format, such as “USA,” so that you can reference the format by name.

  • decimal-separator— The decimal-separator attribute accepts a character as its value. For instance, if we wanted to specify that the separator were to be a comma, as in Europe, we would use this attribute.

  • grouping-separator— This attribute also accepts a character, which is used to represent the character used to separate hundreds from thousands, from millions, and so on. In the U.S. decimal format, this is a comma.

  • infinity— This allows you to specify a string that is used to represent infinity. The default value is the string "Infinity".

  • minus-sign— This allows you to specify a character that is used to represent the minus sign. The default value is the hyphen-minus character (-).

  • NaN— This allows you to specify a string that is used to represent something that is not a number. The default value is the string "NaN".

  • percent— This allows you to specify a character that is used to represent a percentage. The default value is the percent sign (%).

  • zero-digit— This allows you to specify a character that is used to represent zero. The default value is the digit zero (0).

  • digit— This allows you to specify a character that is used to represent digits in a format pattern. The default value is the number sign (#).

  • pattern-separator— This allows you to specify a character that is used as a pattern separator. The default value is the semicolon (;).

So, if we were to use the following:

<xsl:decimal-format decimal-separator="," grouping-separator="."/>

the result would be decimal numbers in the format “1.000,00” when they are formatted with the format-number() function.
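A decimal format only takes effect when a number is passed through format-number(). The sketch below shows the two pieces together; the format name european is a hypothetical label, and the literal value 1000.23 is used so the example does not depend on any particular input document. Note that the pattern itself is written with the separators the decimal format defines.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Named decimal format (hypothetical name): comma for decimals,
       period for grouping -->
  <xsl:decimal-format name="european"
    decimal-separator=","
    grouping-separator="."/>

  <xsl:template match="/">
    <!-- Third argument selects the named format; the pattern uses
         "." for grouping and "," for the decimal point accordingly -->
    <xsl:value-of select="format-number(1000.23, '#.##0,00', 'european')"/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to any well-formed XML document, this should write 1.000,23. Omitting the name attribute, as in the one-line example above, changes the default decimal format instead, so format-number() can then be called with just two arguments.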
