XSLT Stylesheets

In Listing 17.2, you saw a simple stylesheet that used default transformation rules to remove everything except for the text from an XML document. You will now look at how to define your own rules for transforming an XML document.

Rules are based on matching elements in the XML document and transforming the elements into a new document. Text and information from the original XML document can be included or omitted. Components from the XML document are matched using the XPath notation defined by the W3C. You will learn more about XPath in the “Using XPath with XSLT” section later in today's lesson, after you have looked at some simple XSLT templates.

Template Rules

The most common XSLT template rules are those for matching and transforming elements. The following simple example matches the root node of a document and transforms it into an outline for an HTML document that will be created as you learn more about XSLT's capabilities.

<xsl:template match="/">
  <HTML>
    <HEAD> <TITLE>Job Details</TITLE> </HEAD>
    <BODY> </BODY>
  </HTML>
</xsl:template>

The <xsl:template> element defines a new template rule in the stylesheet, and its match attribute specifies which parts of the XML document will be matched by this rule. The root of a document is matched by the forward slash (/); other matching patterns are discussed later in the “Using XPath with XSLT” section.

The body of the <xsl:template> element is output in place of the matched element in the original document. In this case, the entire document is replaced by a blank HTML document. No other elements in the document will be matched.

If you want to transform other elements in the original document, you must define additional templates and apply those templates to the body of the matched element. The following text adds an <xsl:apply-templates/> element to the rule matching the XML document root:

<xsl:template match="/">
  <HTML>
    <HEAD> <TITLE>Job Details</TITLE> </HEAD>
    <BODY> <xsl:apply-templates/> </BODY>
  </HTML>
</xsl:template>

When this rule is applied to the document root, the child nodes of the root are scanned for further template matches. The output from the other rules is inserted at the point where the <xsl:apply-templates/> element appears.

Listing 17.7 shows a simple stylesheet that transforms all of the XML elements into HTML <STRONG> elements.

Listing 17.7. Full Text of basicHTML.xsl
 1: <?xml version="1.0"?>
 2: <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 3: <xsl:template match="/">
 4:   <HTML>
 5:     <HEAD> <TITLE>Job Details</TITLE> </HEAD>
 6:     <BODY> <xsl:apply-templates/> </BODY>
 7:   </HTML>
 8: </xsl:template>
 9: <xsl:template match="*">
10:   <P><STRONG><xsl:apply-templates/></STRONG></P>
11: </xsl:template>
12: </xsl:stylesheet>

In Listing 17.7, the second rule at lines 9–11 matches every element in the XML document, replaces it with a <STRONG> element, and applies all the templates recursively to the body of the XML element.

Caution

A stylesheet is an XML document, and you must ensure the XML remains well-formed when outputting HTML. In Listing 17.7, on line 10, the <STRONG> text is enclosed inside an HTML paragraph with both a start tag and an end tag to ensure that the stylesheet remains well-formed. Many authors of HTML simply insert the paragraph <P> tag at the end of the paragraph. This will not work with stylesheets because the unterminated <P> tag is not well-formed XML. Other HTML tags, such as <BR> and <IMG>, must be treated in a similar manner. There are alternative solutions to the problem of defining HTML documents inside XSLT stylesheets that are outside the scope of this chapter.
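
As a minimal sketch of how empty HTML tags can be written, the following stylesheet uses XML's empty-element syntax for <BR> and <IMG>. The file name logo.gif is a hypothetical placeholder, and the <xsl:output> element is an addition that the chapter's own listings do not use. With the html output method, an XSLT 1.0 processor is expected to write <BR/> to the result as <BR>.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Ask the processor to serialize the result using HTML conventions -->
<xsl:output method="html"/>
<xsl:template match="/">
  <HTML>
    <HEAD> <TITLE>Job Details</TITLE> </HEAD>
    <BODY>
      <!-- logo.gif is a hypothetical file name used only for illustration -->
      <IMG src="logo.gif" alt="Logo"/>
      <!-- Empty-element syntax keeps the stylesheet well-formed -->
      <BR/>
      <xsl:apply-templates/>
    </BODY>
  </HTML>
</xsl:template>
</xsl:stylesheet>
```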


Listing 17.8 shows the HTML output from applying the basicHTML.xsl stylesheet to the jobs.xml file shown in Listing 17.3.

Listing 17.8. Applying basicHTML.xsl to jobs.xml
 1: >java org.apache.xalan.xslt.Process -in XML\job.xml -xsl XSL\basicHTML.xsl
 2: <HTML>
 3: <HEAD>
 4: <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
 5: <TITLE>Job Details</TITLE>
 6: </HEAD>
 7: <BODY>
 8: <P>
 9: <STRONG>
10:
11: <P>
12: <STRONG>
13:
14: <P>
15: <STRONG>London</STRONG>
16: </P>
17:
18: <P>
19: <STRONG>Must like to talk and smoke</STRONG>
20: </P>
21:
22:
23: <P>
24: <STRONG>Cigar maker</STRONG>
25: </P>
26:
27: <P>
28: <STRONG>Critic</STRONG>
29: </P>
30:
31: </STRONG>
32: </P>
33:
34: <P>
35: <STRONG>
36:
37: <P>
38: <STRONG>Washington</STRONG>
39: </P>
40:
41: <P>
42: <STRONG>Must be honest</STRONG>
43: </P>
44:
45:
46: <P>
47: <STRONG>Tree surgeon</STRONG>
48: </P>
49:
50: </STRONG>
51: </P>
52:
53: </STRONG>
54: </P>
55: </BODY>
56: </HTML>
						

In Listing 17.8, the HTML body starts with two <STRONG> elements corresponding to the <jobSummary> root element and the first <job> element. The nested XML elements <location>, <description>, and <skill> are output inside their own <STRONG> tags.

If you studied Listing 17.8 carefully, you will have seen a <META> element inserted into the output at line 4. The XSL processor has identified the output as an HTML document and, on recognizing the HTML <HEAD> element, has inserted the <META> element to identify the contents of the Web page.

Note

The stylesheet must be well-formed XML, so any HTML tags must use the same letter case for both the start and end tags. HTML is not case sensitive and would allow you to use mismatched names such as <STRONG>...</strong>. This example is not well-formed XML and will cause the transformation to fail. It is also extremely poor HTML style.


Now that you have seen how the templates are applied to the body of a tag, you might be wondering how not to apply the templates but still output the text of an element. You do this by using the <xsl:value-of select="."/> tag. This tag outputs the text of the currently selected XML element without applying any more templates either to this element or to any of its descendants.

You will use the <xsl:value-of> element when you want to output the text of an XML tag rather than transform it in some way. Listing 17.9 shows a more realistic stylesheet for the jobs.xml example file, and Listing 17.10 shows the transformed document.

Listing 17.9. Full Text of textHTML.xsl
 1: <?xml version="1.0"?>
 2: <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 3: <xsl:template match="/">
 4:   <HTML>
 5:     <HEAD> <TITLE>Job Details</TITLE> </HEAD>
 6:     <BODY> <xsl:apply-templates/> </BODY>
 7:   </HTML>
 8: </xsl:template>
 9: <xsl:template match="jobSummary">
10:   <H2>Jobs</H2><xsl:apply-templates/>
11: </xsl:template>
12: <xsl:template match="job">
13:   New Job: <P><xsl:apply-templates/></P>
14: </xsl:template>
15: <xsl:template match="description">
16:   <P>Descriptiom: <xsl:value-of select="."/></P>
17: </xsl:template>
18: <xsl:template match="location">
19:   <P>Location: <xsl:value-of select="."/></P>
20: </xsl:template>
21: <xsl:template match="skill">
22:   <P>Skill: <xsl:value-of select="."/></P>
23: </xsl:template>
24: </xsl:stylesheet>

In Listing 17.9, the leaf elements of <description>, <location>, and <skill> are output as text rather than expanded using the template rules.

Note

Listing 17.9 includes a template for the document root (match="/") and a template for the root element (match="jobSummary"). On Day 16, you learned that the document root is the entire XML document, including the processing instructions and comments outside of the root element.


Listing 17.10. Applying textHTML.xsl to jobs.xml
 1: >java org.apache.xalan.xslt.Process -in XML\job.xml -xsl XSL\textHTML.xsl
 2: <HTML>
 3: <HEAD>
 4: <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
 5: <TITLE>Job Details</TITLE>
 6: </HEAD>
 7: <BODY>
 8: <H2>Jobs</H2>
 9:
10:   New Job: <P>
11:
12: <P>Location: London</P>
13:
14: <P>Descriptiom: Must like to talk and smoke</P>
15:
16:
17: <P>Skill: Cigar maker</P>
18:
19: <P>Skill: Critic</P>
20:
21: </P>
22:
23:   New Job: <P>
24:
25: <P>Location: Washington</P>
26:
27: <P>Descriptiom: Must be honest</P>
28:
29:
30: <P>Skill: Tree surgeon</P>
31:
32: </P>
33:
34: </BODY>
35: </HTML>
						

Using the <xsl:value-of> tag raises two questions:

  • What is the text value of an XML element?

  • What does the select attribute do?

These questions are answered in the next two sections.

Text Representation of XML Elements

Every XML node has a textual representation that is used when the <xsl:value-of> tag selects that node within a template rule. Table 17.1 shows how the textual equivalent of each of the seven XML node types is obtained.

Table 17.1. Text Values of XML Elements
Node Type                  Description
Document root              The concatenation of all the text in the document
Elements                   The concatenation of all the text in the body of the element
Text                       The text value of the node, including whitespace
Attributes                 The text value of the attribute, including whitespace
Namespaces                 The namespace URI that is bound to the namespace prefix associated with the node
Processing Instructions    The text of the processing instruction following the target name, including any whitespace
Comments                   The text of the comment between the <!-- and --> delimiters

As you can see from Table 17.1, every node has a textual equivalent. The default rules for a stylesheet only include the text values for the document root, elements, and all text nodes. By default, the other four nodes (attributes, namespaces, processing instructions, and comments) are not output. Before you can understand the default rules, you will need to study the XPath notation for matching nodes in an XML document.
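
To illustrate two of the rows in Table 17.1, here is a minimal sketch that outputs the text value of an element and of an attribute. It assumes a <job> element carrying a reference attribute, as used by the listings in this lesson; the label text is purely illustrative.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="job">
  <!-- Element value: the text of every descendant, joined in document order -->
  <P>Job text: <xsl:value-of select="."/></P>
  <!-- Attribute value: the text of the attribute itself -->
  <P>Reference: <xsl:value-of select="@reference"/></P>
</xsl:template>
</xsl:stylesheet>
```

Because the element value is a simple concatenation, any whitespace between the child elements of <job> is included in the first paragraph.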

Using XPath with XSLT

XPath is a means of identifying nodes within an XML document. The W3C identified several aspects of XML that required the ability to identify nodes, for example:

  • Pointers from one XML document to another called XPointer (the equivalent of href in HTML)

  • Template rules for XSLT stylesheets

  • Schemas

To ensure that these requirements for identifying nodes share a common syntax, the XPath notation was defined as a separate standard.

An XPath expression is a pattern that can be used to match nodes within an XML document. A large number of patterns are available, so any part of an XML document can be matched. Rather than reproduce the entire XPath specification in today's lesson, you will just study some examples that will help you understand how to use XPath. Further information about XPath can be obtained from the W3C Web site.

XPath uses the concept of axes and expressions to define a path in the XML document:

  • Axes define different parts of the XML document structure.

  • Expressions refer to specific objects within an axis.

Some of the most frequently used axes have special shortcuts to reduce the amount of typing needed. Consider the stylesheet rule you used to match a skill element:

<xsl:template match="skill">
  Skill: <xsl:value-of select="."/><P></P>
</xsl:template>

This matches a child "skill" element using a simple abbreviation. The full XPath notation for this would be:

<xsl:template match="child::skill">

The axis is child and the expression is an element with the name skill (the double colon separates the axis from the expression). The current node that a path is defined from is called the context node.

The child axis is used to identify all nodes that are immediate children of the context node. Related axes are

  • self The current node

  • parent The immediate parent of the context node

  • descendant Immediate children of the context node, all the children of those nodes, their children, and so on

  • descendant-or-self All descendant nodes and the current context node

  • ancestor Any node higher up the node tree that contains the context node

There are several other axes defined in the XPath notation.
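
The axes above can be sketched in use as follows. This is an illustrative fragment, not one of the chapter's listings: it assumes the <jobSummary>, <job>, and <skill> elements and the reference attribute from the job examples, and it uses the XPath core function count(), which this lesson does not otherwise cover. Note that XSLT 1.0 allows only the child and attribute axes in a match pattern, so the other axes appear here in select expressions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="skill">
  <P>
    <!-- self axis: the skill element being processed -->
    <xsl:value-of select="self::node()"/>
    <!-- parent axis: the reference attribute of the enclosing job -->
    (job <xsl:value-of select="parent::job/@reference"/>,
    <!-- ancestor and descendant axes: count every skill under jobSummary -->
    one of <xsl:value-of select="count(ancestor::jobSummary/descendant::skill)"/> skills listed)
  </P>
</xsl:template>
</xsl:stylesheet>
```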

The select="." attribute in the example <xsl:value-of> element, shown previously, is another example of a shortcut. The full notation is as follows:

Skill: <xsl:value-of select="self::node()"/><P></P>

The node test node() matches any node, so self::node() refers to the current context node. Related functions and node tests are

  • name() The name of the context node instead of the body of the node

  • comment() Selects a comment node

  • text() Selects a text node

  • processing-instruction() Selects a processing instruction node

Some simple XPath expressions are as follows:

  • self::comment() The context node, if it is a comment

  • child::text() All the text nodes that are immediate children of the context node

  • descendant::node() All the nodes below the context node

  • descendant-or-self::skill All the nodes named skill below the current node, including the current node

Expressions can be more complex and specify a node hierarchy:

  • job/skill A skill node that is an immediate child of a job node (in full child::job/child::skill)

XPath expressions can be arbitrarily long and can contain the following special expressions:

  • .. The immediate parent node defined as parent::node()

  • // Any number of intermediate levels; short for /descendant-or-self::node()/

  • * Any element in the specified axis (any attribute in the attribute axis)

  • | Used to provide alternate patterns (one pattern or another)

These patterns can be used to identify any node as illustrated by the following examples:

  • jobSummary//skill Nodes called skill defined anywhere below the jobSummary node

  • jobSummary/*/skill skill nodes defined as children of children of the jobSummary node

  • skill/.. The immediate parent node of a skill node

  • location|skill A location or skill node

  • parent::comment()|child::text() Comment nodes in the immediate parent and text nodes in the immediate child

  • /|* The document root and all elements
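
As a rough sketch of how these patterns look inside template rules, the following stylesheet uses three of them. The element names come from the job examples in this lesson; the layout of the output text is only illustrative.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Matches a skill element at any depth below jobSummary -->
<xsl:template match="jobSummary//skill">
  <!-- ../location steps up to the parent and back down to a sibling element -->
  <P>Skill: <xsl:value-of select="."/> (<xsl:value-of select="../location"/>)</P>
</xsl:template>
<!-- One rule shared by two element names -->
<xsl:template match="location|description">
  <P><xsl:value-of select="name()"/>: <xsl:value-of select="."/></P>
</xsl:template>
</xsl:stylesheet>
```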

Attributes can be selected using the attribute axis, which can be abbreviated to @. For example,

  • attribute::customer The attribute called customer of the context node (not the node itself)

  • job/@reference An attribute called reference so long as it is associated with a job node

In addition to these basic features, XPath supports a powerful matching language supporting variable-like constructs, expressions, and additional functions.

Now that you have a basic understanding of XPath, you can look at the default rules for a stylesheet.

Default Stylesheet Rules

There are some default stylesheet rules that apply to the whole XML document unless overridden by specific template rules.

The first default rule that ensures all elements are processed is as follows:

<xsl:template match="*|/">
  <xsl:apply-templates/>
</xsl:template>

A second rule is used to output the text of text nodes and attributes:

<xsl:template match="text()|@*">
  <xsl:value-of select="."/>
</xsl:template>

A third rule suppresses comments and processing instructions:

<xsl:template match="processing-instruction()|comment()"/>

If an XML element in the source document matches more than one rule, the most specific rule is applied. Consequently, rules defined in a stylesheet will override the default rules.
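
A minimal sketch of overriding a default rule is shown below. The empty text() template is a common idiom rather than something taken from this chapter's listings: it replaces the default text rule so that only the elements you handle explicitly produce output.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Overrides the default text rule: text nodes now produce no output -->
<xsl:template match="text()"/>
<!-- Skill elements are still reached through the default element rule -->
<xsl:template match="skill">
  <P>Skill: <xsl:value-of select="."/></P>
</xsl:template>
</xsl:stylesheet>
```

The <xsl:value-of> element in the skill rule still outputs the skill text because it reads the node's text value directly instead of applying templates.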

The second default rule specifies that the text value of attributes should be output, but you can see from Listing 17.4 that the attributes in job.xml (Listing 17.3) have not been included. Obviously, there is an extra requirement for processing attributes because this rule has never been invoked.

Processing Attributes

Attributes of XML elements are not processed unless a specific rule is defined to process the element's attributes.

An attribute is processed by using an <xsl:apply-templates> element that selects one or more attributes. The third line in the following rule is the one that applies templates to all attributes:

<xsl:template match="*">
  <xsl:apply-templates/>
  <xsl:apply-templates select="@*"/>
</xsl:template>

This <xsl:template> rule matches all elements and applies templates first to each element's child nodes and then to that element's attributes. It is the second <xsl:apply-templates> element, with the select="@*" attribute, that ensures that all attributes are output. If the rule contained only <xsl:apply-templates select="@*"/>, templates would never be applied to the child elements, so nothing below the first matched element would be output.
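
The same technique can be narrowed to a single element and combined with a rule for one particular attribute. The following is a sketch under the assumption that <job> has a customer attribute, as in the job examples; any other attributes fall back to the default rule, which outputs their text.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="job">
  <P>
    <!-- Process the attributes of job first, then its child nodes -->
    <xsl:apply-templates select="@*"/>
    <xsl:apply-templates/>
  </P>
</xsl:template>
<!-- Specific rule for one attribute; it takes priority over the default attribute rule -->
<xsl:template match="@customer">
  <STRONG>Customer: <xsl:value-of select="."/></STRONG>
</xsl:template>
</xsl:stylesheet>
```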

With this extra information, you can now revisit the job.xml file and define a stylesheet that will display the job information in an HTML table. Listing 17.11 shows a stylesheet that will convert a <job> element to an HTML table.

Listing 17.11. Full Text of table.xsl
 1: <?xml version="1.0"?>
 2: <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 3: <xsl:template match="job">
 4:   <H2>Job ref: <xsl:value-of select="@customer"/>/ <xsl:value-of select="@reference"/></H2>
 5:   <P><xsl:apply-templates select="description"/></P>
 6:   <TABLE border="1">
 7:   <xsl:apply-templates select="skill|location"/>
 8:   </TABLE>
 9: </xsl:template>
10: <xsl:template match="description">
11:   <I><xsl:value-of select="."/></I>
12: </xsl:template>
13: <xsl:template match="skill|location">
14:   <TR><TD><xsl:value-of select="name()"/>:</TD> <TD><xsl:value-of select="."/></TD></TR>
15: </xsl:template>
16: </xsl:stylesheet>

Listing 17.11 brings together several features of stylesheets that have been described previously. The rule at line 3 matches a <job> element, and the customer and reference attributes are inserted into the output at line 4. At line 5, the job description child element is output in its own paragraph, and the skill and location children are output inside an HTML table at lines 6 through 8.

In line 6, the value of the HTML table border attribute is enclosed in quotes so that the stylesheet is well-formed XML.

Line 13 uses one rule to match the <location> or <skill> elements. Finally, the name of the selected node is inserted into the output stream using the name() function in line 14.

Figure 17.2 shows the result of applying the table.xsl stylesheet from Listing 17.11 to the jobs.xml file.

Figure 17.2. The XML to HTML table transformation.


XSL supports significantly more complex transformation rules than those shown so far. The next section will provide an overview of some of the additional XSL features.
