Chapter 8. Sorting Things Out

Sometimes nodes don’t come to you in a convenient order. XSLT’s sort element lets you sort nodes alphabetically or numerically, in either ascending (a, b, c) or descending (z, y, x) order.

This chapter walks you through a brief exploration of sort. You can also read about sorting in Section 10 of the XSLT specification. I’ll start, as usual, with a simple example.

Simple Ascending Sort

If you look at Example 8-1, the document europe.xml in examples/ch08, you’ll notice that the European states are not listed in alphabetical order.

Example 8-1. Unalphabetized European countries
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="pretty.xsl" type="text/xsl"?>
   
<europe>
  <state>Belgium</state>
  <state>Germany</state>
  <state>Finland</state>
  <state>Greece</state>
  <state>Ireland</state>
  <state>Luxembourg</state>
  <state>Portugal</state>
  <state>Spain</state>
  <state>Andorra</state>
  <state>Belarus</state>
  <state>Monaco</state>
  <state>Sweden</state>
  <state>United Kingdom</state>
  <state>Austria</state>
  <state>Malta</state>
  <state>Vatican City</state>
  <state>Bulgaria</state>
  <state>Bosnia-Herzegovina</state>
  <state>Cyprus</state>
  <state>France</state>
  <state>Estonia</state>
  <state>Italy</state>
  <state>Hungary</state>
  <state>Latvia</state>
  <state>Ukraine</state>
  <state>Lithuania</state>
  <state>Moldova</state>
  <state>Denmark</state>
  <state>Poland</state>
  <state>Romania</state>
  <state>Slovenia</state>
  <state>The Netherlands</state>
  <state>Turkey</state>
  <state>Albania</state>
  <state>Serbia and Montenegro</state>
  <state>Croatia</state>
  <state>Slovakia</state>
  <state>Iceland</state>
  <state>Czech Republic</state>
  <state>Liechtenstein</state>
  <state>Macedonia, Former Yugoslav Republic of</state>
  <state>Norway</state>
  <state>Russia</state>
  <state>San Marino</state>
  <state>Switzerland</state>
</europe>

You can sort the names of the European states in ascending order (that is, in English as a, b, c, and so on) by using the sort element with no attributes, as a child of apply-templates.

The sort element is used in the stylesheet shown in Example 8-2, sort.xsl.

Example 8-2. Creating a sorted list of countries as text
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
   
 <xsl:template match="europe">
   <xsl:text>Alphabetical List of European States</xsl:text>
   <xsl:text>&#10;Total Number of States: </xsl:text>
   <xsl:value-of select="count(state)"/>
   <xsl:text>&#10;&#10;</xsl:text>
   <xsl:apply-templates select="state">
    <xsl:sort/>
   </xsl:apply-templates>
 </xsl:template>
   
 <xsl:template match="state">
   <xsl:text> - </xsl:text>
   <xsl:apply-templates/>
   <xsl:text>&#10;</xsl:text>
 </xsl:template>
   
</xsl:stylesheet>

This stylesheet produces plain text output (it uses the text method of output), as shown in Example 8-3. A few text instruction elements are sprinkled here and there to label the output or add a line feed (using the &#10; character reference). The count( ) function is also used to count the number of state elements in the source tree, and the return value of this function is displayed using value-of.

The sort element appears as a child of apply-templates. It may only appear as a child of either apply-templates or for-each; inside for-each, it must come before any other content.

Tip

The XSLT instruction element for-each is like a template within a template that selects a node-set and then instantiates its template for each node in the set. You will see this element demonstrated in several places in this book.

In this stylesheet, when state elements are selected with apply-templates, the processor will also apply a sort.

To see what happens, apply the sort.xsl stylesheet to europe.xml using this command:

xalan europe.xml sort.xsl

The plain text, alphabetized result tree shown in Example 8-3 will be output to your screen.

Example 8-3. A sorted text list produced by sort.xsl
Alphabetical List of European States
Total Number of States: 45
   
 - Albania
 - Andorra
 - Austria
 - Belarus
 - Belgium
 - Bosnia-Herzegovina
 - Bulgaria
 - Croatia
 - Cyprus
 - Czech Republic
 - Denmark
 - Estonia
 - Finland
 - France
 - Germany
 - Greece
 - Hungary
 - Iceland
 - Ireland
 - Italy
 - Latvia
 - Liechtenstein
 - Lithuania
 - Luxembourg
 - Macedonia, Former Yugoslav Republic of
 - Malta
 - Moldova
 - Monaco
 - Norway
 - Poland
 - Portugal
 - Romania
 - Russia
 - San Marino
 - Serbia and Montenegro
 - Slovakia
 - Slovenia
 - Spain
 - Sweden
 - Switzerland
 - The Netherlands
 - Turkey
 - Ukraine
 - United Kingdom
 - Vatican City
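
Because sort may also be a child of for-each, the same list can be produced without a separate template for state. The following is only a sketch of that alternative, not one of the book's example files; the filename foreach.xsl is hypothetical.

```xml
<!-- foreach.xsl (hypothetical filename): sorting inside for-each -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

 <xsl:template match="europe">
   <xsl:for-each select="state">
    <!-- sort must come first inside for-each, before any other content -->
    <xsl:sort/>
    <xsl:text> - </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
   </xsl:for-each>
 </xsl:template>

</xsl:stylesheet>
```

Applied to europe.xml, this should list the states in the same order as Example 8-3, without the two heading lines.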

If you would like pretty output worthy of a browser, you could also create an HTML wrapper with the stylesheet shown in Example 8-4, pretty.xsl.

Example 8-4. A stylesheet for producing a sorted list of states presented in HTML
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   
 <xsl:template match="europe">
  <html>
  <head><title>European States</title></head>
  <style type="text/css">body {font-family: sans-serif}</style>
  <body>
  <h3>Alphabetical List of European States</h3>
  <p><b>Total Number of States:</b><xsl:text> </xsl:text>
   <xsl:value-of select="count(state)"/></p>
  <ul>
   <xsl:apply-templates select="state">
    <xsl:sort/>
   </xsl:apply-templates>
  </ul>
  </body>
  </html>
 </xsl:template>
   
 <xsl:template match="state">
  <li><xsl:apply-templates/></li>
 </xsl:template>
   
</xsl:stylesheet>

This stylesheet has no output element, so the processor chooses the output method itself. Because the first element in the result tree is html, it produces indented HTML output by default.
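
If you would rather not rely on that default, you can request the same behavior explicitly. This is a minimal sketch; it assumes the element is added as a top-level child of the stylesheet element in pretty.xsl, which the original file does not contain.

```xml
<!-- Assumed addition to pretty.xsl: states the output method and
     indentation explicitly instead of relying on the html default -->
<xsl:output method="html" indent="yes"/>
```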

Process the stylesheet with Xalan:

xalan -i 1 europe.xml pretty.xsl

Once again, the -i option followed by 1 tells the processor to indent the output by one space. You will see output like Example 8-5.

Example 8-5. The HTML results of using pretty.xsl
<html>
 <head>
  <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
  <title>European States</title>
 </head>
 <style type="text/css">body {font-family: sans-serif}</style>
 <body>
  <h3>Alphabetical List of European States</h3>
  <p>
   <b>Total Number of States:</b> 45</p>
  <ul>
   <li>Albania</li>
   <li>Andorra</li>
   <li>Austria</li>
   <li>Belarus</li>
   <li>Belgium</li>
   <li>Bosnia-Herzegovina</li>
   <li>Bulgaria</li>
   <li>Croatia</li>
   <li>Cyprus</li>
   <li>Czech Republic</li>
   <li>Denmark</li>
   <li>Estonia</li>
   <li>Finland</li>
   <li>France</li>
   <li>Germany</li>
   <li>Greece</li>
   <li>Hungary</li>
   <li>Iceland</li>
   <li>Ireland</li>
   <li>Italy</li>
   <li>Latvia</li>
   <li>Liechtenstein</li>
   <li>Lithuania</li>
   <li>Luxembourg</li>
   <li>Macedonia, Former Yugoslav Republic of</li>
   <li>Malta</li>
   <li>Moldova</li>
   <li>Monaco</li>
   <li>Norway</li>
   <li>Poland</li>
   <li>Portugal</li>
   <li>Romania</li>
   <li>Russia</li>
   <li>San Marino</li>
   <li>Serbia and Montenegro</li>
   <li>Slovakia</li>
   <li>Slovenia</li>
   <li>Spain</li>
   <li>Sweden</li>
   <li>Switzerland</li>
   <li>The Netherlands</li>
   <li>Turkey</li>
   <li>Ukraine</li>
   <li>United Kingdom</li>
   <li>Vatican City</li>
  </ul>
 </body>
</html>

If you simply open europe.xml with a browser such as Netscape 7.1, the browser will apply the stylesheet pretty.xsl referenced in the XML stylesheet PI, and the result will appear in the browser as shown in Figure 8-1.

Figure 8-1. European states sorted alphabetically in Netscape 7.1

Reversing the Sort

The sort element uses ascending order by default, as if the order attribute were present with a value of ascending, like so:

<xsl:sort order="ascending"/>

This order follows the normal a, b, c order of the English alphabet. You can also sort in descending order, that is, using English, in the order z, y, x. To do this, you have to add an order attribute to sort, as does the stylesheet descending.xsl, shown in Example 8-6.

Example 8-6. A stylesheet for sorting country names backward
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
   
 <xsl:template match="europe">
   <xsl:apply-templates select="state">
    <xsl:sort order="descending"/>
   </xsl:apply-templates>
   <xsl:text>Number of European States: </xsl:text>
   <xsl:value-of select="count(state)"/>
   <xsl:text>&#10;</xsl:text>
 </xsl:template>
   
 <xsl:template match="state">
   <xsl:text> - </xsl:text>
   <xsl:apply-templates/>
   <xsl:text>&#10;</xsl:text>
 </xsl:template>
   
</xsl:stylesheet>

Now apply it with:

xalan europe.xml descending.xsl

to get the output shown in Example 8-7.

Example 8-7. A reverse-sorted list of countries produced using descending.xsl
 - Vatican City
 - United Kingdom
 - Ukraine
 - Turkey
 - The Netherlands
 - Switzerland
 - Sweden
 - Spain
 - Slovenia
 - Slovakia
 - Serbia and Montenegro
 - San Marino
 - Russia
 - Romania
 - Portugal
 - Poland
 - Norway
 - Monaco
 - Moldova
 - Malta
 - Macedonia, Former Yugoslav Republic of
 - Luxembourg
 - Lithuania
 - Liechtenstein
 - Latvia
 - Italy
 - Ireland
 - Iceland
 - Hungary
 - Greece
 - Germany
 - France
 - Finland
 - Estonia
 - Denmark
 - Czech Republic
 - Cyprus
 - Croatia
 - Bulgaria
 - Bosnia-Herzegovina
 - Belgium
 - Belarus
 - Austria
 - Andorra
 - Albania
Number of European States: 45

The output is in reverse, or descending, order in English.

By the Numbers

So far, you have sorted nodes alphabetically. You can also sort nodes numerically by specifying the sort element’s data-type attribute with a value of number. By default, sort works as if data-type were present and had a value of text, which indicates that you want to sort text alphabetically.
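
The difference between the two data types matters most when the values have different numbers of digits. The sketch below is an illustration only: the sizes and size element names are hypothetical and do not appear in the book's example files.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

 <!-- Assumed input (hypothetical):
      <sizes><size>9</size><size>100</size><size>25</size></sizes> -->
 <xsl:template match="sizes">
   <xsl:text>As text:</xsl:text>
   <xsl:for-each select="size">
    <!-- Compares the values character by character, as strings -->
    <xsl:sort data-type="text"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="."/>
   </xsl:for-each>
   <xsl:text>&#10;As numbers:</xsl:text>
   <xsl:for-each select="size">
    <!-- Converts each value to a number before comparing -->
    <xsl:sort data-type="number"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="."/>
   </xsl:for-each>
   <xsl:text>&#10;</xsl:text>
 </xsl:template>

</xsl:stylesheet>
```

A text sort would typically put 100 before 25 and 25 before 9, because it compares strings rather than quantities, while the numeric sort orders the values 9, 25, 100.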

To see how it works, have a look at Example 8-8, the document member.xml.

Example 8-8. An XML list of EU member states
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="year.xsl" type="text/xsl"?>
   
<!-- European Union member states -->
   
<member>
 <state joined="1995">Austria</state>
 <state joined="1950">Belgium</state>
 <state joined="1973">Denmark</state>
 <state joined="1995">Finland</state>
 <state joined="1950">France</state>
 <state joined="1950">Germany</state>
 <state joined="1981">Greece</state>
 <state joined="1973">Ireland</state>
 <state joined="1950">Italy</state>
 <state joined="1950">Luxembourg</state>
 <state joined="1950">The Netherlands</state>
 <state joined="1986">Portugal</state>
 <state joined="1986">Spain</state>
 <state joined="1995">Sweden</state>
 <state joined="1973">United Kingdom</state>
</member>

Example 8-8 holds state elements, each containing the name of a European Union (EU) member state, in alphabetical order. Each of the 15 state elements also has a joined attribute with a number value, indicating the year the country joined the EU.

If you want to sort by year rather than name, you could use the stylesheet shown in Example 8-9, numeric.xsl.

Example 8-9. A stylesheet for sorting countries by year of EU membership
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
   
 <xsl:template match="member">
   <xsl:text>Number of EU Member States: </xsl:text>
   <xsl:value-of select="count(state)"/>
   <xsl:text>&#10;</xsl:text>
   <xsl:apply-templates select="state/@joined">
    <xsl:sort data-type="number"/>
   </xsl:apply-templates>
   <xsl:text>&#10;</xsl:text>
 </xsl:template>
   
 <xsl:template match="state/@joined">
   <xsl:text> - </xsl:text>
   <xsl:apply-templates select=".."/>
   <xsl:text> (</xsl:text>
   <xsl:value-of select="."/>
   <xsl:text>)&#10;</xsl:text>
 </xsl:template>
   
</xsl:stylesheet>

The sort element in Example 8-9 has a data-type attribute with a value of number and sorts by the year in the attribute joined. The template that matches state/@joined may seem a little obscure, but it gets exactly what it’s after, namely, the name of the European state (obtained with ..), followed by a year (obtained with .), placing the year in parentheses.

To see what happens, apply the stylesheet with:

xalan member.xml numeric.xsl

and you will get the output shown in Example 8-10.

Example 8-10. The sorted list of countries produced by running numeric.xsl
Number of EU Member States: 15
 - Belgium (1950)
 - France (1950)
 - Germany (1950)
 - Italy (1950)
 - Luxembourg (1950)
 - The Netherlands (1950)
 - Denmark (1973)
 - Ireland (1973)
 - United Kingdom (1973)
 - Greece (1981)
 - Portugal (1986)
 - Spain (1986)
 - Austria (1995)
 - Finland (1995)
 - Sweden (1995)

You can see from the output that the state nodes were sorted according to the year in the joined attribute, not alphabetically according to the name of the European state. Nodes with equal sort keys stay in document order, so because the states are already in alphabetical order in the source tree, they also come out in alphabetical order within each year in the result tree.
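
If the source document were not already alphabetized, you could make the ordering of ties explicit with a second sort element. The sketch below is not one of the book's example files. It selects the state elements and uses the select attribute of sort to name each key: the first sort is the primary key and the second is used only when two primary keys are equal.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

 <xsl:template match="member">
   <xsl:apply-templates select="state">
    <!-- Primary key: the year in the joined attribute, compared as a number -->
    <xsl:sort select="@joined" data-type="number"/>
    <!-- Secondary key: the state name, compared as text, to order ties -->
    <xsl:sort select="."/>
   </xsl:apply-templates>
 </xsl:template>

 <xsl:template match="state">
   <xsl:text> - </xsl:text>
   <xsl:value-of select="."/>
   <xsl:text> (</xsl:text>
   <xsl:value-of select="@joined"/>
   <xsl:text>)&#10;</xsl:text>
 </xsl:template>

</xsl:stylesheet>
```

Applied to member.xml, this should list the states in the same order as Example 8-10, but the result no longer depends on how the source happens to be ordered.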

If you want to list the most recent year first, you can do so by adding the order attribute, as seen in Example 8-11, the stylesheet recent.xsl.

Example 8-11. A stylesheet for reverse-sorting countries by year of EU membership
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
   
 <xsl:template match="member">
   <xsl:text>Number of EU Member States: </xsl:text>
   <xsl:value-of select="count(state)"/>
   <xsl:text>&#10;</xsl:text>
   <xsl:apply-templates select="state/@joined">
    <xsl:sort data-type="number" order="descending"/>
   </xsl:apply-templates>
   <xsl:text>&#10;</xsl:text>
 </xsl:template>
   
 <xsl:template match="state/@joined">
   <xsl:text> - </xsl:text>
   <xsl:apply-templates select=".."/>
   <xsl:text> (</xsl:text>
   <xsl:value-of select="."/>
   <xsl:text>)&#10;</xsl:text>
 </xsl:template>
   
</xsl:stylesheet>

In recent.xsl, the order attribute is added to sort and has a value of descending. Now apply it with this command:

xalan member.xml recent.xsl

and your results will look like Example 8-12.

Example 8-12. The reverse-sorted list of countries produced by running recent.xsl
Number of EU Member States: 15
 - Austria (1995)
 - Finland (1995)
 - Sweden (1995)
 - Portugal (1986)
 - Spain (1986)
 - Greece (1981)
 - Denmark (1973)
 - Ireland (1973)
 - United Kingdom (1973)
 - Belgium (1950)
 - France (1950)
 - Germany (1950)
 - Italy (1950)
 - Luxembourg (1950)
 - The Netherlands (1950)

If you open member.xml with a browser, the stylesheet year.xsl (shown in Example 8-13) will be applied.

Example 8-13. A stylesheet for sorting countries by year into an XHTML representation
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:output doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"/>
<xsl:output doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>
   
 <xsl:template match="member">
   <html xmlns="http://www.w3.org/1999/xhtml">
   <head><title>EU Member States</title>
   <style type="text/css">
   h3 {font-size: 16pt}
   body {font-size: 13pt}</style>
   </head>
   <body>
   <h3>EU Member States</h3>
   <p>There are <xsl:text> </xsl:text>
    <xsl:value-of select="count(state)"/>
   member states, listed starting from the most recent year:</p>
   <ul>
   <xsl:apply-templates select="state">
    <xsl:sort select="@joined" data-type="number" order="descending"/>
   </xsl:apply-templates>
   </ul>
   </body>
   </html>
 </xsl:template>
   
 <xsl:template match="state">
   <xsl:element name="li" namespace="http://www.w3.org/1999/xhtml">
   <xsl:apply-templates/>
   <xsl:text> (</xsl:text>
   <xsl:value-of select="@joined"/>
   <xsl:text>)</xsl:text>
   </xsl:element>
 </xsl:template>
   
</xsl:stylesheet>

The stylesheet presents the same results as recent.xsl but in strict XHTML 1.0, as shown in Mozilla Firebird in Figure 8-2.

Figure 8-2. The document member.xml transformed in Mozilla Firebird

Multiple Sorts

You can sort nodes more than once, if needed, and you can also sort child nodes a different way than you sort their parent nodes. The select attribute of sort can help you do the job, as will be demonstrated in this section. Example 8-14, the document shopping.xml, represents a short, disorderly shopping list.

Example 8-14. An XML shopping list
<list>
 <freezer>
  <item>peas</item>
  <item>green beans</item>
  <item>pot pie</item>
  <item>ice cream</item>
 </freezer>
 <bakery>
  <item>rolls</item>
  <item>jelly doughnuts</item>
  <item>bread</item>
 </bakery>
 <produce>
  <item>bananas</item>
  <item>kumquats</item>
  <item>apples</item>
 </produce>
</list>

To help get things in better shape, Example 8-15, the stylesheet shopping.xsl, uses sort twice to sort different node-sets.

Example 8-15. A stylesheet for sorting the grocery list on two levels
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
   
 <xsl:template match="list">
   <xsl:apply-templates select="*">
    <xsl:sort select="name(  )"/>
   </xsl:apply-templates>
 </xsl:template>
   
 <xsl:template match="*">
   <xsl:text>Section: </xsl:text>
   <xsl:value-of select="name(  )"/>
   <xsl:text>&#10;</xsl:text>
   <xsl:apply-templates select="item">
    <xsl:sort/>
   </xsl:apply-templates>
 </xsl:template>
   
 <xsl:template match="item">
   <xsl:text> * </xsl:text>
    <xsl:apply-templates/>
   <xsl:text>&#10;</xsl:text>
 </xsl:template>
   
</xsl:stylesheet>

Example 8-15 outputs plain text. The first template in this stylesheet matches list and then sorts on the names (using name( )) of the element children (using *) of list. This is the first sort. The second template uses the pattern *, which can match any element, but because list and item have their own, more specific templates, it ends up handling only the element children of list. After inserting some text (such as Section:) and the name of the element (again with name( )), the template sorts the item children by their text content. This is the second sort.

Finally, the last template matches item elements, prefixing text nodes with an asterisk (a bullet) in the result tree, and throwing in a line break after the text.

To see the results, type the command:

xalan shopping.xml shopping.xsl

and you will see Example 8-16.

Example 8-16. The sorted list of groceries produced by running shopping.xsl
Section: bakery
 * bread
 * jelly doughnuts
 * rolls
Section: freezer
 * green beans
 * ice cream
 * peas
 * pot pie
Section: produce
 * apples
 * bananas
 * kumquats

Originally, in the source document, the child elements of list were ordered freezer, bakery, and produce. Now they are alphabetically correct, that is, bakery, freezer, and produce. The children of each of these elements—all item elements—are correctly ordered as well.

Using copy and copy-of, the stylesheet in Example 8-17 (list.xsl) generates an XML result.

Example 8-17. A stylesheet for producing XML alphabetized by its content
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
   
 <xsl:template match="list">
  <xsl:copy>
   <xsl:apply-templates select="*">
    <xsl:sort select="name(  )"/>
   </xsl:apply-templates>
  </xsl:copy>
 </xsl:template>
   
 <xsl:template match="*">
  <xsl:copy>
   <xsl:apply-templates select="item">
    <xsl:sort/>
   </xsl:apply-templates>
  </xsl:copy>
 </xsl:template>
   
 <xsl:template match="item">
  <xsl:copy-of select="."/>
 </xsl:template>
   
</xsl:stylesheet>

As a result of the following command:

xalan -i 1 shopping.xml list.xsl

you’ll get nicely alphabetized nodes, as shown in Example 8-18.

Example 8-18. The sorted list of groceries produced by running list.xsl
<?xml version="1.0" encoding="UTF-8"?>
<list>
 <bakery>
  <item>bread</item>
  <item>jelly doughnuts</item>
  <item>rolls</item>
 </bakery>
 <freezer>
  <item>green beans</item>
  <item>ice cream</item>
  <item>peas</item>
  <item>pot pie</item>
 </freezer>
 <produce>
  <item>apples</item>
  <item>bananas</item>
  <item>kumquats</item>
 </produce>
</list>

The lang and case-order Attributes

One attribute of sort that I haven’t discussed is lang. This optional attribute lets you specify a language token, such as ja or ru, so that sorting rules will be determined by the conventions of a language such as Japanese or Russian (Cyrillic). If the lang attribute is absent, an XSLT processor is supposed to determine the language from the system environment. This attribute does not yet appear to be supported by all processors, but the structure for that support is in place and over time, as demand arises for XSLT in more languages, support for lang will broaden.

Another attribute I haven’t covered is the optional case-order. This attribute is supposed to allow you to sort by uppercase first using an attribute value of upper-first, or to sort by lowercase first with lower-first. The XSLT specification, however, allows for this to be language dependent, and so there are varying interpretations of how this is supposed to work. In some languages, a word may have a different meaning based on capitalization rather than spelling. In such cases, case-order will be useful.
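
For reference, here is how the two attributes would be written. This is a sketch only: the words and word element names are hypothetical, and because support for lang and case-order varies between processors, the result may differ from one processor to another.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

 <!-- Assumed input (hypothetical):
      <words><word>apple</word><word>Apple</word><word>banana</word></words> -->
 <xsl:template match="words">
   <xsl:apply-templates select="word">
    <!-- Request English sorting rules, with uppercase before lowercase
         when two words differ only in case -->
    <xsl:sort lang="en" case-order="upper-first"/>
   </xsl:apply-templates>
 </xsl:template>

 <xsl:template match="word">
   <xsl:value-of select="."/>
   <xsl:text>&#10;</xsl:text>
 </xsl:template>

</xsl:stylesheet>
```

A processor that honors both attributes would be expected to place Apple before apple, and both before banana.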

Summary

You’ve learned how to sort alphabetically, in ascending and descending order, and by numbers. With this foundation, you may already have an appetite for advanced information on sorting, which you can find in Chapter 4 of Michael Kay’s XSLT Programmer’s Reference (Wrox) or in Chapter 6 of Doug Tidwell’s XSLT (O’Reilly). In the next chapter, you’ll learn how to add formatted numbers to the result tree.
