Chapter 6. XML to XML

To change and to change for the better are two different things.

German proverb

One of the beauties of XML is that if you don’t like something, you can change it. Since it is impossible to please everyone, transforming XML to XML is extremely common. However, you will not transform XML only to improve the structure of a poorly designed schema. Sometimes you need to merge disparate XML documents into a single document. At other times you want to break up a large document into smaller subdocuments. You might also wish to preprocess a document to extract only the relevant information, without changing its structure, before sending it off for further processing.

A simple but important tool in many XML-to-XML transformations is the identity transform. This tool is a stylesheet that copies an input document to an output document without changing it. This task may seem better suited to the operating system’s copy operation, but as the following examples demonstrate, this simple stylesheet can be imported into other stylesheets to yield very common types of transformations with little added coding effort.

Example 6-1 shows the identity stylesheet. I actually prefer calling this stylesheet the copying stylesheet, and I call the techniques that utilize it the overriding copy idiom.

Example 6-1. copy.xslt

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   
<xsl:template match="node() | @*">
  <xsl:copy>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template>
   
</xsl:stylesheet>
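
As a minimal sketch of the overriding copy idiom, the following stylesheet imports copy.xslt and overrides the copy behavior for a single node type. The choice of comments as the thing to drop is only an illustration and is not taken from the recipes in this chapter; it also assumes that copy.xslt sits in the same directory as the importing stylesheet.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- Pull in the copy-everything template from Example 6-1 -->
<xsl:import href="copy.xslt"/>

<!-- Override: an empty template emits nothing, so comments are dropped.
     Every other node still falls through to the imported copy template. -->
<xsl:template match="comment()"/>

</xsl:stylesheet>
```

The override wins because templates in an importing stylesheet have higher import precedence than templates in the stylesheet they import, regardless of how specific the match patterns are.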

Converting Attributes to Elements

Problem

You have a document that encodes information with attributes, and you would like to use child elements instead.

Solution

This problem is tailor-made for what the introduction to this chapter calls the overriding copy idiom. This example transforms attributes to elements globally:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   
<xsl:import href="copy.xslt"/>
   
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
   
<xsl:template match="@*">
  <xsl:element name="{local-name(.)}" namespace="{namespace-uri(..)}">
    <xsl:value-of select="."/>
  </xsl:element>  
</xsl:template>
   
</xsl:stylesheet>

The stylesheet works by overriding the copy behavior for attributes. It replaces the behavior with a template that converts an attribute into an element (of the same name) whose value is the attribute’s value. It also assumes that this new element should be in the same namespace as the attribute’s parent, even when the attribute has a namespace of its own. If you prefer to keep the attribute’s own namespace when it has one, and fall back to the parent’s namespace only when it does not, then use the following code:

<xsl:template match="@*">
  <xsl:variable name="namespace">
    <xsl:choose>
      <!-- Use the namespace of the attribute, if there is one -->
      <xsl:when test="namespace-uri()">
        <xsl:value-of select="namespace-uri()" />
      </xsl:when>
      <!-- Otherwise use the parent's namespace -->
      <xsl:otherwise>
        <xsl:value-of select="namespace-uri(..)" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>
  <xsl:element name="{name()}" namespace="{$namespace}">
    <xsl:value-of select="." />
  </xsl:element>
</xsl:template>
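
To see where the two templates differ, consider a small hypothetical input. The element names and the http://example.com/doc namespace are invented for illustration; only the XLink namespace URI is a real one.

```xml
<doc xmlns="http://example.com/doc"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- id has no namespace; xlink:href is in the XLink namespace -->
  <ref id="r1" xlink:href="other.xml"/>
</doc>
```

With the first template, both attributes should become elements in the parent's namespace, that is, id and href elements in http://example.com/doc. With the second template, id should still land in the parent's namespace, while xlink:href should become an xlink:href element in the XLink namespace. The relative order of the two new elements follows the processor's attribute order, which is not guaranteed to match the source.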

You’ll often want to be selective when transforming attributes to elements (see Examples 6-2 through 6-4).

Example 6-2. Input

<people which="MeAndMyFriends">
  <person firstname="Sal" lastname="Mangano" age="38" height="5.75"/>
  <person firstname="Mike" lastname="Palmieri" age="28" height="5.10"/>
  <person firstname="Vito" lastname="Palmieri" age="38" height="6.0"/>
  <person firstname="Vinny" lastname="Mari" age="37" height="5.8"/>
</people>

Example 6-3. A stylesheet that transforms person attributes only

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   
<xsl:import href="copy.xslt"/>
   
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
   
<xsl:template match="person/@*">
  <xsl:element name="{local-name(.)}" namespace="{namespace-uri(..)}">
    <xsl:value-of select="."/>
  </xsl:element>  
</xsl:template>
   
</xsl:stylesheet>

Example 6-4. Output

<people which="MeAndMyFriends">
   
   <person>
      <firstname>Sal</firstname>
      <lastname>Mangano</lastname>
      <age>38</age>
      <height>5.75</height>
   </person>
   
   <person>
      <firstname>Mike</firstname>
      <lastname>Palmieri</lastname>
      <age>28</age>
      <height>5.10</height>
   </person>
   
   <person>
      <firstname>Vito</firstname>
      <lastname>Palmieri</lastname>
      <age>38</age>
      <height>6.0</height>
   </person>
   
   <person>
      <firstname>Vinny</firstname>
      <lastname>Mari</lastname>
      <age>37</age>
      <height>5.8</height>
   </person>
   
</people>
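
Example 6-3 converts every attribute of person, so it never has to write an attribute after a child element. If you convert only some attributes, order starts to matter: XSLT 1.0 treats adding an attribute to an element that already has children as an error, and the order in which a processor visits attributes is implementation-dependent. The following is a sketch of one way to stay safe, assuming you want to convert only age and height; the mode name as-element is made up for this example.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:import href="copy.xslt"/>

<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<!-- Rebuild each person so attributes are always written before children -->
<xsl:template match="person">
  <xsl:copy>
    <!-- 1. Attributes that stay attributes (copied by copy.xslt) -->
    <xsl:apply-templates select="@*[not(name() = 'age' or name() = 'height')]"/>
    <!-- 2. Attributes that become child elements -->
    <xsl:apply-templates select="@age | @height" mode="as-element"/>
    <!-- 3. Existing children, text, comments, and so on -->
    <xsl:apply-templates select="node()"/>
  </xsl:copy>
</xsl:template>

<!-- Same conversion as Example 6-3, but reachable only through the mode -->
<xsl:template match="@*" mode="as-element">
  <xsl:element name="{local-name(.)}" namespace="{namespace-uri(..)}">
    <xsl:value-of select="."/>
  </xsl:element>
</xsl:template>

</xsl:stylesheet>
```

Applied to Example 6-2, each person should keep firstname and lastname as attributes and gain age and height child elements.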

Discussion

This section and Recipe 6.2 address the problems that arise when a document designer makes a poor choice between encoding information in attributes and encoding it in elements. The attribute-versus-element decision is one of the most controversial aspects of document design.[9] These examples are helpful because they allow you to correct your own or others’ (perceived) mistakes.



[9] The only other stylistic issue I have seen software developers get more passionate about is where to put the curly braces in C-like programming languages (e.g., C++ and Java).
