Joins

Problem

You want to relate elements in a document to other elements in the same document or in a different one.

Solution

A join considers all pairs of elements as potentially related (i.e., a Cartesian product) and keeps only those pairs that satisfy the join relationship (usually equality).

To demonstrate, I have adapted the suppliers-and-parts database found in Date’s An Introduction to Database Systems (Addison-Wesley, 1986) to XML:

<database>
  <suppliers>
    <supplier id="S1" name="Smith" status="20" city="London"/>
    <supplier id="S2" name="Jones" status="10" city="Paris"/>
    <supplier id="S3" name="Blake" status="30" city="Paris"/>
    <supplier id="S4" name="Clark" status="20" city="London"/>
    <supplier id="S5" name="Adams" status="30" city="Athens"/>
  </suppliers>
  <parts>
    <part id="P1" name="Nut" color="Red" weight="12" city="London"/>
    <part id="P2" name="Bult" color="Green" weight="17" city="Paris"/>
    <part id="P3" name="Screw" color="Blue" weight="17" city="Rome"/>
    <part id="P4" name="Screw" color="Red" weight="14" city="London"/>
    <part id="P5" name="Cam" color="Blue" weight="12" city="Paris"/>
    <part id="P6" name="Cog" color="Red" weight="19" city="London"/>
  </parts>
  <inventory>
    <invrec sid="S1" pid="P1" qty="300"/>
    <invrec sid="S1" pid="P2" qty="200"/>
    <invrec sid="S1" pid="P3" qty="400"/>
    <invrec sid="S1" pid="P4" qty="200"/>
    <invrec sid="S1" pid="P5" qty="100"/>
    <invrec sid="S1" pid="P6" qty="100"/>
    <invrec sid="S2" pid="P1" qty="300"/>
    <invrec sid="S2" pid="P2" qty="400"/>
    <invrec sid="S3" pid="P2" qty="200"/>
    <invrec sid="S4" pid="P2" qty="200"/>
    <invrec sid="S4" pid="P4" qty="300"/>
    <invrec sid="S4" pid="P5" qty="400"/>
  </inventory>
</database>

The join to be performed will answer the question, “Which suppliers and parts are in the same city (co-located)?”

You can use two basic techniques to approach this problem in XSLT. The first uses nested for-each loops:

<xsl:template match="/">
  <result>
    <xsl:for-each select="database/suppliers/*">
      <xsl:variable name="supplier" select="."/>
      <xsl:for-each select="/database/parts/*[@city=current()/@city]">
        <colocated>
          <xsl:copy-of select="$supplier"/>
          <xsl:copy-of select="."/>
        </colocated>
      </xsl:for-each>
    </xsl:for-each>
  </result>
</xsl:template>

The second approach uses apply-templates:

<xsl:template match="/">
  <result>
    <xsl:apply-templates select="database/suppliers/supplier" />
  </result>
</xsl:template>
   
<xsl:template match="supplier">
  <xsl:apply-templates select="/database/parts/part[@city = current()/@city]">
    <xsl:with-param name="supplier" select="." />
  </xsl:apply-templates>
</xsl:template>
   
<xsl:template match="part">
  <xsl:param name="supplier" select="/.." />
  <colocated>
    <xsl:copy-of select="$supplier" />
    <xsl:copy-of select="." />
  </colocated>
</xsl:template>

If one of the sets of elements to be joined has a large number of members, then consider using xsl:key to improve performance:

<xsl:key name="part-city" match="part" use="@city"/>
   
<xsl:template match="/">
  <result>
    <xsl:for-each select="database/suppliers/*">
      <xsl:variable name="supplier" select="."/>
      <xsl:for-each select="key('part-city',$supplier/@city)">
        <colocated>
          <xsl:copy-of select="$supplier"/>
          <xsl:copy-of select="."/>
        </colocated>
      </xsl:for-each>
    </xsl:for-each>
  </result>
</xsl:template>

Each stylesheet produces the same result:

<result>
   <colocated>
      <supplier id="S1" name="Smith" status="20" city="London"/>
      <part id="P1" name="Nut" color="Red" weight="12" city="London"/>
   </colocated>
   <colocated>
      <supplier id="S1" name="Smith" status="20" city="London"/>
      <part id="P4" name="Screw" color="Red" weight="14" city="London"/>
   </colocated>
   <colocated>
      <supplier id="S1" name="Smith" status="20" city="London"/>
      <part id="P6" name="Cog" color="Red" weight="19" city="London"/>
   </colocated>
   <colocated>
      <supplier id="S2" name="Jones" status="10" city="Paris"/>
      <part id="P2" name="Bult" color="Green" weight="17" city="Paris"/>
   </colocated>
   <colocated>
      <supplier id="S2" name="Jones" status="10" city="Paris"/>
      <part id="P5" name="Cam" color="Blue" weight="12" city="Paris"/>
   </colocated>
   <colocated>
      <supplier id="S3" name="Blake" status="30" city="Paris"/>
      <part id="P2" name="Bult" color="Green" weight="17" city="Paris"/>
   </colocated>
   <colocated>
      <supplier id="S3" name="Blake" status="30" city="Paris"/>
      <part id="P5" name="Cam" color="Blue" weight="12" city="Paris"/>
   </colocated>
   <colocated>
      <supplier id="S4" name="Clark" status="20" city="London"/>
      <part id="P1" name="Nut" color="Red" weight="12" city="London"/>
   </colocated>
   <colocated>
      <supplier id="S4" name="Clark" status="20" city="London"/>
      <part id="P4" name="Screw" color="Red" weight="14" city="London"/>
   </colocated>
   <colocated>
      <supplier id="S4" name="Clark" status="20" city="London"/>
      <part id="P6" name="Cog" color="Red" weight="19" city="London"/>
   </colocated>
</result>
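
Read literally, the definition of a join given at the start of this recipe corresponds to a stylesheet that visits every supplier/part pair and only then tests the join relationship. The following is a minimal sketch of that unoptimized form, not a recommended technique; for the sample database it should select the same pairs in the same order, but it visits all 30 combinations instead of letting a predicate or key narrow the inner node set:

<xsl:template match="/">
  <result>
    <!-- Outer loop: every supplier -->
    <xsl:for-each select="database/suppliers/supplier">
      <xsl:variable name="supplier" select="."/>
      <!-- Inner loop: every part, unfiltered (the Cartesian product) -->
      <xsl:for-each select="/database/parts/part">
        <!-- Keep only the pairs that satisfy the join relationship -->
        <xsl:if test="@city = $supplier/@city">
          <colocated>
            <xsl:copy-of select="$supplier"/>
            <xsl:copy-of select="."/>
          </colocated>
        </xsl:if>
      </xsl:for-each>
    </xsl:for-each>
  </result>
</xsl:template>

Moving the test into the select predicate, as the first solution does, or into an xsl:key, as the third does, expresses the same join while giving the processor a chance to avoid the full product.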

Discussion

The join you performed is called an equi-join because the elements are related by equality. More generally, joins can be formed using other relations. For example, consider the query, “Select all combinations of supplier and part information for which the supplier city follows the part city in alphabetical order.”

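It would be nice if you could simply write the following stylesheet, but XSLT 1.0 does not define relational operations on string types. The operators <, <=, > and >= convert both operands to numbers, so comparing two city names compares NaN with NaN and is always false: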

<xsl:template match="/">
  <result>
    <xsl:for-each select="database/suppliers/*">
      <xsl:variable name="supplier" select="."/>
      <!-- This does not work! -->
      <xsl:for-each select="/database/parts/*[current()/@city > @city]">
        <colocated>
          <xsl:copy-of select="$supplier"/>
          <xsl:copy-of select="."/>
        </colocated>
      </xsl:for-each>
    </xsl:for-each>
  </result>
</xsl:template>

Instead, you must use xsl:sort to build a table that maps city names onto integers reflecting their sort order. The stylesheet below relies on Saxon’s ability to treat variables containing result-tree fragments as node sets when the version is set to 1.1. Alternatively, you can use the node-set extension function of your particular XSLT 1.0 processor or use an XSLT 2.0 processor:

<xsl:stylesheet version="1.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
   
<xsl:variable name="unique-cities" 
     select="//@city[not(. = ../preceding::*/@city)]"/>
   
<xsl:variable name="city-ordering">
  <xsl:for-each select="$unique-cities">
    <xsl:sort select="."/>
    <city name="{.}" order="{position()}"/>
  </xsl:for-each>
</xsl:variable>      
   
<xsl:template match="/">
  <result>
    <xsl:for-each select="database/suppliers/*">
      <xsl:variable name="s" select="."/>
      <xsl:for-each select="/database/parts/*">
        <xsl:variable name="p" select="."/>
        <xsl:if 
          test="$city-ordering/*[@name = $s/@city]/@order &gt; 
               $city-ordering/*[@name = $p/@city]/@order">
          <supplier-city-follows-part-city>
            <xsl:copy-of select="$s"/>
            <xsl:copy-of select="$p"/>
          </supplier-city-follows-part-city>
        </xsl:if>
      </xsl:for-each>
    </xsl:for-each>
  </result>
</xsl:template>
  
</xsl:stylesheet>
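
If your processor does not convert result-tree fragments implicitly, the table can be converted explicitly. The following sketch assumes a processor that supports the EXSLT function exsl:node-set in the http://exslt.org/common namespace; other processors provide an equivalent function in their own namespace. The variable name city-ordering-rtf is introduced here only for illustration, and the main template is unchanged:

<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:exsl="http://exslt.org/common"
     exclude-result-prefixes="exsl">
     <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:variable name="unique-cities"
     select="//@city[not(. = ../preceding::*/@city)]"/>

<!-- Result-tree fragment: one city element per distinct city, in sorted order -->
<xsl:variable name="city-ordering-rtf">
  <xsl:for-each select="$unique-cities">
    <xsl:sort select="."/>
    <city name="{.}" order="{position()}"/>
  </xsl:for-each>
</xsl:variable>

<!-- Explicit conversion; exsl:node-set is an extension, not part of XSLT 1.0 -->
<xsl:variable name="city-ordering" select="exsl:node-set($city-ordering-rtf)"/>

<xsl:template match="/">
  <result>
    <xsl:for-each select="database/suppliers/*">
      <xsl:variable name="s" select="."/>
      <xsl:for-each select="/database/parts/*">
        <xsl:variable name="p" select="."/>
        <!-- Both @order values are converted to numbers before comparison -->
        <xsl:if
          test="$city-ordering/*[@name = $s/@city]/@order &gt;
               $city-ordering/*[@name = $p/@city]/@order">
          <supplier-city-follows-part-city>
            <xsl:copy-of select="$s"/>
            <xsl:copy-of select="$p"/>
          </supplier-city-follows-part-city>
        </xsl:if>
      </xsl:for-each>
    </xsl:for-each>
  </result>
</xsl:template>

</xsl:stylesheet>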

This query results in the following output:

<result>
   <supplier-city-follows-part-city>
      <supplier id="S2" name="Jones" status="10" city="Paris"/>
      <part id="P1" name="Nut" color="Red" weight="12" city="London"/>
   </supplier-city-follows-part-city>
   <supplier-city-follows-part-city>
      <supplier id="S2" name="Jones" status="10" city="Paris"/>
      <part id="P4" name="Screw" color="Red" weight="14" city="London"/>
   </supplier-city-follows-part-city>
   <supplier-city-follows-part-city>
      <supplier id="S2" name="Jones" status="10" city="Paris"/>
      <part id="P6" name="Cog" color="Red" weight="19" city="London"/>
   </supplier-city-follows-part-city>
   <supplier-city-follows-part-city>
      <supplier id="S3" name="Blake" status="30" city="Paris"/>
      <part id="P1" name="Nut" color="Red" weight="12" city="London"/>
   </supplier-city-follows-part-city>
   <supplier-city-follows-part-city>
      <supplier id="S3" name="Blake" status="30" city="Paris"/>
      <part id="P4" name="Screw" color="Red" weight="14" city="London"/>
   </supplier-city-follows-part-city>
   <supplier-city-follows-part-city>
      <supplier id="S3" name="Blake" status="30" city="Paris"/>
      <part id="P6" name="Cog" color="Red" weight="19" city="London"/>
   </supplier-city-follows-part-city>
</result>
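
With an XSLT 2.0 processor, the lookup table is unnecessary because XPath 2.0 can compare strings directly. The following is a minimal sketch that uses the standard compare function, which returns 1 when its first argument sorts after its second. It assumes the processor's default collation orders these city names the same way xsl:sort does in the stylesheet above; for the sample data it should select the same six pairs:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <result>
      <xsl:for-each select="database/suppliers/supplier">
        <xsl:variable name="s" select="."/>
        <!-- Keep the parts whose city sorts before the supplier's city -->
        <xsl:for-each select="/database/parts/part[compare($s/@city, @city) gt 0]">
          <supplier-city-follows-part-city>
            <xsl:copy-of select="$s"/>
            <xsl:copy-of select="."/>
          </supplier-city-follows-part-city>
        </xsl:for-each>
      </xsl:for-each>
    </result>
  </xsl:template>

</xsl:stylesheet>

If the ordering must follow a particular language's rules, compare also accepts a collation URI as a third argument; which URIs are available depends on the processor.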