Transform elements into attributes and back the other way with XSLT.
You’re sitting in a conference room, leaning back in your chair. Opinions are flying back and forth across the room about whether to represent the XML data from a new application in element or attribute form.
One engineer says, combing his beard with his fingers, “You don’t want to use attributes at all. What if down the line you need more than one attribute with the same name? You can’t do that in XML. You can only use one attribute with a given name.”
“Attributes contain metadata about elements,” another barks. “You don’t store metadata in element content. Period. That’s where the real data goes.”
You rock forward in your chair. “Excuse me,” you say with a chuckle, “but none of these arguments matter.” The room goes silent. Your project manager’s nostrils flare. “You better explain yourself,” she says, taking the last swig of her spring water.
“Gladly,” you say. “I’ve got a pair of XSLT stylesheets that can transform the data easily between element and attribute forms in seconds. Walk with me to my cubicle and I’ll give you a demo.”
In reference to the element-or-attribute debate, Michael Kay has wisely said: “Beginners always ask this question. Those with a little experience express their opinions passionately. Experts tell you there is no right answer” (http://lists.xml.org/archives/xml-dev/200006/msg00285.html). This hack will allow you to keep changing your mind.
XML document design does matter, and it’s worthwhile to consider some questions when deciding between elements and attributes:
Are you dealing with data or metadata (data about data)? Elements are generally a good fit for data, and attributes are a good fit for metadata.
Is there a possibility of name conflicts when labeling data? If so, remember that you can have only one attribute with a given name per element.
Should the data be structured (i.e., does it have a logical relationship with nearby markup)? You can’t use XML to structure attribute values.
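The second and third questions are easier to see in markup. The fragment below is a hypothetical illustration (the schedule and alarm names do not come from time.xml): child elements may repeat and nest, while a second attribute with the same name would make the document not well-formed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- hypothetical document, for illustration only -->
<schedule>
 <!-- elements may repeat and may carry structure of their own -->
 <time timezone="PST">
  <alarm>06:30</alarm>
  <alarm>07:00</alarm>
 </time>
 <!-- not well-formed if uncommented: the alarm attribute appears twice -->
 <!-- <time timezone="PST" alarm="06:30" alarm="07:00"/> -->
</schedule>
```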
It’s nice to come away from a meeting like that sounding and looking like a genius. That’s one reason why this book contains a hack on how to use XSLT to convert XML elements to attributes or attributes to elements. You’ll recall our tried and true time.xml document:
<?xml version="1.0" encoding="UTF-8"?>
<!-- a time instant -->
<time timezone="PST">
 <hour>11</hour>
 <minute>59</minute>
 <second>59</second>
 <meridiem>p.m.</meridiem>
 <atomic signal="true"/>
</time>
Let’s say you want to convert the elements hour, minute, second, and so forth into attributes. You can do it with the stylesheet elem2attr.xsl shown in Example 3-17.
Example 3-17. elem2attr.xsl
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>

<xsl:template match="time">
 <xsl:copy>
  <xsl:copy-of select="@timezone"/>
  <xsl:for-each select="*">
   <xsl:attribute name="{name(.)}">
    <xsl:value-of select="."/>
    <xsl:value-of select="@signal"/>
   </xsl:attribute>
  </xsl:for-each>
 </xsl:copy>
</xsl:template>

</xsl:stylesheet>
This stylesheet could be adapted to match the needs of your XML data.
On line 4, the single template in this stylesheet matches the document element time (a built-in template first matches the root node, though this is not evident from the markup). The stylesheet copies the time element (line 5), and the copy-of on line 6 copies over the timezone attribute. Then the for-each element marches through all the child elements (select="*") of time (line 7). For each element found, an attribute is created (line 8), using the element names as attribute names (that’s what the name() function does). The element content is retrieved with value-of (line 9), as well as the value of the signal attribute (using @signal on line 10). Because atomic is empty and the other elements have no signal attribute, only one of the two value-of instructions contributes text to any given attribute.
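As one possible adaptation, the following is a minimal sketch of a more generic variant, not the stylesheet from this hack. It assumes a document without namespaces and treats a child element as convertible only when it has no child elements and no attributes of its own. Everything else is passed through by an identity template. If two convertible children share a name, only one attribute of that name can survive, so this sketch suits data where child names are unique.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>

<!-- identity template: copy anything not handled below -->
<xsl:template match="@*|node()">
 <xsl:copy>
  <xsl:apply-templates select="@*|node()"/>
 </xsl:copy>
</xsl:template>

<!-- any element that has at least one convertible ("leaf") child element -->
<xsl:template match="*[*[not(*) and not(@*)]]">
 <xsl:copy>
  <!-- keep the attributes the element already has -->
  <xsl:copy-of select="@*"/>
  <!-- turn each leaf child into an attribute of the same name -->
  <xsl:for-each select="*[not(*) and not(@*)]">
   <xsl:attribute name="{name()}">
    <xsl:value-of select="."/>
   </xsl:attribute>
  </xsl:for-each>
  <!-- process the remaining children (non-leaf elements, text, comments) -->
  <xsl:apply-templates select="node()[not(self::*[not(*) and not(@*)])]"/>
 </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

Applied to time.xml, this sketch should leave atomic as a child element, because atomic carries an attribute and so is not treated as a leaf. That avoids folding the signal value into an attribute named atomic, at the cost of a result that still mixes elements and attributes.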
Apply this stylesheet to time.xml using Xalan with this command:
xalan -o attr.xml time.xml elem2attr.xsl
You can see in attr.xml that all the element names and content have been converted to attribute names and values (plus the timezone attribute has been carried over):
<?xml version="1.0" encoding="UTF-8"?>
<time timezone="PST" hour="11" minute="59" second="59" meridiem="p.m." atomic="true"/>
A little information is lost in the transformation: the signal attribute’s value of true is assigned to the new attribute atomic, so the name signal no longer appears anywhere in the result. This information is restored with the next stylesheet, attr2elem.xsl.
Now let’s take it back in the other direction with attr2elem.xsl, shown in Example 3-18.
Example 3-18. attr2elem.xsl
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>

<xsl:template match="time">
 <xsl:comment> a time instant </xsl:comment>
 <xsl:copy>
  <xsl:copy-of select="@timezone"/>

  <xsl:apply-templates select="@*"/>
 </xsl:copy>
</xsl:template>

<xsl:template match="@*">
 <xsl:element name="{name(.)}">
  <xsl:value-of select="."/>
 </xsl:element>
</xsl:template>

<xsl:template match="@timezone"/>

<xsl:template match="@atomic">
 <xsl:element name="{name(.)}">
  <xsl:attribute name="signal"><xsl:value-of select="."/>
  </xsl:attribute>
 </xsl:element>
</xsl:template>

</xsl:stylesheet>
As with the previous stylesheet in this hack, adapt this one to your needs. The first template matches time (line 4). A comment is created (line 5), the time element is copied into the result tree (line 6), and the timezone attribute is re-created on time (line 7). Then apply-templates selects all attributes associated with time (@* on line 9).
The template on line 13 matches all attributes and creates an element for each one found using name(.). When apply-templates on line 9 selects @timezone, it finds a template written specifically for the timezone attribute (line 19), which takes precedence over the more general one on line 13. That template is empty, so it does nothing. This is done because the stylesheet already re-created the timezone attribute on line 7. (The timezone attribute must be re-created before any elements are created, because XSLT does not allow an attribute to be added to an element once child nodes have been added to it. That is why the stylesheet deals with it explicitly rather than leaving it to the templates that follow.)
When apply-templates selects @atomic, the template on line 21 is instantiated, which creates an atomic element with a signal attribute, just as in time.xml.
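The empty template for timezone is one way to keep that attribute from becoming an element. A hedged alternative, sketched here rather than taken from the hack, is to filter it out in the select expression with a predicate, so no empty template is needed. The sketch also uses a literal result element with an attribute value template for atomic, which is a stylistic choice and not a requirement.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>

<xsl:template match="time">
 <xsl:comment> a time instant </xsl:comment>
 <xsl:copy>
  <!-- attributes must be added before any child elements -->
  <xsl:copy-of select="@timezone"/>
  <!-- select every attribute except timezone -->
  <xsl:apply-templates select="@*[name() != 'timezone']"/>
 </xsl:copy>
</xsl:template>

<!-- general case: attribute name becomes element name, value becomes content -->
<xsl:template match="@*">
 <xsl:element name="{name()}">
  <xsl:value-of select="."/>
 </xsl:element>
</xsl:template>

<!-- special case: restore the signal attribute on an empty atomic element -->
<xsl:template match="@atomic">
 <atomic signal="{.}"/>
</xsl:template>

</xsl:stylesheet>
```

The predicate keeps the exception next to the instruction that needs it. The empty-template approach scales better when several attributes must stay attributes, since each one gets its own one-line rule.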
Apply this to attr.xml using:
xalan -i 1 attr.xml attr2elem.xsl
and you will get the following result, which looks an awful lot like time.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!-- a time instant -->
<time timezone="PST">
 <hour>11</hour>
 <minute>59</minute>
 <second>59</second>
 <meridiem>p.m.</meridiem>
 <atomic signal="true"/>
</time>
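If your data has no special cases like timezone or atomic, a generic version may be enough. The following is a minimal sketch under that assumption, and it also assumes a document without namespaces. It turns every attribute on every element into a child element and copies text, comments, and processing instructions unchanged.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>

<!-- copy comments, processing instructions, and text unchanged -->
<xsl:template match="comment()|processing-instruction()|text()">
 <xsl:copy/>
</xsl:template>

<!-- rebuild every element, turning each attribute into a child element -->
<xsl:template match="*">
 <xsl:copy>
  <xsl:for-each select="@*">
   <xsl:element name="{name()}">
    <xsl:value-of select="."/>
   </xsl:element>
  </xsl:for-each>
  <!-- then process the element's existing children -->
  <xsl:apply-templates select="node()"/>
 </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

Note that this sketch would not reproduce time.xml from attr.xml: timezone would become an element, and atomic would hold the text true instead of a signal attribute. Those are exactly the cases the dedicated templates in Example 3-18 exist to handle.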
Sal Mangano’s XSLT Cookbook (O’Reilly), pages 202-206, which inspired this hack
Peter Flynn’s XML FAQ, “Which should I use in my DTD, attributes or elements?”: http://www.ucc.ie/xml/#attriborelem
Robin Cover’s CoverPages “Using Elements and Attributes”: http://xml.coverpages.org/elementsAndAttrs.html
Uche Ogbuji’s IBM developerWorks article “When to Use Elements Versus Attributes”: http://www-106.ibm.com/developerworks/xml/library/x-eleatt.html