Ant is an extensible, open source build tool written in Java and sponsored by Apache. It can also be used as a framework for performing a large variety of operations (including XML-related tasks) in a single step.
Ant (http://ant.apache.org) uses build files that are written in XML, and takes advantage of XML in a variety of ways. It’s a suitable (if not ideal) framework for XML pipelining—it’s open, mature, stable, readily available, widely known and used, easily extensible, and already amenable to XML processing. What else could you ask for?
In this hack, I’ll show you the XML structures in an Ant build file, named build.xml by default; talk about some common XML-related tasks that Ant can perform; and end with an example of XML pipelining.
To get the examples in this hack to work, you’ll need to download and install Ant Version 1.6.1 (or later) binaries from http://ant.apache.org/bindownload.cgi. Because you’ll be using an external task that validates with RELAX NG (http://www.relaxng.org) schemas, you’ll also need James Clark’s Jing (http://www.thaiopensource.com/relaxng/jing.html).
Ant has a task for validating XML documents called xmlvalidate. By default, Ant validates with Xerces. The XML document valid.xml is shown in Example 7-1.
Example 7-1. valid.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE time SYSTEM "time.dtd">

<!-- a time instant -->
<time timezone="PST">
 <hour>11</hour>
 <minute>59</minute>
 <second>59</second>
 <meridiem>p.m.</meridiem>
 <atomic signal="true"/>
</time>
It points to the DTD time.dtd (Example 7-2).
Example 7-2. time.dtd
<!ELEMENT time (hour,minute,second,meridiem,atomic)>
<!ATTLIST time timezone CDATA #REQUIRED>
<!ELEMENT hour (#PCDATA)>
<!ELEMENT minute (#PCDATA)>
<!ELEMENT second (#PCDATA)>
<!ELEMENT meridiem (#PCDATA)>
<!ELEMENT atomic EMPTY>
<!ATTLIST atomic signal CDATA #REQUIRED>
You can validate valid.xml with the build file build.xml, which uses the xmlvalidate task (Example 7-3).
Example 7-3. build.xml
<?xml version="1.0"?>

<project default="valid">

 <target name="valid">
  <xmlvalidate file="valid.xml"/>
 </target>

</project>
The target element is a child of project and must have a name attribute. Here the value of that attribute, valid, matches the value of the default attribute of project, so this is the target that runs when you don’t name one on the command line. When a build file contains more than one target, the value of default matches the name attribute of exactly one of them. The target element also has several other attributes not shown here. On the xmlvalidate element, the file attribute specifies the document to validate (in this case, valid.xml).
In the working directory, and with Ant installed and in the path, issue the command:
ant
Ant knows to look for the build.xml file, and to take its orders from there. The ant command produces the following output, if successful:
Buildfile: build.xml

valid:
[xmlvalidate] 1 file(s) have been successfully validated.

BUILD SUCCESSFUL
Total time: 1 second
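To see how default picks one target out of several, here is a minimal sketch that adds a second target to Example 7-3. The hello target and its message are my own illustration, not part of the example archive.

```xml
<?xml version="1.0"?>
<project default="valid">

 <!-- Runs when you type "ant" with no arguments, because default="valid" -->
 <target name="valid">
  <xmlvalidate file="valid.xml"/>
 </target>

 <!-- Hypothetical second target; runs only when requested, as in "ant hello" -->
 <target name="hello">
  <echo message="Hello from the hello target"/>
 </target>

</project>
```

Naming a target on the command line overrides default for that run, so ant hello skips validation and only echoes the message.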
In Ant, types are elements that can help perform tasks, especially on groups of files. For example, using the fileset type as a child of xmlvalidate, you can validate a series of XML documents, as shown in build-fileset.xml (Example 7-4).
Example 7-4. build-fileset.xml
<?xml version="1.0"?>

<project default="valid">

 <target name="valid">
  <xmlvalidate>
   <fileset file="*ternal.xml"/>
  </xmlvalidate>
 </target>

</project>
The file attribute of fileset allows you to specify a series of files with wildcards. If you run this build file, you will see that Ant validates both the internal.xml and external.xml documents in one step.
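Ant’s manual describes file as a shortcut for a single-file fileset, and documents dir plus includes as the usual way to select files by pattern. If the wildcard form gives you trouble, a sketch along these lines should select the same two documents. It assumes they sit in the same directory as the build file.

```xml
<?xml version="1.0"?>
<project default="valid">

 <target name="valid">
  <xmlvalidate>
   <!-- dir is the base directory; includes is a pattern relative to it -->
   <fileset dir="." includes="*ternal.xml"/>
  </xmlvalidate>
 </target>

</project>
```

The includes attribute also accepts a comma-separated list of patterns, which is handy when the files you want don’t share a common suffix.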
The xmlvalidate task has several features that I haven’t mentioned but that are worth looking at, such as checking a document only for well-formedness by using lenient="yes" (see http://ant.apache.org/manual/OptionalTasks/xmlvalidate.html).
One way that you can extend Ant is by writing your own task (instructions on how to do this are found at http://ant.apache.org/manual/develop.html#writingowntask). James Clark has written a task for Jing that allows you to use Ant to validate XML documents against RELAX NG schemas, using either XML or compact syntax. This task is documented at http://www.thaiopensource.com/relaxng/jing-ant.html.
Jing’s source code (JingTask.java) is available for download from http://www.thaiopensource.com/download/jing-20030619.zip, but for convenience I have included a copy of JingTask.java in the example file archive for easy inspection (along with a copy of Jing’s license, jing-copying.txt).
The document time.xml is valid with regard to the RELAX NG schema time.rng, shown in Example 7-5.
Example 7-5. time.rng
<element name="time" xmlns="http://relaxng.org/ns/structure/1.0">
 <attribute name="timezone"/>
 <element name="hour"><text/></element>
 <element name="minute"><text/></element>
 <element name="second"><text/></element>
 <element name="meridiem"><text/></element>
 <element name="atomic">
  <attribute name="signal"/>
 </element>
</element>
To validate time.xml against time.rng with Ant, use the build file build-jing.xml (Example 7-6).
Example 7-6. build-jing.xml
<?xml version="1.0"?>

<project default="rng">

 <taskdef name="jing" classname="com.thaiopensource.relaxng.util.JingTask"/>

 <target name="rng">
  <echo message="Validating with RELAX NG schema using Jing..."/>
  <jing rngfile="time.rng" file="time.xml"/>
 </target>

</project>
The taskdef element defines the jing task in the name attribute, and the classname attribute identifies the class that executes the task. The compiled class is stored in jing.jar. If you place jing.jar in Ant’s lib directory, Ant will be able to find the task. (For example, on my Windows machine, I’ve placed jing.jar in C:\Java\Ant\apache-ant-1.6.1\lib.)
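If you would rather not copy jing.jar into Ant’s lib directory, taskdef also accepts a classpath attribute. The following is a sketch only: the lib/jing.jar location is an assumption, so adjust it to wherever you keep the JAR.

```xml
<?xml version="1.0"?>
<project default="rng">

 <!-- classpath tells Ant where to find JingTask; lib/jing.jar is a
      hypothetical location relative to the build file -->
 <taskdef name="jing"
          classname="com.thaiopensource.relaxng.util.JingTask"
          classpath="lib/jing.jar"/>

 <target name="rng">
  <echo message="Validating with RELAX NG schema using Jing..."/>
  <jing rngfile="time.rng" file="time.xml"/>
 </target>

</project>
```

This keeps the dependency alongside the build file, so the build does not rely on how a particular Ant installation has been set up.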
The echo task echoes the text in the message attribute. Jing is silent upon success, so you can throw in an echo task to send a message of some sort, as shown in Example 7-6. The jing task’s rngfile attribute identifies a RELAX NG schema, and the file attribute names the instance of the schema. You can also use a fileset type as a child of jing, allowing you to validate more than one document at a time.
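As a rough sketch of the fileset approach, the following build file validates every document matching a pattern against time.rng. The time*.xml pattern is my assumption about how the instance files are named.

```xml
<?xml version="1.0"?>
<project default="rng">

 <taskdef name="jing" classname="com.thaiopensource.relaxng.util.JingTask"/>

 <target name="rng">
  <echo message="Validating several documents with Jing..."/>
  <jing rngfile="time.rng">
   <!-- Hypothetical pattern; matches time.xml, time2.xml, and so on -->
   <fileset dir="." includes="time*.xml"/>
  </jing>
 </target>

</project>
```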
Run this build file with this command:
ant -f build-jing.xml
and you will get a result like this:
Buildfile: build-jing.xml

rng:
     [echo] Validating with RELAX NG schema using Jing...

BUILD SUCCESSFUL
Total time: 1 second
Jing can also validate against schemas in the compact syntax, RELAX NG’s terse, non-XML format. The compact schema time.rnc is shown in Example 7-7.
Example 7-7. time.rnc
element time {
 attribute timezone { text },
 element hour { text },
 element minute { text },
 element second { text },
 element meridiem { text },
 element atomic { attribute signal { text } }
}
The build file build-rnc.xml (Example 7-8) validates time.xml against time.rnc. Note the addition of the compactsyntax attribute to the jing task element.
Example 7-8. build-rnc.xml
<?xml version="1.0"?>

<project default="rng">

 <taskdef name="jing" classname="com.thaiopensource.relaxng.util.JingTask"/>

 <target name="rng">
  <echo message="Validating with RELAX NG compact syntax schema using Jing..."/>
  <jing compactsyntax="true" rngfile="time.rnc" file="time.xml"/>
 </target>

</project>
Give the command:
ant -f build-rnc.xml
and you will get this report:
Buildfile: build-rnc.xml

rng:
     [echo] Validating with RELAX NG compact syntax schema using Jing...

BUILD SUCCESSFUL
Total time: 1 second
This example places previously discussed tasks together into a single build file and adds a few other targets as well. The resulting file, build-all.xml, is an example of a simple XML pipeline. The basic scenario is that a property is set (holding the current directory) using a local XML document (properties.xml), and a remote ZIP file (time.zip) is downloaded via the get task. The ZIP archive contains four files: two RELAX NG schemas (time1.rng and time1.rnc), the DTD time1.dtd, and an XML instance time2.xml. This archive is unzipped and time2.xml is validated against time1.rng, time1.rnc, and time1.dtd. Then, time2.xml is transformed into a text document with XSLT (clock.txt). Granted, more complex operations are possible, but this gives you an idea of how you can put a pipeline together.
The build file is shown in Example 7-9.
Example 7-9. build-all.xml
<?xml version="1.0"?>

<project default="xform">

 <taskdef name="jing" classname="com.thaiopensource.relaxng.util.JingTask"/>

 <target name="init">
  <echo message="Load XML properties..."/>
  <xmlproperty file="properties.xml"/>
  <property name="MailLogger.from" value="schlomo@example.com"/>
  <property name="MailLogger.success.to" value="harvey@example.com"/>
  <property name="MailLogger.failure.to" value="joe@example.com"/>
  <property name="MailLogger.mailhost" value="mail.example.com"/>
 </target>

 <target name="get" depends="init">
  <get src="http://www.wyeast.net/time.zip" dest="time.zip"/>
 </target>

 <target name="unzip" depends="get">
  <unzip src="time.zip" dest="${build.dir}"/>
 </target>

 <target name="rng" depends="unzip">
  <echo message="Jing validating (XML)..."/>
  <jing rngfile="time1.rng" file="time2.xml"/>
 </target>

 <target name="rnc" depends="rng">
  <echo message="Jing validating (compact)..."/>
  <jing compactsyntax="yes" rngfile="time1.rnc" file="time2.xml"/>
 </target>

 <target name="val" depends="rnc">
  <xmlvalidate file="time2.xml" failonerror="no">
   <dtd publicId="-//Wy'east Communications//Time DTD//EN"
        location="file:///C:/Hacks/examples/time1.dtd"/>
  </xmlvalidate>
 </target>

 <target name="xform" depends="val">
  <echo message="Transforming time2.xml by clock1.xsl..."/>
  <xslt in="time2.xml" out="clock.txt" style="clock1.xsl">
   <outputproperty name="method" value="text"/>
   <outputproperty name="encoding" value="US-ASCII"/>
  </xslt>
 </target>

</project>
To run the pipeline, simply type:
ant -f build-all.xml
The output will look like this, provided you have a live Internet connection:
Buildfile: build-all.xml

init:
     [echo] Load XML properties...

get:
      [get] Getting: http://www.wyeast.net/time.zip

unzip:
    [unzip] Expanding: C:\Hacks\examples\time.zip into C:\Hacks\examples

rng:
     [echo] Jing validating (XML)...

rnc:
     [echo] Jing validating (compact)...

val:
[xmlvalidate] 1 file(s) have been successfully validated.

xform:
     [echo] Transforming time2.xml by clock1.xsl...

BUILD SUCCESSFUL
Total time: 3 seconds
Each of the targets, except the one named init, has a depends attribute. The value of this attribute establishes a hierarchy of dependencies between the targets. The default or starting target is xform (identified in the project element); in order for this target to execute, the val target must first execute successfully, and in order for val to execute, rnc must execute, and so forth. So this dependency is not established structurally, as through a parent-child relationship, but rather through attribute values. You can put the targets in any order in the build file. They will still execute according to the order of the values in the depends and name attributes.
These dependencies make up the segments of the pipeline.
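A stripped-down sketch makes the point about ordering. The target names a, b, and c and their messages are made up for illustration; the targets are deliberately listed out of order.

```xml
<?xml version="1.0"?>
<project default="c">

 <!-- Listed first, but runs last: c depends on b -->
 <target name="c" depends="b">
  <echo message="third"/>
 </target>

 <!-- No depends attribute, so this is where execution starts -->
 <target name="a">
  <echo message="first"/>
 </target>

 <!-- b depends on a, so a runs before b -->
 <target name="b" depends="a">
  <echo message="second"/>
 </target>

</project>
```

Running ant against this file echoes first, second, and third in that order, regardless of where each target sits in the file.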
The build file uses the xslt task (in the xform target) to transform time2.xml into clock.txt according to the XSLT stylesheet clock1.xsl. The outputproperty children contribute attributes and values that would normally be supplied by the output element of XSLT.
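For comparison, here is how the same settings look when they are supplied inside a stylesheet by the output element. This is a hypothetical stand-in, not the clock1.xsl from the example archive, and it assumes the instance has the structure shown in Example 7-1.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <!-- Equivalent of the two outputproperty children in the build file -->
 <xsl:output method="text" encoding="US-ASCII"/>

 <!-- Writes the time as hour:minute:second followed by the meridiem -->
 <xsl:template match="time">
  <xsl:value-of select="hour"/>
  <xsl:text>:</xsl:text>
  <xsl:value-of select="minute"/>
  <xsl:text>:</xsl:text>
  <xsl:value-of select="second"/>
  <xsl:text> </xsl:text>
  <xsl:value-of select="meridiem"/>
  <xsl:text>&#10;</xsl:text>
 </xsl:template>

</xsl:stylesheet>
```

Setting the values from the build file instead lets you change the output method or encoding for a given run without editing the stylesheet.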
The xmlvalidate task uses a dtd child to specify a formal public identifier for the DTD and the location of a local copy of that DTD.
The get task retrieves a file from a URL, downloading it to a specified location. The xmlproperty task reads the file properties.xml:
<?xml version="1.0"?>

<build>
 <dir>.</dir>
</build>
The arbitrary tags in the properties file determine the name or names for the variable that you can use elsewhere in the build file to reference values, such as ${build.dir}. The first part of the variable name comes from the build tag and the second part from dir. The content of dir becomes the value of the variable.
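The same naming rule applies to any additional elements you add. As a sketch, suppose properties.xml gained a second child named out (a name I have invented here); the build file below would then see both build.dir and build.out.

```xml
<?xml version="1.0"?>
<project default="show">

 <!-- Assumes properties.xml has been extended to contain:
      <build> <dir>.</dir> <out>clock.txt</out> </build>
      The out element is hypothetical and not in the original file. -->
 <target name="show">
  <xmlproperty file="properties.xml"/>
  <!-- Each property name is the path of element names joined by dots -->
  <echo message="dir=${build.dir} out=${build.out}"/>
 </target>

</project>
```

With that file in place, the echo task reports dir=. out=clock.txt.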
The property elements in the init target list some properties for the Ant MailLogger (http://ant.apache.org/manual/listeners.html), which will send an email containing the Ant build information from schlomo to harvey (on success) or joe (on failure) at example.com, using the mailhost mail.example.com. These are, of course, dummy values. Use email addresses and a mail server that will work for you when running this example.
To get the MailLogger to work, use the -logger switch:
ant -logger org.apache.tools.ant.listener.MailLogger -f build-all.xml
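The MailLogger reads a few other properties as well, including separate subject lines for success and failure. The sketch below shows how they might be added to the init target; the subject text is my own placeholder.

```xml
<?xml version="1.0"?>
<project default="init">

 <target name="init">
  <!-- Same dummy addressing as in Example 7-9 -->
  <property name="MailLogger.mailhost" value="mail.example.com"/>
  <property name="MailLogger.from" value="schlomo@example.com"/>
  <property name="MailLogger.success.to" value="harvey@example.com"/>
  <property name="MailLogger.failure.to" value="joe@example.com"/>
  <!-- Placeholder subject lines; choose wording that suits your project -->
  <property name="MailLogger.success.subject" value="build-all.xml succeeded"/>
  <property name="MailLogger.failure.subject" value="build-all.xml failed"/>
 </target>

</project>
```

Distinct subjects make it easier to filter build mail, which helps once a pipeline like this runs unattended.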
Ant’s online manual: http://ant.apache.org/manual/index.html
Ant Wiki: http://wiki.apache.org/ant/FrontPage
“XML Pipelining with Ant,” by Michael Fitzgerald. XML.com, January 28, 2003: http://www.xml.com/pub/a/2003/01/29/ant.html
“Running Multiple XSLT Engines with Ant,” by Anthony Coates. XML.com, December 11, 2002: http://www.xml.com/pub/a/2002/12/11/ant-xml.html
Ant: The Definitive Guide, by Jesse E. Tilly and Eric M. Burke (O’Reilly)
Java Development with Ant, by Eric Hatcher and Steve Loughran (Manning)