Chapter 7. Traversing XML

In the last chapter, we learned how to create stylesheets for our XML documents, beginning our section on XSL. In this chapter, we complete that discussion by taking a detailed look at how our document and stylesheet are processed and transformed into output. As in our previous pairs of chapters, this chapter gives you the Java application of the XML language structures we just learned about. We will look at Java XSLT processors, Java APIs for handling XML input in tree formats, and how these APIs differ from the SAX APIs we have already examined.

To begin this chapter, we take a look at how to make the transformations dangled in front of you throughout the last chapter actually occur on your own local machine. This should give you a “virtual playground” where you can experiment with the various XSL and XSLT constructs on your own, and add more complex formatting to the stylesheet we created last chapter. It will also begin our closer look into how an XSLT processor works. We then complement our view of a processor’s output with a detailed look at the type of input it expects, and the format of this input. This leads us into a first look at the Document Object Model (DOM), an alternative to using SAX for getting to XML data. Finally, we will begin to move back a step from parsers, processors, and APIs, and look at how to put an XML application together. This will set the tone for the rest of the book, as we take a more topical approach to various types of XML applications and how to take advantage of proven design patterns and XML frameworks.

Before going on, you should understand not only the focus of the chapter, but also what it does not focus on. This chapter will not teach you how to write an XSLT processor, any more than previous chapters taught you to write an XML parser. Certainly the concepts here are very important, in fact critical, to using an XSLT processor, and are a great starting point for getting involved with existing efforts to enhance XSLT processors, such as the Apache Group’s Xalan processor. However, parsers and processors are extremely complex programs, and to try to explain the inner workings of them within these pages would consume the rest of this book and possibly another! Instead, we continue to take the approach of an application developer or Java architect; we use the excellent tools that are available, and enhance them when needed. In other words, you have to start somewhere, and for a Java developer, using a processor should precede trying to code one.

Getting the Output

If you followed along with our examples in the last chapter, you should be ready to put your stylesheet and XML document through a processor and see the output for yourself. This is a fairly straightforward process with most XSLT processors. Continuing in our vein of using open source, best-of-breed products, we will use the Apache Xalan XSLT processor; information and downloads are available at http://xml.apache.org. In addition to receiving contributions from Lotus, IBM, Sun, Oracle, and some of the best open source minds in the business, Xalan fits in very well with Apache Xerces, the parser we looked at in earlier chapters. If you already have another processor, you should easily be able to find the programs and instructions needed to run the examples in this chapter; your output should also be identical or very close to the example output we look at here.

The first use of an XSLT processor we will investigate is invoking it from a command line. This is often done for debugging, testing, and offline development of content. Consider that many high-performance web sites generate their content offline, often nightly or weekly, to reduce the load and performance constraints of dynamically transforming XML into HTML or other markup languages when a user requests a page. We can also use this as a starting point for peeling back the layers of an XML transformation. Consult your processor’s documentation for how to use XSLT from the command line. For Apache Xalan, the command used to perform this task is:

D:\prod\JavaXML> java org.apache.xalan.xslt.Process
                       -IN [XML Document] 
                       -XSL [XSL Stylesheet]
                       -OUT [Output Filename]

Xalan, like any processor you choose, can take in many other command-line options, but these three are the primary ones we want to use. Xalan also uses the Xerces parser by default, so you will need to have both the parser and processor classes in your class path to run Xalan from the command line. You can specify a different XML parser implementation through the command line if you wish, although the support for Xerces is more advanced than for other parsers. You also do not need to reference a stylesheet in your XML document if generating a transformation this way; the XSLT processor will apply the stylesheet you specify on the command line to the XML document. We will use our XML document’s internal stylesheet declarations in Chapter 9. So taking the names of our XML document and XSL stylesheet (in this case in a subdirectory), we can determine the syntax needed to run the processor. Since we are transforming our XML into HTML, we specify contents.html as the output for the transformation:

D:\prod\JavaXML> java org.apache.xalan.xslt.Process
                       -IN contents.xml
                       -XSL XSL/JavaXML.html.xsl
                       -OUT contents.html

Running this command from the appropriate directory should cause Xalan to begin the transformation process, giving you output similar to that shown in Example 7-1.

Example 7-1. Transforming XML with Apache Xalan

D:\prod\JavaXML>java org.apache.xalan.xslt.Process
                     -IN contents.xml 
                     -XSL XSL/JavaXML.html.xsl 
                     -OUT contents.html
========= Parsing file:D:/prod/JavaXML/XSL/JavaXML.html.xsl ==========

Parse of file:D:/prod/JavaXML/XSL/JavaXML.html.xsl took 1161 milliseconds
========= Parsing contents.xml ==========
Parse of contents.xml took 311 milliseconds
=============================
Transforming...
transform took 300 milliseconds
XSLProcessor: done

Once this is complete, you should be able to open the generated file, contents.html, in an editor or web browser. If you followed along with all the examples in the last chapter, your HTML document should look similar to Figure 7-1 (remember our preview of this HTML from the last chapter?).

Figure 7-1. HTML from XML transformation

As simple as that, you have a means to make changes and test the resultant output from XML and XSL stylesheets! The Xalan processor, when run from the command line, also has the helpful feature of identifying errors that may occur in your XML or XSL and the line numbers on which those errors are encountered in the source documents, aiding even further in testing and debugging.
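
The same kind of transformation can also be driven from Java code instead of the command line. What follows is only a minimal sketch, and it rests on some assumptions: it uses the standard JAXP transformation API (javax.xml.transform) that ships with current JDKs, not the org.apache.xalan.xslt.Process class invoked above. The class name TransformSketch, its method names, and the tiny inline document and stylesheet are hypothetical stand-ins for contents.xml and JavaXML.html.xsl.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; not part of Xalan or of the book's examples.
public class TransformSketch {

    // Compiles the stylesheet and applies it to the XML input,
    // writing whatever the stylesheet produces to the given result.
    public static void transform(Source xml, Source xsl, Result result)
            throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer(xsl);
        transformer.transform(xml, result);
    }

    // Convenience wrapper for in-memory documents, handy for experiments.
    public static String transformToString(String xml, String xsl)
            throws TransformerException {
        StringWriter out = new StringWriter();
        transform(new StreamSource(new StringReader(xml)),
                  new StreamSource(new StringReader(xsl)),
                  new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        if (args.length == 3) {
            // Mirrors the -IN, -XSL, and -OUT arguments, in that order.
            transform(new StreamSource(new File(args[0])),
                      new StreamSource(new File(args[1])),
                      new StreamResult(new File(args[2])));
            return;
        }
        // No arguments: run a small, made-up sample instead.
        String xml = "<book><title>Java and XML</title></book>";
        String xsl =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"html\"/>"
          + "<xsl:template match=\"/book\">"
          + "<html><body><h1><xsl:value-of select=\"title\"/></h1></body></html>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        System.out.println(transformToString(xml, xsl));
    }
}
```

Under these assumptions, running the class with three arguments (for example contents.xml, XSL/JavaXML.html.xsl, and contents.html) plays the same role as the command-line invocation, while running it with no arguments prints the result of the made-up sample. Which processor actually performs the work depends on the JAXP implementation found on your class path.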
