Chapter 9. Web Publishing Frameworks

This chapter begins our look at specific Java and XML topics. So far, we have covered the basics of using XML from Java, looking at the SAX and DOM APIs to manipulate XML and the fundamentals of using and creating XML itself. We’ve also looked at how JDOM can provide a more Java-centric means of using our XML data and documents within Java programs. Now that you have a grasp on using XML from your code, we will spend time on specific applications. The next six chapters represent the most significant applications of XML, and, in particular, how those applications are implemented in the Java space. While there are literally hundreds and soon to be thousands of important applications of XML, the topics in these chapters are those that continually seem to be in the spotlight, and that have a significant potential to change the way traditional development processes occur.

We begin our look at these hot topics with the one XML application that seems to have generated the largest amount of excitement in the XML and Java communities: the web publishing framework. Although we have continually emphasized that generating presentation from content is perhaps over-hyped when compared to the value of the portable data that XML provides, using XML for presentation styling is still very important. This importance increases when looking at web-based applications.

Over the next five years, virtually every major application will either be completely web-based, or at a minimum have a web frontend. At the same time, users are demanding more functionality, and marketing departments are demanding more flexibility in look and feel. The result has been the rise of the web artist; this new role is different from the webmaster in that little to no Perl, ASP, JavaScript, or other scripting language coding is part of the job description. The web artist’s entire day consists of HTML creation, modification, and development. The rapid changes in business and market strategy can require a complete application or site overhaul as often as once a week, often forcing the web artist to spend days changing hundreds of HTML pages. While Cascading Style Sheets (CSS) have helped, maintaining consistency across these pages still takes a huge amount of time. Even if this less-than-ideal situation were acceptable, no developer wants to spend his or her life making HTML changes to web pages.

With the advent of server-side Java, this problem has only grown. Servlet developers find themselves spending long hours modifying their out.println() statements to output HTML, and often glance hatefully at the marketing department when changes to a site’s look require modifications to their code. The entire JavaServer Pages (JSP) specification arguably stemmed from this situation; however, JSP is not a solution, as it only shifts the frustration to the HTML developer, who constantly has to avoid making incidental changes to embedded Java code. In addition, JSP does not provide the clean separation between content and presentation it promises. What was called for was a means to generate pure data content, and have that content uniformly styled either at predetermined times (static content generation), or dynamically at runtime (dynamic content generation).

Of course, you should be nodding your head at this familiar problem if you have ever done any web development, and hopefully your mind is wandering into the XSL and XSLT technology space. The problem is that an engine must exist to handle content generation, particularly in the dynamic sense. Having hundreds of XML documents on a site does no good if there is no mechanism to apply transformations to them when they are requested. Add to this the need for servlets and other server-side components to output XML that should be consistently styled, and you have defined a small set of requirements for the web publishing framework. In this chapter, we take a look at this framework, how it can allow you to toss out those long hours of HTML coding, and how it can help you convert all of those “web artists” into XML and XSL gurus, making you happy, them happy, and allowing your applications to change look and feel as often as you want.
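To make the core operation concrete, here is a minimal sketch of applying an XSLT stylesheet to an XML document through the JAXP transformation API (javax.xml.transform). It is only an illustration of the transformation step a publishing engine would have to perform on request, not how any particular framework does it; the class name TransformSketch, the method name, and the sample document and stylesheet are all made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration class; not part of any framework.
public class TransformSketch {

    // Applies the stylesheet in xsl to the document in xml and returns the result as text.
    public static String transform(String xml, String xsl) throws TransformerException {
        // The factory locates whichever XSLT processor is configured for the JVM.
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xsl)));

        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Pure data content, with no presentation markup (sample data).
        String xml = "<page><title>Hello</title></page>";

        // Presentation rules kept separately from the content (sample stylesheet).
        String xsl =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"html\"/>"
          + "<xsl:template match=\"/page\">"
          + "<html><body><h1><xsl:value-of select=\"title\"/></h1></body></html>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

        System.out.println(transform(xml, xsl));
    }
}
```

Changing the look of every page styled this way means editing the stylesheet, not the Java code and not the XML content.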

A web publishing framework attempts to address these complicated issues. Just as a web server is responsible for responding to a URL request for a file, a web publishing framework is responsible for responding to a similar request; however, instead of responding with a file, it often will respond with a published version of a file. In this case, a published file refers to a file that may have been transformed with XSLT, or massaged at an application level, or converted into another format such as a PDF. The requestor does not see the raw data that may underlie the published result, but also does not have to explicitly request that publication occur. Often, a URI base (such as http://yourHost.com/publish) signifies that a publishing engine that sits on top of the web server should handle requests. As you may suspect, the concept is much simpler than the actual implementation of a framework like this, and finding the correct framework for your needs is not a trivial task.
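The request-handling idea can be sketched in a few lines. This is a deliberately simplified, hypothetical model and not the design of any real framework: the class PublishSketch, the /publish/ prefix, the in-memory document map, and the single shared stylesheet are all assumptions made for illustration. A real engine would sit behind a servlet or web server and read documents and stylesheets from disk or other sources.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical, simplified model of a publishing engine.
public class PublishSketch {

    // Assumed URI base that marks requests the engine should handle.
    static final String PUBLISH_BASE = "/publish/";

    private final Map<String, String> documents = new HashMap<>();
    private final String stylesheet;

    public PublishSketch(String stylesheet) {
        this.stylesheet = stylesheet;
    }

    // Registers raw XML content under a document name (stands in for files on disk).
    public void addDocument(String name, String xml) {
        documents.put(name, xml);
    }

    // Returns the published form of the requested document, or null if the
    // request is not for the engine or the document is unknown.
    public String handle(String path) throws TransformerException {
        if (!path.startsWith(PUBLISH_BASE)) {
            return null; // outside the publishing base: left to the web server
        }
        String xml = documents.get(path.substring(PUBLISH_BASE.length()));
        if (xml == null) {
            return null;
        }
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(stylesheet)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString(); // the requestor sees only the published result
    }
}
```

The caller asks for /publish/home.xml and receives styled output; it never requests the transformation explicitly and never sees the underlying XML.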

Selecting a Framework

If you’re getting an idea of the importance of the web publishing framework, you might expect to find a list of hundreds of possible solutions. This is because the Java language offers an easy interface into the various XML tools used by web publishing frameworks. Additionally, Java servlets offer a simple means of handling web requests and responses. However, the list of frameworks is small, and the list of good and stable ones is even smaller. One of the best resources for seeing what products are currently available is XML Software’s list at http://xmlsoftware.com/publishing/. This list changes frequently enough that it is not worth repeating here. Still, some important criteria for determining what framework is right for you are worth mentioning.

Stability

Don’t be surprised if you have a hard time finding a product whose version tag is greater than 2.x. In fact, you may have to search diligently to even find a second-generation framework. While a higher version number is not a guarantee of stability, it often reflects the amount of time, effort, and review that a framework has undergone. The XML publishing system is such a new beast that the market is being flooded with 1.0 and 1.1 products that simply are not stable enough for practical use.

You can also often ascertain stability of a product by the stability of other products from the same vendor. Often, an entire suite of tools is released by a vendor; if their other tools do not offer SAX 2.0 and DOM Level 2 support, or are all also 1.0 and 1.1 products, you might be wise to pass on the framework until it has matured a little more, and has conformed to newer XML standards. You should also try to steer away from platform-specific technologies — if the framework is tied to a platform (such as Windows), you aren’t dealing with a pure Java solution. Remember that a publishing framework must serve clients on any platform; why be happy with a product that can’t also run on any platform?

Integration with Other XML Tools and APIs

Once you have ensured that your framework is stable enough for your needs, you should make sure that it has support for a variety of XML parsers and processors. If a framework is tied to a specific parser or processor, you are really just buying an XML version of Microsoft — you have tied yourself to one specific implementation of a technology. Although frameworks often integrate well with a particular parser vendor, determine if parsers can be interchanged. If you have a favorite processor (or one left to you from previous projects), make sure that processor can still be used.
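One way to see what interchangeable parsers look like in practice is code that never names a vendor class. The sketch below, with a made-up class name ParserSwapSketch and sample input, obtains a SAX parser through the JAXP factory; the implementation actually used is whatever the JVM is configured with (for example, through the javax.xml.parsers.SAXParserFactory system property), so swapping parsers requires no code changes. It is an illustration of the principle, not a test of any specific framework.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; no vendor-specific parser class is referenced.
public class ParserSwapSketch {

    // Counts the elements in a document using whichever SAX parser JAXP locates.
    public static int countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        SAXParser parser = factory.newSAXParser();

        final int[] count = {0};
        parser.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    count[0]++; // one callback per element start tag
                }
            });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Sample document: one root element with two children.
        System.out.println(countElements("<site><page/><page/></site>"));
    }
}
```

A framework that hardcodes a particular parser class in place of the factory call is the kind of lock-in to watch for.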

Support for SAX and DOM is a must. Also, try to find a framework whose developers are monitoring the specifications of XML Schema, XLink, XPointer, and other emerging XML technologies. This will indicate if you can expect to see revisions of the framework add support for these XML specifications, an important indication of the framework’s longevity. Don’t be afraid to ask questions about how quickly new specifications can be expected to be integrated into the product, and insist on a firm answer.

Production Presence

The last, and perhaps most important, question to answer when looking for a web publishing framework is whether it is used in production applications. If you cannot be supplied with at least a few reference applications or sites that are using the framework, don’t be surprised if there aren’t any. Vendors (and developers, in the open source realm) should be happy and proud to let you know where you can check out their frameworks in action. Hesitance in this area is a sign that you may be more of a pioneer with a product than you wish to be.

Making the Decision

Once you have evaluated these criteria, you will probably have a clear answer. Very few frameworks can positively answer all the questions raised here, not to mention your application-specific concerns. In fact, at the time of this writing, fewer than five publishing frameworks exist that support the latest versions of SAX, DOM, and JAXP, are in production at even one application site, and have at least three significant revisions of code under their belt. These are not listed here because, honestly, in six months they may not exist, or may be radically changed. The world of web publishing frameworks is in such flux that recommending four or five options, and assuring you they will still exist months from now, has a greater chance of misleading you than helping you.

However, one publishing framework has consistently succeeded and received notice within the Java and XML community; when considering the open source community in particular, this framework is often the choice of Java developers. The Apache Cocoon project, founded by Stefano Mazzocchi, has been a solid framework since its inception. Developed while most of us were still trying to figure out what XML was, Cocoon is now entering its second generation as an XML publishing framework based completely in Java. It is also part of the Apache XML project, and has default support for Apache Xerces and Apache Xalan. It allows any conformant XML parser to be used, and is based on the immensely popular Java servlet architecture. In addition, there are several production sites using Apache Cocoon (in its 1.x form) that push the boundaries of traditional web application development yet still perform extremely well. For these reasons, and again in keeping with the spirit of open source software, we use Apache Cocoon as the framework of choice in this chapter.

In previous chapters, our choice of XML parser and processor was fairly open; in other words, examples would work with only small modifications to code when using different vendor implementations. However, the web publishing framework is not standardized, and each framework implements wildly different features and conventions. For this reason, the examples in this chapter using Apache Cocoon are not portable; however, the popularity of the concepts and design patterns used within Cocoon merits an entire chapter on using the framework. If you do not choose Cocoon, you should at least look over the examples, as the concepts in web publishing carry over to any vendor implementation, even though the specifics of the code do not.
