So far, we have built our applications on the assumption that the
application clients would always pull data and content. In other
words, a user had to type a URL into their browser (in the case of the
mytechbooks.com new book listings), or an application like the
mytechbooks.com servlet had to make an HTTP request for XML data (in
the case of the Foobar Public Library). While this is not a problem,
it is not always the best way for a company like mytechbooks.com to
sell books. Clients pulling data have to remember to visit sites they
would buy items from, and often don’t revisit those sites for
days, weeks, or even months. While those clients may often purchase
goods and services when they do remember, on average, those purchases
do not result in as much revenue as if small purchases were made more
frequently.
Realizing this trend, mytechbooks.com wants to be able to push data
to its clients. Pushing data involves
letting the client know (without any client action) that new items
are available, or that specials are being run. This in turn allows
the client to make more frequent purchases without having to remember
to visit a web page. However, pushing data to clients is difficult in
a web medium, as the Internet does not behave as a thick client: it
is harder to send pop-up messages or generate alerts for users. What
mytechbooks.com has discovered, though, is the popularity of
personalized “start pages” like Netscape’s My
Netscape and Yahoo’s My Yahoo pages. In talking with Netscape,
mytechbooks.com has been hearing about a technology called Rich Site
Summary (RSS), and thinks it may be the answer to their need to push
data out to clients.
Rich Site Summary (RSS) is a particular flavor of XML. It has its own
DTD, and defines what is called a channel. A channel is a way to
represent data about a specific subject, and provides for a title and
description for the channel, an image or logo, and then several items
within the channel. Each item, then, is something of particular
interest about the channel, or a product or service available.
Because the allowed elements of an item are fairly generic (title,
description, hyperlink), almost anything can be represented as an
item of a channel. An RSS channel is not intended to provide a
complete site’s content, but rather a short blurb about a
company or service, suitable for display in a portal-style framework,
or as a sidebar on a web site. In fact, the different
“widgets” at Netscape’s Netcenter are all RSS
channels, and Netscape allows the creation of new RSS channels that
can be registered with Netcenter. Netscape also has a built-in system
for displaying RSS channels in an HTML format, which of course fits
into their Netcenter start pages.
At this point, you may be a little concerned that RSS is to Netscape as Microsoft’s XML parser is to Microsoft: almost completely useless when used with other tools or vendors. Although originally developed by Netscape specifically for Netcenter, the XML structure of RSS has made it usable by any application that can read a DTD. In fact, many portal-style web sites and applications are beginning to use RSS, such as the Apache Jetspeed project (http://java.apache.org/jetspeed), an open source Enterprise Information Portal system. Jetspeed takes the same RSS format that Netscape uses, and renders it in a completely different manner. Because of the concise grammar of RSS, this is easily done.
As many users have start pages, or home pages, or similar places on the Web that they frequent, mytechbooks.com would like to create an RSS channel that provides new book listings, and then allows interested clients to jump straight to buying an item that catches their eye. This is an effective means to push data, as products like Netcenter will automatically update RSS channel content as often as the user desires.
The first thing we need to do to use RSS is create an RSS file. This is almost too simple to be believed: other than referencing the correct DTD and following that DTD, there is nothing at all complicated about creating an RSS document. Example 13.6 shows a sample RSS file that mytechbooks.com has modeled.
Example 13-6. Sample RSS Channel Document for mytechbooks.com
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
              "http://my.netscape.com/publish/formats/rss-0.91.dtd">

<rss version="0.91">
  <channel>
    <title>mytechbooks.com New Listings</title>
    <link>http://www.newInstance.com/javaxml/techbooks</link>
    <description>
      Your online source for technical material, computers, and
      computing books!
    </description>
    <language>en-us</language>

    <image>
      <title>mytechbooks.com</title>
      <url>
        http://newInstance.com/javaxml/techbooks/images/techbooksLogo.gif
      </url>
      <link>http://newInstance.com/javaxml/techbooks</link>
      <width>140</width>
      <height>23</height>
      <description>
        Your source on the Web for technical books.
      </description>
    </image>

    <item>
      <title>Java Servlet Programming</title>
      <link>
        http://newInstance.com/javaxml/techbooks/buy.xsp?isbn=156592391X
      </link>
      <description>
        This book is a superb introduction to Java servlets and their
        various communications mechanisms.
      </description>
    </item>
  </channel>
</rss>
The root element must be rss, and the version attribute must be
defined; additionally, this attribute’s value must match up with the
version of the DTD referenced. Within the root element, one single
channel element must appear. This has elements that describe the
channel (title, link, description, and language), an optional image
that can be associated with the channel (as well as information about
that image), and then as many as fifteen item elements, each detailing
one item related to the channel. Each item has a title, link, and
description element, all of which are self-explanatory.
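The structural rules just described can be spot-checked in code before a channel is published. The following is only a minimal sketch that uses the DOM parser bundled with the JDK; the class name RssStructureCheck and its messages are hypothetical, it is not part of the mytechbooks.com application, and it covers only the rules named in the preceding paragraph rather than the full DTD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper; not part of the mytechbooks.com code. */
final class RssStructureCheck {

    /** Returns the rule violations found; an empty list means none. */
    static List<String> check(String rssXml) {
        Document doc;
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Resolve any external DTD to an empty stream so that this
            // structural check needs no network access.
            builder.setEntityResolver((publicId, systemId) ->
                new InputSource(new StringReader("")));
            doc = builder.parse(new InputSource(new StringReader(rssXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }

        List<String> problems = new ArrayList<>();
        Element root = doc.getDocumentElement();
        if (!"rss".equals(root.getTagName())) {
            problems.add("root element must be rss");
        }
        if (root.getAttribute("version").isEmpty()) {
            problems.add("rss element needs a version attribute");
        }

        List<Element> channels = children(root, "channel");
        if (channels.size() != 1) {
            problems.add("exactly one channel element is required");
            return problems;
        }
        Element channel = channels.get(0);
        for (String name : new String[] {"title", "link", "description", "language"}) {
            if (children(channel, name).isEmpty()) {
                problems.add("channel is missing " + name);
            }
        }

        List<Element> items = children(channel, "item");
        if (items.size() > 15) {
            problems.add("channel has more than fifteen items");
        }
        for (Element item : items) {
            for (String name : new String[] {"title", "link", "description"}) {
                if (children(item, name).isEmpty()) {
                    problems.add("item is missing " + name);
                }
            }
        }
        return problems;
    }

    /** Collects the direct child elements of parent with the given name. */
    private static List<Element> children(Element parent, String name) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                result.add((Element) n);
            }
        }
        return result;
    }
}
```

Calling RssStructureCheck.check( ) with the text of a channel document returns an empty list when none of these rules is broken. The check does not compare the version attribute against the DTD version, which would still need to be verified by hand or by a validating parser.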
As in previous examples, actual RSS channel documents should avoid
having whitespace within the link and url elements, but rather have
all information on a single line. Again, the formatting in the example
does not reflect this due to printing and sizing constraints.
An optional text box and button to submit the information in the book can be added as well, although these are not included in the example. For complete details on the allowed elements and attributes, visit http://my.netscape.com/publish/help/mnn20/quickstart.html online.
It is simple enough to create RSS files programmatically; the
procedure is similar to how we generated the HTML for the
mytechbooks.com web site. Half of the RSS file (the information about
the channel as well as the image information) is static content; only
the item elements must be generated dynamically.
However, just as you were getting ready to open up vi and start
creating another XSL stylesheet, another requirement was dropped into
your lap: the machine that will house the RSS channel is a different
server than that used in our last example, and only has very outdated
versions of the Apache Xalan libraries available. Because of some of
the high-availability applications that also run on that machine,
such as the billing system, mytechbooks.com does not want to update
those libraries until change control can be stepped through, a
week-long process. However, they do have newer versions of the Xerces
libraries available (as XML parsing is used in the billing system),
so Java APIs for handling XML are available.[19] While SAX and DOM are both viable alternatives, JDOM
again would seem to be the simplest way to convert the XML from the
Foobar Public Library into an RSS channel format. Example 13.7 does just this.
Example 13-7. Java Servlet to Convert New Book Listings into an RSS Channel Document
package com.techbooks;

import java.io.InputStream;
import java.io.IOException;
import java.io.PrintWriter;
import java.net.URL;
import java.util.Iterator;
import java.util.List;
import javax.servlet.*;
import javax.servlet.http.*;

// JDOM
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.Builder;
import org.jdom.input.SAXBuilder;

public class GetRSSChannelServlet extends HttpServlet {

    /** Host to connect to for books list */
    private static final String hostname = "newInstance.com";
    /** Port number to connect to for books list */
    private static final int portNumber = 80;
    /** File to request (URI path) for books list */
    private static final String file = "/cgi/supplyBooks.pl";

    public void service(HttpServletRequest req, HttpServletResponse res)
        throws ServletException, IOException {

        res.setContentType("text/plain");
        PrintWriter out = res.getWriter( );

        // Connect and get XML listing of books
        URL getBooksURL = new URL("http", hostname, portNumber, file);
        InputStream in = getBooksURL.openStream( );

        try {
            // Request SAX Implementation and use default parser
            Builder builder = new SAXBuilder( );

            // Create the document
            Document doc = builder.build(in);

            // Output XML
            out.println(generateRSSContent(doc));
        } catch (JDOMException e) {
            out.println("Error: " + e.getMessage( ));
        } finally {
            out.close( );
        }
    }

    /**
     * <p>
     * This will generate an RSS XML document using the supplied
     *   JDOM <code>Document</code>.
     * </p>
     *
     * @param doc <code>Document</code> to use for input.
     * @return <code>String</code> - RSS file to output.
     * @throws <code>JDOMException</code> when errors occur.
     */
    private String generateRSSContent(Document doc) throws JDOMException {
        StringBuffer rss = new StringBuffer( );

        // Static content: the channel and image information
        rss.append("<?xml version=\"1.0\"?>\n")
           .append("<!DOCTYPE rss PUBLIC ")
           .append("\"-//Netscape Communications//DTD RSS 0.91//EN\" ")
           .append("\"http://my.netscape.com/publish/formats")
           .append("/rss-0.91.dtd\">\n")
           .append("<rss version=\"0.91\">\n")
           .append(" <channel>\n")
           .append("  <title>Technical Books</title>\n")
           .append("  <link>")
           .append("http://newInstance.com/javaxml/techbooks</link>\n")
           .append("  <description>\n")
           .append("   Your online source for technical materials, ")
           .append("computers, and computing books!\n")
           .append("  </description>\n")
           .append("  <language>en-us</language>\n")
           .append("  <image>\n")
           .append("   <title>mytechbooks.com</title>\n")
           .append("   <url>")
           .append("http://newInstance.com/javaxml/techbooks/")
           .append("images/techbooksLogo.gif")
           .append("</url>\n")
           .append("   <link>")
           .append("http://newInstance.com/javaxml/techbooks</link>\n")
           .append("   <width>140</width>\n")
           .append("   <height>23</height>\n")
           .append("   <description>\n")
           .append("    Your source on the Web for technical books.\n")
           .append("   </description>\n")
           .append("  </image>\n");

        // Add an item for each new title with Computers as subject
        List books = doc.getRootElement( ).getChildren("book");
        for (Iterator i = books.iterator( ); i.hasNext( ); ) {
            Element book = (Element)i.next( );
            if (book.getAttribute("subject")
                    .getValue( )
                    .equals("Computers")) {

                // Output an item
                rss.append("<item>\n")
                   // Add title
                   .append(" <title>")
                   .append(book.getChild("title").getContent( ))
                   .append("</title>\n")
                   // Add link to buy book
                   .append(" <link>")
                   .append("http://newInstance.com/javaxml")
                   .append("/techbooks/buy.xsp?isbn=")
                   .append(book.getChild("saleDetails")
                               .getChild("isbn")
                               .getContent( ))
                   .append("</link>\n")
                   .append(" <description>")
                   // Add description
                   .append(book.getChild("description").getContent( ))
                   .append("</description>\n")
                   .append("</item>\n");
            }
        }

        rss.append(" </channel>\n")
           .append("</rss>");

        return rss.toString( );
    }
}
By this time, nothing in this code should be the least bit surprising
to you; we import the JDOM and I/O classes we need, and access the
Foobar Public Library application as in the ListBooksServlet. The
resulting InputStream is used to create a JDOM Document, with the
default parser (Apache Xerces in JDOM 1.0) and the JDOM implementation
built on SAX doing the work for us:
// Request SAX Implementation and use default parser
Builder builder = new SAXBuilder( );

// Create the document
Document doc = builder.build(in);
We then hand off the JDOM Document to the generateRSSContent( )
method, which prints out all of the static content for the RSS
channel. This method then obtains the book elements within the XML
from the library, and iterates through them, ignoring those without a
subject attribute equal to “Computers”:
// Add an item for each new title with Computers as subject
List books = doc.getRootElement( ).getChildren("book");
for (Iterator i = books.iterator( ); i.hasNext( ); ) {
    Element book = (Element)i.next( );
    if (book.getAttribute("subject")
            .getValue( ).equals("Computers")) {
        // Output as an item element
    }
}
Finally, each element that makes it through the comparison is added
to the RSS channel. Nothing very exciting here, right? Figure 13.5
shows a sample output from accessing this servlet, saved as
GetRSSChannelServlet.java, through a web browser.
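One caveat with building XML by appending strings, as Example 13-7 does, is that element text is copied into the output unchanged. A title or description containing an ampersand or a less-than sign would leave the channel document malformed. The sketch below shows one way the item text could be escaped before it is appended; the RssItemWriter class and its method names are hypothetical and are not part of the servlet, and the buy.xsp URL is the one used in the example.

```java
/** Hypothetical helper; not part of GetRSSChannelServlet. */
final class RssItemWriter {

    /** Base URL for purchase links, taken from Example 13-7. */
    private static final String BUY_URL =
        "http://newInstance.com/javaxml/techbooks/buy.xsp?isbn=";

    /** Escapes the characters that may not appear literally in element text. */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;");  break;
                case '>': sb.append("&gt;");  break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * Builds one item element. Values are trimmed so that the link
     * stays on a single line with no surrounding whitespace.
     */
    static String item(String title, String isbn, String description) {
        return "<item>\n"
             + " <title>" + escape(title.trim()) + "</title>\n"
             + " <link>" + BUY_URL + escape(isbn.trim()) + "</link>\n"
             + " <description>" + escape(description.trim()) + "</description>\n"
             + "</item>\n";
    }
}
```

In the servlet, the three getContent( ) results could be passed to a method like item( ) instead of being appended directly. Quotation marks are left alone because the values end up in element content rather than in attributes.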
With this RSS channel ready for use, mytechbooks.com has made their content available to any service provider that supports RSS! To get the ball rolling on allowing clients to use their channel, mytechbooks.com would like to register their channel with Netscape Netcenter and see it in action (and so would we!).
Once the channel is created, it should be validated. In addition to ensuring that the document meets the constraints laid out by the RSS DTD, there are limitations that Netscape lays out that the DTD cannot enforce (although XML Schema could rectify this in the future). In order to ensure that channels are properly formed and usable, Netscape provides an online validation mechanism, located at http://my.netscape.com/publish/help/validate.tmpl. Visiting this site and entering in the URL to your RSS channel (which can be a servlet, CGI script, or static file) allows the Netscape program to ensure you are generating a usable RSS channel. Figure 13.6 shows the output of a successful validation run.
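The DTD half of this check can also be run locally before submitting a URL to the online validator. What follows is a rough sketch using the validating DOM parser that ships with the JDK; the class name DtdValidationSketch is hypothetical, and SAMPLE_DTD is a deliberately tiny stand-in written for illustration, not the Netscape RSS 0.91 DTD. It is declared as an internal subset so the sketch runs without network access. The Netscape-specific limits that a DTD cannot express are not covered.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

/** Hypothetical helper; not part of the mytechbooks.com code. */
final class DtdValidationSketch {

    /** Simplified stand-in DTD for illustration; NOT the real RSS 0.91 DTD. */
    static final String SAMPLE_DTD =
          "<!DOCTYPE rss [\n"
        + " <!ELEMENT rss (channel)>\n"
        + " <!ATTLIST rss version CDATA #REQUIRED>\n"
        + " <!ELEMENT channel (title, item*)>\n"
        + " <!ELEMENT title (#PCDATA)>\n"
        + " <!ELEMENT item (title)>\n"
        + "]>\n";

    /** Parses with DTD validation on and returns the validity errors found. */
    static List<String> validate(String xml) {
        final List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    // Warnings are ignored in this sketch
                }
                public void error(SAXParseException e) {
                    // Validity errors are recoverable; collect them
                    errors.add(e.getMessage());
                }
                public void fatalError(SAXParseException e) throws SAXParseException {
                    // Well-formedness errors stop the parse
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Document could not be parsed", e);
        }
        return errors;
    }
}
```

A document that carries its own DOCTYPE, such as the one in Example 13-6, could be passed to validate( ) as is, provided the DTD it references can actually be retrieved by the parser.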
Once validation is complete, we are ready to register the RSS channel with Netcenter.
Once the RSS channel has been validated, we need to publish the channel to Netcenter (or whatever other service provider is being used). This can be done by accessing http://my.netscape.com/publish. Walking through the steps, you have to supply a Netcenter account name, as a confirmation email is sent to the address attached to that account. Once the valid RSS channel URL has been accepted, Netcenter adds the channel to its system and sends an email. Figure 13.7 shows this email, which includes instructions on adding links to the RSS channel from a web site (like mytechbooks.com, which we look at next), as well as how to add the channel to a Netcenter start page.
Validating and registering the channel has been a breeze! Additionally, the email that Netscape generates even makes adding the channel to a start page simple. Following the hyperlink provided, it takes two mouse clicks to make this channel visible. Figure 13.8 shows the RSS channel within a Netcenter start page, displayed in the left column, with all of our XML converted into formatted HTML.
Each item is listed with the title and a hyperlink (letting the user buy the selected book with an easy mouse click), as well as the description of the book. Additionally, the mytechbooks.com logo is included with a short description of the channel. Every time a user opens her start page, this channel can inform her of new books available through mytechbooks.com, potentially doubling or tripling the income of the company.
Finally, as a means of advertising the availability of this channel to other customers, we can update the XSL stylesheet we created for mytechbooks.com to include a link that will automatically add the channel to a customer’s own start page. This means that a single pull of data from mytechbooks.com can result in the client having data pushed to them daily! Add in the following HTML to our XSL stylesheet:
<p align="center">
 <font color="#FFFFFF"><b>
  <a href="/javaxml/techbooks/contact.html">Contact Us</a>
 </b></font>
</p><br />
<p align="center">
<A HREF="http://my.netscape.com/addchannel.tmpl?service=net.2209">
<IMG SRC="http://my.netscape.com/publish/images/addchannel.gif"
WIDTH="88" HEIGHT="31" BORDER="0" /></A>
</p>
</td>
<td valign="top" align="left">
 <table border="0" cellpadding="5" cellspacing="5">
  <tr>
   <td width="450" align="left" valign="top">
    <p>
     <b>
      Welcome to <font face="courier">mytechbooks.com</font>,
      your source on the Web for computing and technical books.
      Our newest offerings are listed on the left. To purchase
      any of these fine books, simply click on the
      "Buy this Book!" link, and you will be taken to the
      shopping cart for our store. Enjoy!
     </b>
    </p>
This change (included in the email that Netscape generates and sends to you when registering an RSS channel) will add a button with a Netscape graphic taking the user straight to the web site that adds the custom channel to his start page. The formatted HTML that results from this change is shown in Figure 13.9.
At this point, we have completed our business-to-business case study. We have taken an organization that had one language and no XML capabilities (the Foobar Public Library with their Perl scripts) and allowed that organization to communicate with a company that uses an entirely different technology (Java servlets). The two companies are completely uncoupled, meaning that there is no code in either application that is tied to code in the other application. Because of the standard XML data used as a communication medium, either company can change applications, technologies, and even architectures without affecting the operation of the other. We then looked at how this communication could be used to present HTML content to users (in a totally different fashion for each application), and how to push that content out to customers in yet another HTML format through the use of RSS channels. Underneath all this interaction and communication, XML drove the communication and interoperability of these very different businesses.
[19] Yes, this is a bit of a silly case, and perhaps not so likely to really occur. However, it does afford us the opportunity to look at another alternative for creating XML programmatically. Don’t sneer too much at the absurdity of the example; all of the examples in this book, including the silly ones, stem from actual experiences consulting for real-world companies; laughing at this scenario might mean your next project has the same silly requirements!