There are hundreds of XML dialects out there representing data in a platform-independent, software-independent manner. One of the most popular XML dialects is RSS, a format for sharing headlines and links from online news sites, weblogs, and other sources of information.
RSS makes web content available as XML, a form well suited to reading by software, in web-accessible files called feeds. RSS readers, called news aggregators, have been adopted by several million information junkies to track all their favorite websites. There are also web applications that collect and share RSS items.
The hard-working Builder class in the nu.xom package can load XML over the Internet from any URL:
String rssUrl = "http://feeds.drudge.com/retort";
Builder builder = new Builder();
Document doc = builder.build(rssUrl);
This hour’s workshop employs this technique to read an RSS file, presenting the 15 most recent items.
Open your editor and enter the text from Listing 21.4. Save the result as Aggregator.java.
import java.io.*;
import nu.xom.*;

public class Aggregator {
    public String[] title = new String[15];
    public String[] link = new String[15];
    public int count = 0;

    public Aggregator(String rssUrl) {
        try {
            // retrieve the XML document
            Builder builder = new Builder();
            Document doc = builder.build(rssUrl);
            // retrieve the document's root element
            Element root = doc.getRootElement();
            // retrieve the root's channel element
            Element channel = root.getFirstChildElement("channel");
            // retrieve the item elements in the channel
            if (channel != null) {
                Elements items = channel.getChildElements("item");
                for (int current = 0; current < items.size(); current++) {
                    // stop once the 15-element arrays are full
                    if (count >= 15) {
                        break;
                    }
                    // retrieve the current item
                    Element item = items.get(current);
                    Element titleElement = item.getFirstChildElement("title");
                    Element linkElement = item.getFirstChildElement("link");
                    // skip items that lack a title or a link
                    if (titleElement == null || linkElement == null) {
                        continue;
                    }
                    title[count] = titleElement.getValue();
                    link[count] = linkElement.getValue();
                    count++;
                }
            }
        } catch (ParsingException exception) {
            System.out.println("XML error: " + exception.getMessage());
            exception.printStackTrace();
        } catch (IOException ioException) {
            System.out.println("IO error: " + ioException.getMessage());
            ioException.printStackTrace();
        }
    }

    public void listItems() {
        // display each stored headline followed by its link
        for (int i = 0; i < 15; i++) {
            if (title[i] != null) {
                System.out.println("\n" + title[i]);
                System.out.println(link[i]);
            }
        }
    }

    public static void main(String[] arguments) {
        if (arguments.length > 0) {
            Aggregator aggie = new Aggregator(arguments[0]);
            aggie.listItems();
        } else {
            System.out.println("Usage: java Aggregator rssUrl");
        }
    }
}
Before running the application, set up a command-line argument for the feed you’d like to read, which can be any RSS feed. If you don’t know any, use http://feeds.drudge.com/retort, which contains headlines from the Drudge Retort, an online news site that I publish.
Sample output from the feed is shown in Figure 21.1.
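If you'd like to experiment with the same channel-and-item traversal without a network connection or the XOM library, here is a minimal sketch. It swaps in the DOM parser that ships with the JDK (the javax.xml.parsers package) in place of XOM's Builder, so it is not the book's listing. The class name FeedSketch, its method names, and the sample feed text are all made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not one of the book's listings.
// Uses the JDK's DOM parser instead of XOM.
class FeedSketch {

    // Returns "title | link" for up to maxItems item elements in the channel.
    static List<String> readItems(String rssXml, int maxItems) {
        List<String> results = new ArrayList<>();
        try {
            DocumentBuilder parser =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = parser.parse(new InputSource(new StringReader(rssXml)));
            Element channel = firstChild(doc.getDocumentElement(), "channel");
            if (channel == null) {
                return results;
            }
            for (Node node = channel.getFirstChild(); node != null;
                    node = node.getNextSibling()) {
                // stop once the requested number of items has been collected
                if (results.size() >= maxItems) {
                    break;
                }
                if (node instanceof Element && "item".equals(node.getNodeName())) {
                    Element item = (Element) node;
                    results.add(textOf(item, "title") + " | " + textOf(item, "link"));
                }
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }
        return results;
    }

    // Finds the first direct child element with the given name, or null.
    static Element firstChild(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // Returns the text of the named child, or an empty string if it is missing.
    static String textOf(Element parent, String name) {
        Element child = firstChild(parent, name);
        return child == null ? "" : child.getTextContent();
    }

    public static void main(String[] arguments) {
        // made-up feed text for illustration
        String xml = "<rss version=\"2.0\"><channel><title>Sample</title>"
            + "<item><title>First</title><link>http://example.com/1</link></item>"
            + "<item><title>Second</title><link>http://example.com/2</link></item>"
            + "</channel></rss>";
        for (String line : readItems(xml, 15)) {
            System.out.println(line);
        }
    }
}
```

Passing the limit as a parameter, rather than hard-coding 15, makes it easy to try the cutoff with a short feed. An item with no title or link yields an empty string instead of causing an error.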
You can find out more about the RSS XML dialect from the RSS Advisory Board website at www.rssboard.org. I’m the chairman of the board, which offers guidance on the format and a directory of software that can be used to read RSS feeds.
The Java language liberates software from dependence on a particular operating system. The program you write with the language on a Windows box creates class files that can be run on a Linux server or a Mac OS X computer.
XML achieves a similar liberation for the data produced by software. If XML data follows the simple rules required to make it well formed, you can read it with any software that parses XML. You don’t need to keep the originating program around just to ensure there’s always a way to access it.
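To see what "well formed" means in practice, you can hand some text to any XML parser and watch whether it complains. The following is a rough sketch that uses the DOM parser included with the JDK rather than XOM. The class name WellFormedCheck and its isWellFormed() method are invented for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not one of the book's listings.
class WellFormedCheck {

    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder parser =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // replace the default handler, which prints errors to the console
            parser.setErrorHandler(new DefaultHandler());
            parser.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // the parser rejected the text
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser could not be used", e);
        }
    }

    public static void main(String[] arguments) {
        System.out.println(isWellFormed("<forecast><high>78</high></forecast>"));
        System.out.println(isWellFormed("<forecast><high>78</forecast>"));
    }
}
```

The first call in main() passes text with matching tags and a single root element. The second leaves the high element unclosed, so the parser rejects it.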
The XOM library makes it easy to read and write XML data.
When you’re using Java and XML, you can declare your independence from two of the major obstacles faced by computer programmers for decades: obsolete data and obsolete operating systems.
Q. What’s the purpose of the DOCTYPE statement in the XML file produced by the PropertyFileCreator application?
A. That’s a reference to a document type definition (DTD), a file that defines the rules XML data must follow to be considered valid in its dialect.
If you load the web page referred to in that statement, http://java.sun.com/dtd/properties.dtd, you find references to each of the elements and attributes contained in the XML file produced by the Java library’s Properties class.
Although Sun provides this DTD, Java’s official documentation indicates that it shouldn’t be relied upon when evaluating property configuration data. Parsers are supposed to ignore it.
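You can see the DOCTYPE statement for yourself by storing a Properties object as XML in memory and printing the result. This is only a sketch: the class name PropertiesXmlSketch, its two methods, and the sample property are invented here and are not taken from the PropertyFileCreator application.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper, not one of the book's listings.
class PropertiesXmlSketch {

    // Writes the properties as XML text using Properties.storeToXML().
    static String toXml(Properties props, String comment) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            props.storeToXML(out, comment); // this overload writes UTF-8
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads properties back from XML text using Properties.loadFromXML().
    static Properties fromXml(String xml) {
        try {
            Properties props = new Properties();
            props.loadFromXML(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] arguments) {
        Properties props = new Properties();
        props.setProperty("username", "rcade"); // made-up sample value
        String xml = toXml(props, "Created for illustration");
        System.out.println(xml); // the output includes the DOCTYPE line
        System.out.println(fromXml(xml).getProperty("username"));
    }
}
```

Reading the XML back with loadFromXML() works without an Internet connection, which fits the answer above: the DTD address is a label, not a file that has to be fetched.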
Q. Are the Hatfields and McCoys still feuding?
A. The West Virginia and Kentucky families are on good terms 121 years after the last casualty in their infamous 35-year conflict.
In 1979, Hatfields and McCoys got together to play the TV game show Family Feud for a week. A pig was kept on stage and awarded to the winning family.
In 2003, a formal peace treaty was reached between the families in Pikeville, KY.
The Hatfield-McCoy Trails, 500 miles of trails for recreational off-road driving, were established in West Virginia in 2000 and expanded over the next decade.
To see whether your knowledge of XML processing in Java is well-formed, answer the following questions.
1. Which of the following terms should not be used to compliment XML data that is properly formatted?
A. This data is well formed.
B. This data is valid.
C. This data is dy-no-mite!
2. What method reads all the elements that are enclosed within a parent element?
A. get()
B. getChildElements()
C. getFirstChildElement()
3. What method of the Elements class can be used to determine the number of elements that it contains?
A. count()
B. length()
C. size()
1. C. Well-formed data has properly structured start and end tags, a single root element containing all child elements, and typically an XML declaration at the top. Valid data follows the rules of a particular XML dialect. “Dy-no-mite!” is the catchphrase of the comedian Jimmie “J.J.” Walker.
2. B. The getChildElements() method returns an Elements object holding all the elements.
3. C. Just like a vector, an Elements object has a size() method that provides a count of the items it holds.
To extend your knowledge of XML, parse the following activities:
• Revise the WeatherStation application to display an additional element from the Weather Underground forecast data.
• Write a program that displays the data contained in shortChanges.xml, an XML document of weblog information available at www.weblogs.com/shortChanges.xml.
To see Java programs that implement these activities, visit the book’s website at www.java24hours.com.