Chapter 15. The URLConnection Class

URLConnection is an abstract class that represents an active connection to a resource specified by a URL. The URLConnection class has two different but related purposes. First, it provides more control over the interaction with a server than the URL class. With a URLConnection, you can inspect the MIME headers sent by an HTTP server and respond accordingly. You can adjust the MIME header fields used in the client request. You can use a URLConnection to download binary files. Finally, a URLConnection lets you send data back to a web server with POST or PUT and use other HTTP request methods. We will explore all of these techniques in this chapter.

Second, the URLConnection class is part of Java’s protocol handler mechanism, which also includes the URLStreamHandler class. The idea behind protocol handlers is simple: they separate the details of processing a protocol from processing particular data types, providing user interfaces, and doing the other work that a monolithic web browser performs. The base java.net.URLConnection class is abstract; to implement a specific protocol, you write a subclass. These subclasses can be loaded at runtime by your own applications or by the HotJava browser; in the future, it may be possible for Java applications to download protocol handlers over the Net as needed, making them automatically extensible. For example, if your browser runs across a URL with a strange prefix, such as compress:, rather than throwing up its hands and issuing an error message, it could download a protocol handler for this unknown protocol and use it to communicate with the server. Writing protocol handlers is the subject of the next chapter.

The java.net package contains only abstract URLConnection classes; the concrete subclasses are hidden inside the sun.net package hierarchy. Many of the methods and fields, as well as the single constructor, in the URLConnection class are protected. In other words, they can be accessed only by the URLConnection class and its subclasses (and, as with any protected member, by other classes in the java.net package). It is rare to instantiate URLConnection objects directly in your source code; instead, the runtime environment creates these objects as needed, depending on the protocol in use. The concrete class, which is unknown at compile time, is then instantiated using the forName( ) and newInstance( ) methods of the java.lang.Class class.

Note

URLConnection does not have the best designed API in the Java class library. It’s been cleaned up somewhat in Java 1.1 and 1.2, but it’s still a lot more confusing than it should be. Since the URLConnection class itself relies on the Socket class for network connectivity, there’s little you can do with URLConnection that can’t also be done with Socket. The URLConnection class is supposed to provide an easier-to-use, higher-level abstraction for network connections than Socket does. In practice, however, it’s so poorly designed that most programmers have chosen to ignore it and simply use the Socket class instead. One of several problems is that the URLConnection class is too closely tied to the HTTP protocol. For instance, it assumes that each file transferred is preceded by a MIME header or something very much like one. However, most classic protocols such as FTP and SMTP don’t use MIME headers. Another problem, one I hope to alleviate in this chapter, is that the URLConnection class is extremely poorly documented, so very few programmers understand how it’s really supposed to work.

Opening URLConnections

A program that uses the URLConnection class directly follows this basic sequence of steps:

  1. Construct a URL object.

  2. Invoke the URL object’s openConnection( ) method to retrieve a URLConnection object for that URL.

  3. Configure the URLConnection.

  4. Read the header fields.

  5. Get an input stream and read data.

  6. Get an output stream and write data.

  7. Close the connection.

You don’t always perform all these steps. For instance, if the default setup for a particular kind of URL is acceptable, then you’re likely to skip step 3. If you want only the data from the server and don’t care about any meta-information, or if the protocol doesn’t provide any meta-information, you’ll skip step 4. If you want only to receive data from the server but not send data to the server, you’ll skip step 6. Depending on the protocol, steps 5 and 6 may be reversed or interlaced.
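The sequence above can be sketched in a few lines. This is a minimal illustration rather than a definitive recipe: the class name ConnectionSteps, the fetch( ) helper, and the temporary file are assumptions made for the example, and a file: URL is used only so the code runs without a network. It also assumes a Java version with try-with-resources and java.nio.file. Step 6 is skipped because nothing is sent to the server.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ConnectionSteps {

  // Hypothetical helper: reads the whole resource at u as UTF-8 text.
  static String fetch(URL u) throws IOException {
    URLConnection uc = u.openConnection();        // step 2: get the connection
    uc.setUseCaches(false);                       // step 3: configure it
    int length = uc.getContentLength();           // step 4: read a header field
    System.out.println("Content-length: " + length);
    try (InputStream in = uc.getInputStream()) {  // step 5: read the data
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      byte[] chunk = new byte[4096];
      int n;
      while ((n = in.read(chunk)) != -1) {
        buffer.write(chunk, 0, n);
      }
      return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }                                             // step 7: stream closed here
  }

  public static void main(String[] args) throws IOException {
    // A temporary file stands in for a remote resource (an assumption
    // made so that the example needs no network access).
    Path tmp = Files.createTempFile("urlconnection", ".txt");
    try {
      Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
      URL u = tmp.toUri().toURL();                // step 1: construct the URL
      System.out.println(fetch(u));
    } finally {
      Files.delete(tmp);
    }
  }
}
```

The same fetch( ) method should work with an http: URL, though the header values you see will then depend on the server.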

The single constructor for the URLConnection class is protected:

protected URLConnection(URL url)

Consequently, unless you’re subclassing URLConnection to handle a new kind of URL (that is, writing a protocol handler), you can get a reference to one of these objects only through the openConnection( ) methods of the URL and URLStreamHandler classes. For example:

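// Requires java.io.IOException, java.net.MalformedURLException,
// java.net.URL, and java.net.URLConnection
try {
  URL u = new URL("http://www.greenpeace.org/");
  // Returns an unconnected URLConnection; nothing is sent to the server yet
  URLConnection uc = u.openConnection(  );
}
catch (MalformedURLException e) {
  // The string could not be parsed as a URL
  System.err.println(e);
}
catch (IOException e) {
  // The URLConnection object could not be created
  System.err.println(e);
}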

Note

In practice, the openConnection( ) method of java.net.URL is the same as the openConnection( ) method of java.net.URLStreamHandler. All a URL object’s openConnection( ) method does is call its URLStreamHandler’s openConnection( ) method.

The URLConnection class is declared abstract. However, all but one of its methods are implemented. The single method that subclasses are forced to implement is connect( ), which makes a connection to a server and thus depends on the type of service you’re implementing (HTTP, FTP, etc.). For example, the connect( ) method of sun.net.www.protocol.file.FileURLConnection converts the URL to a filename in the appropriate directory, creates MIME information for the file, and then opens a buffered FileInputStream to the file. The connect( ) method of sun.net.www.protocol.http.HttpURLConnection creates an HttpClient object (from sun.net.www.http.HttpClient), which is responsible for connecting to the server. Of course, you may find it convenient or necessary to override other methods in the class.
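To make the point concrete, here is a minimal sketch of a subclass that supplies connect( ). The InMemoryConnection class is hypothetical and does not talk to any server; it simply serves text held in memory, and its isConnected( ) accessor exists only so the example can report its state. It overrides getInputStream( ) as well, because the inherited version does not return any data.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical subclass for illustration; not part of java.net or sun.net.
public class InMemoryConnection extends URLConnection {

  private final byte[] data;

  public InMemoryConnection(URL url, String text) {
    super(url);  // the protected URLConnection(URL) constructor
    this.data = text.getBytes(StandardCharsets.UTF_8);
  }

  // The one method a subclass must implement. A real handler would open
  // a socket or a file here; this sketch only records the state change.
  @Override
  public void connect() throws IOException {
    connected = true;  // protected field inherited from URLConnection
  }

  // Connects on demand, then serves the in-memory bytes.
  @Override
  public InputStream getInputStream() throws IOException {
    if (!connected) {
      connect();
    }
    return new ByteArrayInputStream(data);
  }

  // Accessor added for this example only.
  public boolean isConnected() {
    return connected;
  }

  public static void main(String[] args) throws IOException {
    // The host name is a placeholder; no lookup or connection is attempted.
    URL u = new URL("http://example.invalid/");
    InMemoryConnection uc = new InMemoryConnection(u, "hi");
    System.out.println("Connected before reading: " + uc.isConnected());
    uc.getInputStream().close();
    System.out.println("Connected after reading: " + uc.isConnected());
  }
}
```

Because the constructor is called directly here, no URLStreamHandler is involved; wiring such a class into URL.openConnection( ) is a separate step.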

public abstract void connect(  ) throws IOException

When a URLConnection is first constructed, it is unconnected; that is, the local and remote host cannot send and receive data. There is no socket connecting the two hosts. The connect( ) method establishes a connection—normally using TCP sockets but possibly through some other mechanism—between the local and remote host so that you can send and receive data. However, the getInputStream( ), getContent( ), getHeaderField( ), and other methods that require an open connection will themselves call connect( ) if the connection isn’t yet open. Therefore, you rarely need to call connect( ) directly.
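One way to observe the implicit connection is to try changing a setup option after reading a header field. The sketch below assumes a modern JDK, where setters such as setUseCaches( ) are documented to throw an IllegalStateException once the connection is open. The class name ImplicitConnect, the connectsImplicitly( ) helper, and the temporary file: URL are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;

public class ImplicitConnect {

  // Hypothetical helper: returns true if reading a header field opened
  // the connection, detected by a setter refusing to run afterward.
  static boolean connectsImplicitly(URL u) throws IOException {
    URLConnection uc = u.openConnection();  // unconnected so far
    uc.setUseCaches(false);                 // allowed: not yet connected
    uc.getContentLength();                  // no explicit connect() call
    try {
      uc.setUseCaches(true);                // rejected once connected
      return false;
    } catch (IllegalStateException e) {
      return true;
    } finally {
      uc.getInputStream().close();          // release the underlying resource
    }
  }

  public static void main(String[] args) throws IOException {
    // A temporary file keeps the example independent of any server.
    Path tmp = Files.createTempFile("implicit", ".txt");
    try {
      System.out.println(connectsImplicitly(tmp.toUri().toURL()));
    } finally {
      Files.delete(tmp);
    }
  }
}
```

The practical consequence is that all configuration belongs before the first call that reads headers or content.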
