HTTP over SSL

HTTP lets clients identify and access network resources (files or programs) on an HTTP server through an HTTP URL, a string of the form "http://<machine>:<port>/<path>". Underneath, the client program opens a TCP connection to the server identified by machine and port (port 80 is assumed if no port is specified) and sends a request: essentially a message consisting of text headers separated by newlines, optionally followed by a binary or text payload. The server receives the request, processes it, and sends back a response.
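To make the request format concrete, here is a minimal sketch of what a client does underneath. The class name HttpRequestSketch, the host www.example.com, and the choice of HTTP/1.1 with Host and Connection headers are assumptions made for illustration; they are not taken from the text above.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration: a hand-built HTTP GET sent over a plain TCP socket.
class HttpRequestSketch {

  // Builds the request text: a request line, header lines, and a blank line
  // that ends the header section. HTTP terminates each line with CRLF.
  static String buildGetRequest(String host, String path) {
    return "GET " + path + " HTTP/1.1\r\n"
         + "Host: " + host + "\r\n"
         + "Connection: close\r\n"
         + "\r\n";
  }

  public static void main(String[] args) throws Exception {
    String host = args.length > 0 ? args[0] : "www.example.com"; // assumed host
    try (Socket socket = new Socket(host, 80)) { // port 80 is the HTTP default
      OutputStream out = socket.getOutputStream();
      out.write(buildGetRequest(host, "/").getBytes(StandardCharsets.US_ASCII));
      out.flush();
      BufferedReader in = new BufferedReader(new InputStreamReader(
          socket.getInputStream(), StandardCharsets.ISO_8859_1));
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line); // status line, headers, then the body
      }
    }
  }
}
```

The "Connection: close" header asks the server to close the connection after the response, which is what lets the read loop terminate.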

As is evident, it is fairly straightforward to layer HTTP over SSL, a combination also known as HTTPS. IETF RFCs 2817 and 2818 contain the necessary information to accomplish this. A client indicates its desire to use SSL by using the string "https" in the protocol part of the URL in place of "http". A server could either upgrade a TCP connection to SSL at a client's request or open a separate port for all SSL connections. In practice, a separate port is almost always used, the default being port 443.
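The default ports can be read back from java.net.URL itself. The sketch below is only an illustration: the class name DefaultPortSketch, the method effectivePort(), and the example URLs are assumptions. It relies on URL.getPort() returning -1 when the URL names no port, and on URL.getDefaultPort() returning the default for the URL's protocol.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper: which port would a client connect to for this URL?
class DefaultPortSketch {

  // Returns the explicit port if the URL has one, otherwise the default
  // port of the protocol (80 for http, 443 for https).
  static int effectivePort(String urlString) {
    try {
      URL url = new URL(urlString);
      return url.getPort() != -1 ? url.getPort() : url.getDefaultPort();
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException("Not a valid URL: " + urlString, e);
    }
  }

  public static void main(String[] args) {
    System.out.println(effectivePort("http://www.example.com/"));       // 80
    System.out.println(effectivePort("https://www.example.com/"));      // 443
    System.out.println(effectivePort("https://www.example.com:8443/")); // 8443
  }
}
```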

A client is expected to verify the identity of the server by matching the server name in the URL with the CN (Common Name) field of the distinguished name in the certificate presented by the server. This check can be overridden if the client has external information, say in the form of user input, to verify the server's identity. Optionally, the server may also ask the client to authenticate itself by presenting a certificate.
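The name comparison described above can be sketched in a few lines. This is an illustration under stated assumptions, not what JSSE does internally: the class name CommonNameSketch, its two methods, and the sample distinguished name are made up, and the comparison is a plain case-insensitive match with no wildcard handling. RFC 2818 also tells clients to prefer a subjectAltName entry of type dNSName when the certificate has one; that part is omitted here.

```java
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;
import javax.naming.ldap.Rdn;

// Hypothetical helper: compare a URL host with the CN of a distinguished name.
class CommonNameSketch {

  // Returns the value of the CN attribute, or null if the name has none.
  static String commonNameOf(String distinguishedName) {
    try {
      for (Rdn rdn : new LdapName(distinguishedName).getRdns()) {
        if ("CN".equalsIgnoreCase(rdn.getType())) {
          return rdn.getValue().toString();
        }
      }
      return null;
    } catch (InvalidNameException e) {
      throw new IllegalArgumentException(
          "Not a distinguished name: " + distinguishedName, e);
    }
  }

  // Exact, case-insensitive comparison only (an assumption of this sketch).
  static boolean matchesHost(String distinguishedName, String urlHost) {
    String cn = commonNameOf(distinguishedName);
    return cn != null && cn.equalsIgnoreCase(urlHost);
  }

  public static void main(String[] args) {
    String dn = "CN=www.example.com, O=Example Corp, C=US"; // sample value
    System.out.println(matchesHost(dn, "www.example.com")); // true
    System.out.println(matchesHost(dn, "12.153.224.22"));   // false
  }
}
```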

Simple, isn't it? Don't be too hasty. Real-world uses of HTTP are more complex than direct communication between a client and a server. HTTP proxies aggregate outgoing traffic from corporate networks behind a firewall, and virtual hosts allow multiple websites corresponding to different domain names to be hosted on the same physical machine. Their presence poses unique challenges to HTTPS.

HTTP proxies have been used to cache static content, thus reducing the load on HTTP servers and improving response time. They are also used to filter or log access to certain sites in some restricted environments. But HTTPS works by instructing the proxy to establish a transparent TCP connection to the server, which bypasses the proxy's HTTP processing completely and negates many of its advantages. Virtual hosts present a different kind of problem: the information about the target virtual host is available only in an HTTP header field, which arrives encrypted over SSL, so the HTTP server has no way to identify and present the certificate corresponding to the target virtual host.
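The instruction a client sends to the proxy is an HTTP CONNECT request, described in RFC 2817. The sketch below only builds the text of such a request so its shape is visible; the class name ConnectRequestSketch and the host www.example.com are assumptions. Once the proxy answers with a success status, it relays bytes in both directions and the SSL handshake runs end to end between client and server.

```java
// Hypothetical illustration: the text a client sends to ask a proxy for a tunnel.
class ConnectRequestSketch {

  // The request target of CONNECT is "host:port" rather than a path.
  static String buildConnectRequest(String host, int port) {
    String authority = host + ":" + port;
    return "CONNECT " + authority + " HTTP/1.1\r\n"
         + "Host: " + authority + "\r\n"
         + "\r\n";
  }

  public static void main(String[] args) {
    System.out.print(buildConnectRequest("www.example.com", 443));
  }
}
```

Because everything after the CONNECT exchange is encrypted, the proxy sees only the target host and port, which is why it can neither cache nor inspect the content.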

More detailed discussion of these issues is beyond the scope of this book.

Java API for HTTP and HTTPS

You can access an HTTP or HTTPS URL by constructing a java.net.URL object with the URL string (starting with either http:// or https://) as the argument, and then reading the InputStream obtained by invoking the method openStream() on the URL object. The source code of the program GetURL.java illustrates this.

// File: src\jsbook\ch6\ex2\GetURL.java
import java.net.URL;
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class GetURL {
  public static void main(String[] args) throws Exception {
    if (args.length < 1){
      System.out.println("Usage:: java GetURL <url>");
      return;
    }
    String urlString = args[0];
    URL url = new URL(urlString);
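    // openStream() connects and returns the response body. The
    // try-with-resources block closes the reader when we are done.
    try (BufferedReader br =
        new BufferedReader(new InputStreamReader(url.openStream()))) {
      String line;
      while ((line = br.readLine()) != null){
        System.out.println(line);
      }
    }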
  }
}

Let us compile and run this program with an HTTPS URL.

C:\ch6\ex2>javac GetURL.java
C:\ch6\ex2>java GetURL https://www.etrade.com
<META HTTP-EQUIV="expires" CONTENT=0>
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">
... more stuff skipped ...

How did the program validate the certificate presented by the Web server running at www.etrade.com? The short answer is that it validated the server certificate against the default truststore. Recall that JSSE first tries the file jssecacerts and, if that is not found, the file cacerts, both in the directory jre-home\lib\security, as the default truststore. Successful execution of GetURL with https://www.etrade.com implies that the server certificate is signed, either directly or indirectly, by a CA whose certificate is present in the default truststore.
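The lookup order just described can be mirrored in a few lines, which is handy for checking which file a given JVM would pick up. This is a sketch that imitates the documented order, not JSSE's own code; the class name DefaultTrustStoreSketch and its method are assumptions. It also honors the javax.net.ssl.trustStore system property, which takes precedence over both files when set.

```java
import java.io.File;

// Hypothetical helper: which file would serve as the default truststore?
class DefaultTrustStoreSketch {

  static File defaultTrustStoreFile() {
    // An explicitly configured truststore wins over the built-in files.
    String explicit = System.getProperty("javax.net.ssl.trustStore");
    if (explicit != null) {
      return new File(explicit);
    }
    File securityDir = new File(
        new File(System.getProperty("java.home"), "lib"), "security");
    File jssecacerts = new File(securityDir, "jssecacerts");
    // Prefer jssecacerts when it exists; otherwise fall back to cacerts.
    return jssecacerts.isFile() ? jssecacerts : new File(securityDir, "cacerts");
  }

  public static void main(String[] args) {
    System.out.println(defaultTrustStoreFile().getAbsolutePath());
  }
}
```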

This is really simple. But then, how do you specify the truststore to validate the server certificate? How do you override the default server identity verification based on matching the name in the certificate and the hostname in the URL? How do you provide the client certificate, if the server asks for it?

All of these are valid questions. And the good news is that you can do all of the above. But let us first understand what happens under the hood when openStream() is called.

The method openStream() internally invokes openConnection() on the URL object, and then getInputStream() on the returned java.net.URLConnection object. The returned URLConnection object is of type java.net.HttpURLConnection for an HTTP URL and of type javax.net.ssl.HttpsURLConnection for an HTTPS URL. Note that HttpsURLConnection is a subclass of HttpURLConnection and inherits all its public and protected methods.
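A quick way to see this for yourself is to call openConnection() and inspect the type of the returned object. The sketch below is illustrative; the class name ConnectionTypeSketch and the example URLs are assumptions. Note that openConnection() only creates the connection object, so no network traffic occurs until a method such as getInputStream() is called.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import javax.net.ssl.HttpsURLConnection;

// Hypothetical helper: report which connection class a URL maps to.
class ConnectionTypeSketch {

  static String connectionKind(String urlString) {
    try {
      URLConnection con = new URL(urlString).openConnection();
      // Test the subclass first, since every HttpsURLConnection
      // is also an HttpURLConnection.
      if (con instanceof HttpsURLConnection) {
        return "HttpsURLConnection";
      }
      if (con instanceof HttpURLConnection) {
        return "HttpURLConnection";
      }
      return con.getClass().getName();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(connectionKind("http://www.example.com/"));
    System.out.println(connectionKind("https://www.example.com/"));
  }
}
```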

Class HttpsURLConnection is associated with class SSLSocketFactory and uses this factory to establish the SSL connection. By default, this association is with the default SSLSocketFactory instance obtained by the static method getDefault() of SSLSocketFactory. Recall that the default SSLSocketFactory relies on a number of system properties to locate the truststore and the keystore information, and so does HttpsURLConnection. What it means is that you can specify the system properties as options to the JVM launcher at the command line or set them by invoking the System.setProperty() method.
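As a sketch of the programmatic route, the properties can be set before the first HTTPS connection is made. The property names are the standard JSSE ones; the class name TrustStorePropertiesSketch, the file names client.ts and client.ks, and the password changeit are placeholders chosen for illustration.

```java
// Hypothetical helper: point the default SSLSocketFactory at specific stores.
class TrustStorePropertiesSketch {

  // Truststore: the CA certificates used to validate the server certificate.
  static void useTrustStore(String path, String password) {
    System.setProperty("javax.net.ssl.trustStore", path);
    System.setProperty("javax.net.ssl.trustStorePassword", password);
  }

  // Keystore: the client's own key and certificate, used if the server
  // asks the client to authenticate itself.
  static void useKeyStore(String path, String password) {
    System.setProperty("javax.net.ssl.keyStore", path);
    System.setProperty("javax.net.ssl.keyStorePassword", password);
  }

  public static void main(String[] args) {
    useTrustStore("client.ts", "changeit"); // placeholder file and password
    useKeyStore("client.ks", "changeit");   // placeholder file and password
    System.out.println(System.getProperty("javax.net.ssl.trustStore"));
  }
}
```

The same values can be given to the JVM launcher as -Djavax.net.ssl.trustStore=client.ts and so on. Either way, they need to be in place before the default SSLSocketFactory is first used, because the properties are read when it is initialized.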

You can also replace the default SSLSocketFactory with your own instance, either for all instances of HttpsURLConnection by calling the static method setDefaultSSLSocketFactory(), or for a single instance by calling the method setSSLSocketFactory() on that instance.
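One possible way to build such a factory is from a truststore loaded by the program itself. The following is a sketch under stated assumptions rather than a definitive recipe: the class name CustomSocketFactorySketch, its method names, and the command-line arguments are invented for illustration, and the protocol name "TLS" is a choice made here, not something the text above prescribes.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.net.URL;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper: an SSLSocketFactory that trusts a chosen truststore.
class CustomSocketFactorySketch {

  // Passing null makes TrustManagerFactory fall back to the default truststore.
  static SSLSocketFactory socketFactoryTrusting(KeyStore trustStore) {
    try {
      TrustManagerFactory tmf = TrustManagerFactory.getInstance(
          TrustManagerFactory.getDefaultAlgorithm());
      tmf.init(trustStore);
      SSLContext context = SSLContext.getInstance("TLS");
      // No key managers (no client certificate) and default randomness.
      context.init(null, tmf.getTrustManagers(), null);
      return context.getSocketFactory();
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("Cannot build SSLSocketFactory", e);
    }
  }

  static KeyStore loadTrustStore(String path, char[] password) throws Exception {
    KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
    try (InputStream in = new FileInputStream(path)) {
      ks.load(in, password);
    }
    return ks;
  }

  // Usage: java CustomSocketFactorySketch <https-url> <truststore> <password>
  public static void main(String[] args) throws Exception {
    SSLSocketFactory factory = socketFactoryTrusting(
        loadTrustStore(args[1], args[2].toCharArray()));
    HttpsURLConnection con =
        (HttpsURLConnection) new URL(args[0]).openConnection();
    con.setSSLSocketFactory(factory); // affects this connection only
    // HttpsURLConnection.setDefaultSSLSocketFactory(factory); // all later ones
    System.out.println("Response code: " + con.getResponseCode());
  }
}
```

Setting the factory on a single connection keeps the change local, which is usually the safer choice when only one server needs a non-default truststore.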

Custom Hostname Verification

The mechanism to override hostname verification is slightly different. You need to write a class that implements the interface javax.net.ssl.HostnameVerifier and place your logic in its verify() method. An instance of this class can be set as the verifier to be called whenever the identification present in the server certificate doesn't match the hostname part of the URL. Similar to SSLSocketFactory, a HostnameVerifier can be set for all instances of HttpsURLConnection by calling the static method setDefaultHostnameVerifier() or for a single instance by calling the method setHostnameVerifier() on that instance. The example program GetVerifiedURL.java, shown in Listing 6-5, overrides hostname verification for a single instance of an HTTPS connection.

Listing 6-5. Overriding hostname verifier
// File: src\jsbook\ch6\ex2\GetVerifiedURL.java
import java.net.URL;
import java.net.URLConnection;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLSession;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.IOException;

public class GetVerifiedURL {
  public static class CustomHostnameVerifier
                                   implements HostnameVerifier {
    private final String expectedHostname;  // host taken from the URL
    public CustomHostnameVerifier(String hostname){
      this.expectedHostname = hostname;
    }
    // Called by HttpsURLConnection when its default hostname check fails.
    public boolean verify(String hostname, SSLSession sess){
      try {
        String peerHost = sess.getPeerHost();
        System.out.println("Expected hostname: " + expectedHostname +
                                           ", Found: " + peerHost);
        System.out.print("Proceed(yes/no)?");  // Prompt user
        System.out.flush();
        BufferedReader br = new BufferedReader(
                            new InputStreamReader(System.in));
        String response = br.readLine();
        // readLine() returns null at end of input; treat that as "no".
        return response != null &&
               "yes".equalsIgnoreCase(response.trim());
      } catch (IOException ioe){
        return false;
      }
    }
  }
  public static void main(String[] args) throws Exception {
    if (args.length < 1){
      System.out.println("Usage:: java GetVerifiedURL <url>");
      return;
    }
    String urlString = args[0];
    URL url = new URL(urlString);
    CustomHostnameVerifier custVerifier =
                         new CustomHostnameVerifier(url.getHost());
    URLConnection con = url.openConnection();
    if (!(con instanceof HttpsURLConnection)){
      System.out.println(urlString + " is not an HTTPS URL.");
      return;
    }
    HttpsURLConnection httpsCon = (HttpsURLConnection)con;
    // Applies to this connection only; set it before connecting.
    httpsCon.setHostnameVerifier(custVerifier);
    try (BufferedReader br = new BufferedReader(
             new InputStreamReader(httpsCon.getInputStream()))) {
      String line;
      while ((line = br.readLine()) != null){
        System.out.println(line);
      }
    }
  }
}

This program sets up a verifier that succeeds or fails based on the user's response, after displaying the expected hostname as specified in the URL and the peer hostname reported by the SSL session. Let us run this program twice, once with the URL https://www.etrade.com and then with https://ip-addr, where ip-addr is the IP address of the host www.etrade.com obtained with a DNS lookup tool such as nslookup. The first execution runs exactly like GetURL, but the second one triggers the execution of the CustomHostnameVerifier class, prompting the user to proceed or abort.

C:\ch6\ex2>java GetVerifiedURL https://12.153.224.22
Expected hostname: 12.153.224.22, Found: www.etrade.com
Proceed(yes/no)?yes
<META HTTP-EQUIV="expires" CONTENT=0>
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">
... more stuff skipped ...

This is expected, as the hostname string found in the certificate is not the same as the one specified in the URL.
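Prompting the user is not practical for unattended programs. As a non-interactive variation, the sketch below accepts a mismatch only for hosts that have been approved in advance. It is an illustration, not part of the listing above: the class name AllowListHostnameVerifier is invented, and the decision uses only the hostname from the URL, without examining the session or the certificate. Accepting any mismatch weakens server authentication, so such a list should be kept short and deliberate.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLSession;

// Hypothetical verifier: allow a name mismatch only for pre-approved hosts.
class AllowListHostnameVerifier implements HostnameVerifier {
  private final Set<String> approvedHosts = new HashSet<>();

  AllowListHostnameVerifier(String... hosts) {
    for (String host : hosts) {
      approvedHosts.add(host.toLowerCase(Locale.ROOT));
    }
  }

  // hostname is the host part of the URL the client tried to reach.
  // The session argument is not consulted in this sketch.
  @Override
  public boolean verify(String hostname, SSLSession session) {
    return approvedHosts.contains(hostname.toLowerCase(Locale.ROOT));
  }
}
```

An instance could be installed with httpsCon.setHostnameVerifier(new AllowListHostnameVerifier("12.153.224.22")) for one connection, or with HttpsURLConnection.setDefaultHostnameVerifier() for all of them.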

Tunneling Through Web Proxies

If you are running these programs on a machine behind a corporate firewall to read an external URL, and the firewall allows HTTP and HTTPS connections to be made through an HTTP proxy, then you need to tell the JVM where the proxy is. Set the system properties http.proxyHost and http.proxyPort for HTTP URLs, and https.proxyHost and https.proxyPort for HTTPS URLs, to the host where the proxy is running and the corresponding port. Here is how the execution command looks behind a firewall.

C:\ch6\ex2>java -Dhttp.proxyHost=web-proxy -Dhttp.proxyPort=8088 GetURL http://www.etrade.com

C:\ch6\ex2>java -Dhttps.proxyHost=web-proxy -Dhttps.proxyPort=8088 GetURL https://www.etrade.com

To find the Web proxy hostname and port values for your network, look at your browser's proxy settings or check with your network administrator.
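The proxy can also be chosen per connection instead of through system properties, using java.net.Proxy and the openConnection(Proxy) overload of URL. The sketch below is a minimal illustration: the class name ProxySketch and the default URL are assumptions, and web-proxy and 8088 are the same placeholder values used in the commands above.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper: route a single connection through an HTTP proxy.
class ProxySketch {

  static Proxy httpProxy(String host, int port) {
    // createUnresolved() postpones the DNS lookup of the proxy host
    // until the connection is actually made.
    return new Proxy(Proxy.Type.HTTP,
        InetSocketAddress.createUnresolved(host, port));
  }

  public static void main(String[] args) throws Exception {
    URL url = new URL(args.length > 0 ? args[0] : "https://www.example.com/");
    // Only this connection uses the proxy; others are unaffected.
    URLConnection con = url.openConnection(httpProxy("web-proxy", 8088));
    try (BufferedReader br = new BufferedReader(
             new InputStreamReader(con.getInputStream()))) {
      String line;
      while ((line = br.readLine()) != null) {
        System.out.println(line);
      }
    }
  }
}
```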
