CHAPTER GOALS
To understand the concept of sockets
To learn how to send and receive data through sockets
To implement network clients and servers
To communicate with web servers and server-side applications through the Hypertext Transfer Protocol (HTTP)
You probably have quite a bit of experience with the Internet: the global network that links together millions of computers. In particular, you use the Internet whenever you browse the World Wide Web. Note that the Internet is not the same as the "Web". The World Wide Web is only one of many services offered over the Internet. E-mail, another popular service, also uses the Internet, but its implementation differs from that of the Web. In this chapter, you will see what goes on "under the hood" when you send an e-mail message or when you retrieve a web page from a remote server. You will also learn how to write your own programs that fetch data from sites across the Internet and how to write server programs that can serve information to other programs.
Computers can be connected with each other through a variety of physical media. In a computer lab, for example, computers are connected by network cabling. Electrical impulses representing information flow across the cables. If you use a DSL modem to connect your computer to the Internet, the signals travel across a regular telephone wire, encoded as tones. On a wireless network, signals are sent by transmitting a modulated radio frequency. The physical characteristics of these transmissions differ widely, but they ultimately consist of sending and receiving streams of zeroes and ones along the network connection.
The Internet is a worldwide collection of networks, routing equipment, and computers using a common set of protocols to define how each party will interact with each other.
These zeroes and ones represent two kinds of information: application data, the data that one computer actually wants to send to another, and network protocol data, the data that describe how to reach the intended recipient and how to check for errors and data loss in the transmission. The protocol data follow certain rules set forth by a particular network protocol. Various protocols have been developed for local area networks, such as Microsoft Networking, Novell NetWare, or Apple-Talk. The Internet Protocol (IP), on the other hand, was developed to enable different local area networks to communicate with each other and has become the basis for connecting computers around the world over the Internet. We will discuss IP in this chapter.
Suppose that a computer A wants to send data to a computer B, both on the Internet. The computers aren't connected directly with a cable, as they could be if both were on the same local area network. Instead, A may be someone's home computer and connected to an Internet service provider (ISP), which is in turn connected to an Internet access point; B might be a computer on a local area network belonging to a large firm that has an Internet access point of its own, which may be half a world away from A. The Internet itself, finally, is a complex collection of pathways on which a message can travel from one Internet access point to, eventually, any other Internet access point (see Figure 1). Those connections carry millions of messages, not just the data that A is sending to B.
For the data to arrive at its destination, it must be marked with a destination address. In IP, addresses are denoted by sequences of four numbers, each one byte (that is, between 0 and 255); for example, 130.65.86.66. (Because there aren't enough four-byte addresses for all devices that would like to connect to the Internet, these addresses will be extended to sixteen bytes in the near future.) In order to send data, A needs to know the Internet address of B and include it in the protocol portion when sending the data across the Internet. The routing software that is distributed across the Internet can then deliver the data to B.
Of course, addresses such as 130.65.86.66 are not easy to remember. You would not be happy if you had to use number sequences every time you sent e-mail or requested information from a web server. On the Internet, computers can have so-called domain names that are easier to remember, such as cs.sjsu.edu or horstmann.com. A special service called the Domain Name System (DNS) translates between domain names and Internet addresses. Thus, if computer A wants to have information from horstmann.com, it first asks the DNS to translate this domain name into a numeric Internet address; then it includes the numeric address with the request.
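To make the two kinds of address concrete, here is a minimal sketch that uses the standard java.net.InetAddress class. The class name AddressDemo and the helper toBytes are made up for this illustration. The numeric address is the one from the text; because it is a numeric literal, no name lookup takes place. The optional name lookup in main needs a working network connection.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

public class AddressDemo
{
   /** Returns the bytes of a dotted numeric address as numbers between 0 and 255. */
   public static int[] toBytes(String dottedAddress)
   {
      try
      {
         // For a numeric literal, getByName only parses the text
         byte[] raw = InetAddress.getByName(dottedAddress).getAddress();
         int[] result = new int[raw.length];
         for (int i = 0; i < raw.length; i++)
         {
            result[i] = raw[i] & 0xFF; // Java bytes are signed; mask to get 0..255
         }
         return result;
      }
      catch (UnknownHostException exception)
      {
         throw new IllegalArgumentException("Not a valid address: " + dottedAddress, exception);
      }
   }

   public static void main(String[] args) throws UnknownHostException
   {
      System.out.println(Arrays.toString(toBytes("130.65.86.66")));

      // With a host name as argument, ask the name service for its numeric address
      if (args.length == 1)
      {
         InetAddress address = InetAddress.getByName(args[0]);
         System.out.println(args[0] + " -> " + address.getHostAddress());
      }
   }
}
```

Running the program with a domain name as its argument, for example java AddressDemo horstmann.com, prints whatever numeric address the name service currently reports for that name.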
One interesting aspect of IP is that it breaks large chunks of data up into more manageable packets. Each packet is delivered separately, and different packets that are part of the same transmission can take different routes through the Internet. Packets are numbered, and the recipient reassembles them in the correct order.
TCP/IP is the abbreviation for Transmission Control Protocol over Internet Protocol, the pair of communication protocols used to establish reliable transmission of data between two computers on the Internet.
The Internet Protocol has just one function—to attempt to deliver data from one computer to another across the Internet. If some data get lost or garbled in the process, IP has safeguards built in to make sure that the recipient is aware of that unfortunate fact and doesn't rely on incomplete data. However, IP has no provision for retrying an incomplete transmission. That is the job of a higher-level protocol, the Transmission Control Protocol (TCP). This protocol attempts reliable delivery of data, with retries if there are failures, and it notifies the sender whether or not the attempt succeeded. Most, but not all, Internet programs use TCP for reliable delivery. (Exceptions are "streaming media" services, which bypass the slower TCP for the highest possible throughput and tolerate occasional information loss. However, the most popular Internet services—the World Wide Web and e-mail—use TCP.) TCP is independent of the Internet Protocol; it could in principle be used with another lower-level network protocol. However, in practice, TCP over IP (often called TCP/IP) is the most commonly used combination. We will focus on TCP/IP networking in this chapter.
A computer that is connected to the Internet may have programs for many different purposes. For example, a computer may run both a web server program and a mail server program. When data are sent to that computer, they need to be marked so that they can be forwarded to the appropriate program. TCP uses port numbers for this purpose. A port number is an integer between 0 and 65,535. The sending computer must know the port number of the receiving program and include it with the transmitted data. Some applications use "well-known" port numbers. For example, by convention, web servers use port 80, whereas mail servers running the Post Office Protocol (POP) use port 110. TCP packets, therefore, must contain:
The Internet address of the recipient.
The port number of the recipient.
The Internet address of the sender.
The port number of the sender.
You can think of a TCP connection as a "pipe" between two computers that links the two ports together. Data flow in either direction through the pipe. In practical programming situations, you simply establish a connection and send data across it without worrying about the details of the TCP/IP mechanism. You will see how to establish such a connection in Section 21.3.
A TCP connection requires the Internet addresses and port numbers of both end points.
Why do some streaming media services not use TCP?
In the preceding section you saw how the TCP/IP mechanism can establish an Internet connection between two ports on two computers so that the two computers can exchange data. Each Internet application has a different application protocol, which describes how the data for that particular application are transmitted.
HTTP, or Hypertext Transfer Protocol, is the protocol that defines communication between web browsers and web servers.
Consider, for example, HTTP: the Hypertext Transfer Protocol, which is used for the World Wide Web. Suppose you type a web address, called a Uniform Resource Locator (URL, often pronounced like "Earl"), such as http://horstmann.com/index.html, into the address window of your browser and ask the browser to load the page.
An URL, or Uniform Resource Locator, is a pointer to an information resource (such as a web page or an image) on the World Wide Web.
The browser now takes the following steps:
1. It examines the part of the URL between the double slash and the first single slash ("horstmann.com"), which identifies the computer to which you want to connect. Because this part of the URL contains letters, it must be a domain name rather than an Internet address, so the browser sends a request to a DNS server to obtain the Internet address of the computer with domain name horstmann.com.
2. From the http: prefix of the URL, the browser deduces that the protocol you want to use is HTTP, which by default uses port 80.
3. It establishes a TCP/IP connection to port 80 at the Internet address it obtained in Step 1.
4. It deduces from the /index.html suffix that you want to see the file /index.html, so it sends a request, formatted as an HTTP command, through the connection that was established in Step 3. The request looks like this:
GET /index.html HTTP/1.1
Host: horstmann.com
blank line
(The host is needed because a web server can host multiple domains with the same Internet address.)
5. The web server running on the computer whose Internet address is the one the browser obtained in Step 1 receives the request and decodes it. It then fetches the file /index.html and sends it back to the browser on your computer.
6. The browser displays the contents of the file. Because it happens to be an HTML file, the browser translates the HTML tags into fonts, bullets, separator lines, and so on. If the HTML file contains images, then the browser makes more GET requests, one for each image, through the same connection, to fetch the image data. (Appendix F contains a summary of the most frequently used HTML tags.)
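The first steps of this process (picking the host, the port, and the item out of the URL, then formatting the request) can be sketched in a few lines of Java. This is only an illustration: the class RequestBuilder and its methods are invented names, and the sketch relies on the standard java.net.URI class to split the URL. It builds the request text but does not open a connection. The HTTP specification ends each line with a carriage return and a newline, so the sketch uses "\r\n"; many servers also accept a bare newline.

```java
import java.net.URI;

public class RequestBuilder
{
   private static final int DEFAULT_HTTP_PORT = 80;

   /** Returns the host part of the URL (the part between // and the next /). */
   public static String hostOf(String url)
   {
      return URI.create(url).getHost();
   }

   /** Returns the port given in the URL, or the HTTP default if none is given. */
   public static int portOf(String url)
   {
      int port = URI.create(url).getPort(); // -1 if the URL names no port
      return port == -1 ? DEFAULT_HTTP_PORT : port;
   }

   /** Formats a GET request for the item named in the URL. */
   public static String buildGetRequest(String url)
   {
      URI uri = URI.create(url);
      String item = uri.getPath();
      if (item == null || item.isEmpty())
      {
         item = "/"; // No item given: ask for the root page
      }
      return "GET " + item + " HTTP/1.1\r\n"
         + "Host: " + uri.getHost() + "\r\n"
         + "\r\n"; // The blank line ends the request
   }

   public static void main(String[] args)
   {
      String url = "http://horstmann.com/index.html";
      System.out.println("Connect to " + hostOf(url) + ", port " + portOf(url));
      System.out.print(buildGetRequest(url));
   }
}
```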
You can try the following experiment to see this process in action. The "Telnet" program enables a user to type characters for sending to a remote computer and view characters that the remote computer sends back. On Windows, you need to enable the Telnet program in the control panel. UNIX, Linux, and Mac OS X systems normally have Telnet preinstalled.
For this experiment, you want to start Telnet with a host of horstmann.com and port 80. To start the program from the command line, simply type
telnet horstmann.com 80
Table 21.1. HTTP Commands
Command | Meaning |
---|---|
GET | Return the requested item |
HEAD | Request only the header information of an item |
OPTIONS | Request communications options of an item |
POST | Supply input to a server-side command and return the result |
PUT | Store an item on the server |
DELETE | Delete an item on the server |
TRACE | Trace server communication |
Once the program starts, type very carefully, without making any typing errors and without hitting the backspace key,
GET / HTTP/1.1
Host: horstmann.com
Then hit the Enter key twice.
The first / denotes the root page of the web server. Note that there are spaces before and after the first /, but there are no spaces in HTTP/1.1.
On Windows, you will not see what you type, so you should be extra careful when typing in the commands.
The server now sends a response to the request—see Figure 2. The response, of course, consists of the root web page that you requested. The Telnet program is not a browser and does not understand HTML tags, so it simply displays the HTML file—text, tags, and all.
The Telnet program is a useful tool for establishing test connections with servers.
The GET command is one of the commands of HTTP. Table 1 shows the other commands of the protocol. As you can see, the protocol is pretty simple.
By the way, be sure not to confuse HTML with HTTP. HTML is a document format (with commands such as <h1> or <ul>) that describes the structure of a document, including headings, bulleted lists, images, hyperlinks, and so on. HTTP is a protocol (with commands such as GET and POST) that describes the command set for web server requests. Web browsers know how to display HTML documents and how to issue HTTP commands. Web servers know nothing about HTML. They merely understand HTTP and know how to fetch the requested items. Those items may be HTML documents, GIF or JPEG images, or any other data that a web browser can display.
The HTTP GET command requests information from a web server. The web server returns the requested item, which may be a web page, an image, or other data.
HTTP is just one of many application protocols in use on the Internet. Another commonly used protocol is the Post Office Protocol (POP), which is used to download received messages from e-mail servers. To send messages, you use yet another protocol called the Simple Mail Transfer Protocol (SMTP). We don't want to go into the details of these protocols, but Figure 3 gives you a flavor of the commands used by the Post Office Protocol.
Both HTTP and POP use plain text, which makes it particularly easy to test and debug client and server programs (see How To 21.1 on page 855).
Why is it important that you don't make typing errors when you type HTTP commands in Telnet?
In this section you will see how to write a Java program that establishes a TCP connection to a server, sends a request to the server, and prints the response.
In the terminology of TCP/IP, there is a socket on each side of the connection (see Figure 4). In Java, a client establishes a socket with a call
Socket s = new Socket(hostname, portnumber);
For example, to connect to the HTTP port of the server horstmann.com, you use
final int HTTP_PORT = 80;
Socket s = new Socket("horstmann.com", HTTP_PORT);
A socket is an object that encapsulates a TCP connection. To communicate with the other end point of the connection, use the input and output streams attached to the socket.
The socket constructor throws an UnknownHostException if it can't find the host.
Once you have a socket, you obtain its input and output streams:
InputStream instream = s.getInputStream();
OutputStream outstream = s.getOutputStream();
When you send data to outstream, the socket automatically forwards it to the server. The socket catches the server's response, and you can read the response through instream (see Figure 4).
When you are done communicating with the server, you should close the socket:
s.close();
When transmission over a socket is complete, remember to close the socket.
In Chapter 19, you saw that the InputStream and OutputStream classes are used for reading and writing bytes. If you want to communicate with the server by sending and receiving text, you should turn the streams into scanners and writers, as follows:
Scanner in = new Scanner(instream);
PrintWriter out = new PrintWriter(outstream);
For text protocols, turn the socket streams into scanners and writers.
A print writer buffers the characters that you send to it. That is, characters are not immediately sent to their destination. Instead, they are placed into an array. When the array is full, then the print writer sends all characters in the array to its destination. The advantage of buffering is increased performance—it takes some amount of time to contact the destination and send it data, and it is expensive to pay for that contact time for every character. However, when communicating with a server that responds to requests, you want to make sure that the server gets a complete request at a time. Therefore, you need to flush the buffer manually whenever you send a command:
out.print(command);
out.flush();
The flush method empties the buffer and forwards all waiting characters to the destination.
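You can observe the effect of buffering without a network connection. The following sketch is an illustration only: the class name FlushDemo is invented, and a ByteArrayOutputStream stands in for the socket's output stream so that the program can count how many bytes have actually arrived at the destination.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;

public class FlushDemo
{
   /**
      Prints a command to a buffered writer and reports how many bytes
      reached the destination before and after the call to flush.
   */
   public static int[] bytesBeforeAndAfterFlush(String command)
   {
      // Stand-in for the output stream of a socket
      ByteArrayOutputStream destination = new ByteArrayOutputStream();
      PrintWriter out = new PrintWriter(destination);

      out.print(command);
      int before = destination.size(); // Characters are still in the writer's buffer

      out.flush();
      int after = destination.size(); // Now they have been forwarded

      return new int[] { before, after };
   }

   public static void main(String[] args)
   {
      int[] counts = bytesBeforeAndAfterFlush("GET / HTTP/1.1\n");
      System.out.println("Bytes delivered before flush: " + counts[0]);
      System.out.println("Bytes delivered after flush: " + counts[1]);
   }
}
```

With a short command such as this one, nothing is delivered until flush is called. A server waiting for the command would wait indefinitely if the client forgot to flush.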
The WebGet program at the end of this section lets you retrieve any item from a web server. You need to specify the host and the item from the command line. For example,
java WebGet horstmann.com /
The / item denotes the root page of the web server that listens to port 80 of the host horstmann.com. Note that there is a space before the /.
Flush the writer attached to a socket at the end of every command. Then the command is sent to the server, even if the writer's buffer is not completely filled.
The program simply establishes a connection to the host, sends a GET command to the host, and then receives input from the server until the server closes its connection.
ch21/webget/WebGet.java
import java.io.InputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.net.Socket;
import java.util.Scanner;

/**
   This program demonstrates how to use a socket to communicate
   with a web server. Supply the name of the host and the
   resource on the command-line, for example,
   java WebGet horstmann.com /index.html
*/
public class WebGet
{
   public static void main(String[] args) throws IOException
   {
      // Get command-line arguments

      String host;
      String resource;

      if (args.length == 2)
      {
         host = args[0];
         resource = args[1];
      }
      else
      {
         System.out.println("Getting / from horstmann.com");
         host = "horstmann.com";
         resource = "/";
      }

      // Open socket

      final int HTTP_PORT = 80;
      Socket s = new Socket(host, HTTP_PORT);

      // Get streams

      InputStream instream = s.getInputStream();
      OutputStream outstream = s.getOutputStream();

      // Turn streams into scanners and writers

      Scanner in = new Scanner(instream);
      PrintWriter out = new PrintWriter(outstream);

      // Send command: request line, Host property, then a blank line

      String command = "GET " + resource + " HTTP/1.1\n"
         + "Host: " + host + "\n\n";
      out.print(command);
      out.flush();

      // Read server response

      while (in.hasNextLine())
      {
         String input = in.nextLine();
         System.out.println(input);
      }

      // Always close the socket at the end

      s.close();
   }
}
Getting / from horstmann.com
HTTP/1.1 200 OK
Date: Thu, 17 Sep 2009 14:15:04 GMT
Server: Apache/1.3.41 (Unix) Sun-ONE-ASP/4.0.2
...
Content-Length: 6654
Content-Type: text/html
<html>
<head><title>Cay Horstmann's Home Page</title></head>
<body>
<h1>Welcome to Cay Horstmann's Home Page</h1>
...
</body>
</html>
6. How do you open a socket to read e-mail from the POP server at e-mail.sjsu.edu?
Now that you have seen how to write a network client, we will turn to the server side. In this section we will develop a server program that enables clients to manage a set of bank accounts in a bank.
Whenever you develop a server application, you need to specify some application-level protocol that clients can use to interact with the server. For the purpose of this example, we will create a "Simple Bank Access Protocol". Table 2 shows the protocol format. Of course, this is just a toy protocol to show you how to implement a server.
The server program waits for clients to connect to a particular port. We choose port 8888 for this service. This number has not been preassigned to another service, so it is unlikely to be used by another server program. To listen to incoming connections, you use a server socket. To construct a server socket, you need to supply the port number.
Table 21.2. A Simple Bank Access Protocol
Client Request | Server Response | Description |
---|---|---|
BALANCE n | n and the balance | Get the balance of account n |
DEPOSIT n a | n and the new balance | Deposit amount a into account n |
WITHDRAW n a | n and the new balance | Withdraw amount a from account n |
QUIT | None | Quit the connection |
ServerSocket server = new ServerSocket(8888);
The accept method of the ServerSocket class waits for a client connection. When a client connects, then the server program obtains a socket through which it communicates with the client.
Socket s = server.accept();
BankService service = new BankService(s, bank);
The ServerSocket class is used by server applications to listen for client connections.
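The relationship between a server socket and the sockets at the two ends of a connection can be tried out on a single computer. The sketch below is an illustration under a few assumptions: the class name AcceptDemo is invented, the server listens on the loopback address only, and port 0 is passed so that the operating system picks any free port. It prints the port numbers seen from each side and checks that each side's local port is the other side's remote port.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class AcceptDemo
{
   /**
      Connects a client to a server socket on this computer and
      compares the port numbers seen from both ends.
   */
   public static boolean endpointsMirrorEachOther()
   {
      InetAddress loopback = InetAddress.getLoopbackAddress();

      // Port 0 asks the operating system for any free port
      try (ServerSocket server = new ServerSocket(0, 1, loopback);
           Socket client = new Socket(loopback, server.getLocalPort());
           Socket accepted = server.accept())
      {
         System.out.println("Client end: local port " + client.getLocalPort()
            + ", remote port " + client.getPort());
         System.out.println("Server end: local port " + accepted.getLocalPort()
            + ", remote port " + accepted.getPort());

         return client.getLocalPort() == accepted.getPort()
            && client.getPort() == accepted.getLocalPort();
      }
      catch (IOException exception)
      {
         throw new UncheckedIOException(exception);
      }
   }

   public static void main(String[] args)
   {
      System.out.println("Ends match: " + endpointsMirrorEachOther());
   }
}
```

Note that the server socket itself never carries data. It only hands out an ordinary Socket for each client that connects.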
The BankService class carries out the service. This class implements the Runnable interface, and its run method will be executed in each thread that serves a client connection. The run method gets a scanner and writer from the socket in the same way as we discussed in the preceding section. Then it executes the following method:
public void doService() throws IOException
{
   while (true)
   {
      if (!in.hasNext()) return;
      String command = in.next();
      if (command.equals("QUIT")) return;
      executeCommand(command);
   }
}
The executeCommand method processes a single command. If the command is DEPOSIT, then it carries out the deposit.
int account = in.nextInt();
double amount = in.nextDouble();
bank.deposit(account, amount);
The WITHDRAW command is handled in the same way. After each command, the account number and new balance are sent to the client:
out.println(account + " " + bank.getBalance(account));
The doService method returns to the run method if the client closed the connection or the command equals "QUIT". Then the run method closes the socket and exits.
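The command handling can be tried out without any networking, because a Scanner reads tokens from a string in the same way that it reads them from a socket stream. The following sketch is an illustration only: the class CommandDemo and its execute method are invented names, and a plain array of balances stands in for the Bank class.

```java
import java.util.Scanner;

public class CommandDemo
{
   /**
      Executes one command line of the Simple Bank Access Protocol
      against an array of balances and returns the server response.
   */
   public static String execute(String line, double[] balances)
   {
      Scanner in = new Scanner(line);
      String command = in.next();
      int account = in.nextInt();

      if (command.equals("DEPOSIT"))
      {
         balances[account] += in.nextDouble();
      }
      else if (command.equals("WITHDRAW"))
      {
         balances[account] -= in.nextDouble();
      }
      else if (!command.equals("BALANCE"))
      {
         return "Invalid command";
      }
      // Every valid command answers with the account number and the balance
      return account + " " + balances[account];
   }

   public static void main(String[] args)
   {
      double[] balances = new double[10];
      System.out.println(execute("DEPOSIT 3 1000", balances));
      System.out.println(execute("WITHDRAW 3 500", balances));
      System.out.println(execute("BALANCE 3", balances));
   }
}
```

Like the executeCommand method in this section, the sketch reads the account number before it checks the command name, so a misspelled command must still be followed by an account number.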
Let us go back to the point where the server socket accepts a connection and constructs the BankService object. At this point, we could simply call the run method.
But then our server program would have a serious limitation: only one client could connect to it at any point in time. To overcome that limitation, server programs spawn a new thread whenever a client connects. Each thread is responsible for serving one client.
Our BankService class implements the Runnable interface. Therefore, the server program simply starts a thread with the following instructions:
Thread t = new Thread(service);
t.start();
The thread dies when the client quits or disconnects and the run method exits. In the meantime, the BankServer loops back to accept the next connection.
while (true)
{
   Socket s = server.accept();
   BankService service = new BankService(s, bank);
   Thread t = new Thread(service);
   t.start();
}
The server program never stops. When you are done running the server, you need to kill it. For example, if you started the server in a shell window, hit Ctrl+C.
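To see why one thread per client matters, consider the following self-contained sketch. It is not the bank server: the class ThreadedEchoDemo and its methods are invented for this illustration, the service simply echoes each line, and the server runs on the loopback address with a port chosen by the operating system. Two clients connect at the same time, and the second client is served before the first one has sent anything. With a server that served one client at a time, the second client would have to wait until the first one disconnected.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Scanner;

public class ThreadedEchoDemo
{
   /** Serves one client: echoes each line until the client disconnects. */
   private static void serve(Socket s)
   {
      try (Socket socket = s)
      {
         Scanner in = new Scanner(socket.getInputStream());
         PrintWriter out = new PrintWriter(socket.getOutputStream());
         while (in.hasNextLine())
         {
            out.println("ECHO " + in.nextLine());
            out.flush();
         }
      }
      catch (IOException exception)
      {
         exception.printStackTrace();
      }
   }

   /** Sends one line over a connected socket and returns the reply line. */
   private static String ask(Socket s, String message) throws IOException
   {
      PrintWriter out = new PrintWriter(s.getOutputStream());
      out.println(message);
      out.flush();
      return new Scanner(s.getInputStream()).nextLine();
   }

   /** Runs a threaded server and two simultaneous clients; returns both replies. */
   public static String[] run()
   {
      InetAddress loopback = InetAddress.getLoopbackAddress();
      try (ServerSocket server = new ServerSocket(0, 50, loopback))
      {
         Thread acceptor = new Thread(() ->
         {
            try
            {
               while (true)
               {
                  Socket s = server.accept();
                  new Thread(() -> serve(s)).start(); // One thread per client
               }
            }
            catch (IOException exception)
            {
               // The server socket was closed; stop accepting
            }
         });
         acceptor.start();

         try (Socket first = new Socket(loopback, server.getLocalPort());
              Socket second = new Socket(loopback, server.getLocalPort()))
         {
            // Both clients are connected; talk to the second one first
            String secondReply = ask(second, "second");
            String firstReply = ask(first, "first");
            return new String[] { firstReply, secondReply };
         }
      }
      catch (IOException exception)
      {
         throw new UncheckedIOException(exception);
      }
   }

   public static void main(String[] args)
   {
      for (String reply : run())
      {
         System.out.println(reply);
      }
   }
}
```

Unlike the bank server, this demonstration closes its server socket when the clients are done, which ends the accept loop, so the program terminates on its own.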
To try out the program, run the server. Then use Telnet to connect to localhost, port number 8888. Start typing commands. Here is a typical dialog (see Figure 5):
DEPOSIT 3 1000
3 1000.0
WITHDRAW 3 500
3 500.0
QUIT
Alternatively, you can use a client program that connects to the server. You will find a sample client program at the end of this section.
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

/**
   A server that executes the Simple Bank Access Protocol.
*/
public class BankServer
{
   public static void main(String[] args) throws IOException
   {
      final int ACCOUNTS_LENGTH = 10;
      Bank bank = new Bank(ACCOUNTS_LENGTH);
      final int SBAP_PORT = 8888;
      ServerSocket server = new ServerSocket(SBAP_PORT);
      System.out.println("Waiting for clients to connect ... ");

      while (true)
      {
         Socket s = server.accept();
         System.out.println("Client connected.");
         BankService service = new BankService(s, bank);
         Thread t = new Thread(service);
         t.start();
      }
   }
}
ch21/bank/BankService.java
import java.io.InputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.net.Socket;
import java.util.Scanner;

/**
   Executes Simple Bank Access Protocol commands
   from a socket.
*/
public class BankService implements Runnable
{
   private Socket s;
   private Scanner in;
   private PrintWriter out;
   private Bank bank;

   /**
      Constructs a service object that processes commands
      from a socket for a bank.
      @param aSocket the socket
      @param aBank the bank
   */
   public BankService(Socket aSocket, Bank aBank)
   {
      s = aSocket;
      bank = aBank;
   }

   public void run()
   {
      try
      {
         try
         {
            in = new Scanner(s.getInputStream());
            out = new PrintWriter(s.getOutputStream());
            doService();
         }
         finally
         {
            s.close();
         }
      }
      catch (IOException exception)
      {
         exception.printStackTrace();
      }
   }

   /**
      Executes all commands until the QUIT command or the
      end of input.
   */
   public void doService() throws IOException
   {
      while (true)
      {
         if (!in.hasNext()) return;
         String command = in.next();
         if (command.equals("QUIT")) return;
         else executeCommand(command);
      }
   }

   /**
      Executes a single command.
      @param command the command to execute
   */
   public void executeCommand(String command)
   {
      int account = in.nextInt();
      if (command.equals("DEPOSIT"))
      {
         double amount = in.nextDouble();
         bank.deposit(account, amount);
      }
      else if (command.equals("WITHDRAW"))
      {
         double amount = in.nextDouble();
         bank.withdraw(account, amount);
      }
      else if (!command.equals("BALANCE"))
      {
         out.println("Invalid command");
         out.flush();
         return;
      }
      out.println(account + " " + bank.getBalance(account));
      out.flush();
   }
}
/**
   A bank consisting of multiple bank accounts.
*/
public class Bank
{
   private BankAccount[] accounts;

   /**
      Constructs a bank with a given number of accounts.
      @param size the number of accounts
   */
   public Bank(int size)
   {
      accounts = new BankAccount[size];
      for (int i = 0; i < accounts.length; i++)
         accounts[i] = new BankAccount();
   }

   /**
      Deposits money into a bank account.
      @param accountNumber the account number
      @param amount the amount to deposit
   */
   public void deposit(int accountNumber, double amount)
   {
      BankAccount account = accounts[accountNumber];
      account.deposit(amount);
   }

   /**
      Withdraws money from a bank account.
      @param accountNumber the account number
      @param amount the amount to withdraw
   */
   public void withdraw(int accountNumber, double amount)
   {
      BankAccount account = accounts[accountNumber];
      account.withdraw(amount);
   }

   /**
      Gets the balance of a bank account.
      @param accountNumber the account number
      @return the account balance
   */
   public double getBalance(int accountNumber)
   {
      BankAccount account = accounts[accountNumber];
      return account.getBalance();
   }
}
import java.io.InputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.net.Socket;
import java.util.Scanner;

/**
   This program tests the bank server.
*/
public class BankClient
{
   public static void main(String[] args) throws IOException
   {
      final int SBAP_PORT = 8888;
      Socket s = new Socket("localhost", SBAP_PORT);
      InputStream instream = s.getInputStream();
      OutputStream outstream = s.getOutputStream();
      Scanner in = new Scanner(instream);
      PrintWriter out = new PrintWriter(outstream);

      // Each command ends with a newline
      String command = "DEPOSIT 3 1000\n";
      System.out.print("Sending: " + command);
      out.print(command);
      out.flush();
      String response = in.nextLine();
      System.out.println("Receiving: " + response);

      command = "WITHDRAW 3 500\n";
      System.out.print("Sending: " + command);
      out.print(command);
      out.flush();
      response = in.nextLine();
      System.out.println("Receiving: " + response);

      command = "QUIT\n";
      System.out.print("Sending: " + command);
      out.print(command);
      out.flush();

      s.close();
   }
}
Sending: DEPOSIT 3 1000
Receiving: 3 1000.0
Sending: WITHDRAW 3 500
Receiving: 3 500.0
Sending: QUIT
8. Can you read data from a server socket?
In Section 21.3, you saw how to use sockets to connect to a web server and how to retrieve information from the server by sending HTTP commands. However, because HTTP is such an important protocol, the Java library contains an URLConnection class, which provides convenient support for HTTP. The URLConnection class takes care of the socket connection, so you don't have to fuss with sockets when you want to retrieve data from a web server. As an additional benefit, the URLConnection class can also handle FTP, the file transfer protocol.
The URLConnection class makes it very easy to fetch a file from a web server given the file's URL as a string. First, you construct an URL object from the URL in the familiar format, starting with the http or ftp prefix. Then you use the URL object's openConnection() method to get the URLConnection object itself.
URL u = new URL("http://horstmann.com/index.html");
URLConnection connection = u.openConnection();
The URLConnection class makes it easy to communicate with a web server without having to issue HTTP commands.
Then you call the getInputStream method to obtain an input stream:
InputStream instream = connection.getInputStream();
You can turn the stream into a scanner in the usual way, and read input from the scanner.
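Here is a minimal sketch of these steps. The class name URLReadDemo and its methods are invented for this illustration. So that the sketch also runs without a network connection, it reads a temporary local file through a file: URL when no argument is given; that is an assumption of the demo, not something this section covers. The same readAll method works unchanged with an http URL supplied on the command line.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;

public class URLReadDemo
{
   /** Reads all lines that the URL connection delivers. */
   public static String readAll(URL u) throws IOException
   {
      URLConnection connection = u.openConnection();
      StringBuilder result = new StringBuilder();
      try (InputStream instream = connection.getInputStream())
      {
         Scanner in = new Scanner(instream);
         while (in.hasNextLine())
         {
            result.append(in.nextLine()).append("\n");
         }
      }
      return result.toString();
   }

   /** Writes a one-line sample file and reads it back through an URL connection. */
   public static String readLocalSample()
   {
      try
      {
         Path temp = Files.createTempFile("sample", ".html");
         try
         {
            Files.write(temp, "<h1>Hello</h1>\n".getBytes(StandardCharsets.US_ASCII));
            return readAll(temp.toUri().toURL());
         }
         finally
         {
            Files.delete(temp);
         }
      }
      catch (IOException exception)
      {
         throw new UncheckedIOException(exception);
      }
   }

   public static void main(String[] args) throws IOException
   {
      if (args.length == 1)
      {
         System.out.print(readAll(URI.create(args[0]).toURL()));
      }
      else
      {
         System.out.print(readLocalSample());
      }
   }
}
```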
The URLConnection class can give you additional useful information. To understand those capabilities, we need to have a closer look at HTTP requests and responses. You saw in Section 21.2 that the command for getting an item from the server is
GET item HTTP/1.1
Host: hostname
blank line
You may have wondered why you need to provide a blank line. This blank line is a part of the general request format. The first line of the request is a command, such as GET or POST. The command is followed by request properties (such as Host:). Some commands (in particular, the POST command) send input data to the server. The reason for the blank line is to denote the boundary between the request property section and the input data section.
The URLConnection and HttpURLConnection classes can give you additional information about HTTP requests and responses.
A typical request property is If-Modified-Since. If you request an item with
GET item HTTP/1.1
Host: hostname
If-Modified-Since: date
blank line
the server sends the item only if it is newer than the date. Browsers use this feature to speed up redisplay of previously loaded web pages. When a web page is loaded, the browser stores it in a cache directory. When the user wants to see the same web page again, the browser asks the server to get a new page only if it has been modified since the date of the cached copy. If it hasn't been, the browser simply redisplays the cached copy and doesn't spend time downloading another identical copy.
The URLConnection class has methods to set request properties. For example, you can set the If-Modified-Since property with the setIfModifiedSince method:
connection.setIfModifiedSince(date);
You need to set request properties before calling the getInputStream method. The URLConnection class then sends to the web server all the request properties that you set.
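In the Java library, the setIfModifiedSince method takes the date as a long value: the number of milliseconds since January 1, 1970, 00:00 GMT. The following sketch shows how such a value might be prepared. The class name ConditionalGetSetup and the method prepare are invented for this illustration, and the date is an arbitrary example. The sketch only sets the property on an unopened connection, so it does not contact the server.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;
import java.time.Instant;

public class ConditionalGetSetup
{
   /**
      Creates a connection object for the URL and sets its
      If-Modified-Since property. The connection is not opened yet.
   */
   public static URLConnection prepare(String url, long sinceMillis)
   {
      try
      {
         URLConnection connection = URI.create(url).toURL().openConnection();
         // Must be called before getInputStream, which opens the connection
         connection.setIfModifiedSince(sinceMillis);
         return connection;
      }
      catch (IOException exception)
      {
         throw new UncheckedIOException(exception);
      }
   }

   public static void main(String[] args)
   {
      // An example date, converted to milliseconds since January 1, 1970 GMT
      long since = Instant.parse("2010-06-26T20:53:38Z").toEpochMilli();
      URLConnection connection = prepare("http://horstmann.com/index.html", since);
      System.out.println("If-Modified-Since set to "
         + Instant.ofEpochMilli(connection.getIfModifiedSince()));
   }
}
```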
Similarly, the response from the server starts with a status line followed by a set of response parameters. The response parameters are terminated by a blank line and followed by the requested data (for example, an HTML page). Here is a typical response:
HTTP/1.1 200 OK
Date: Tue, 24 Aug 2010 00:15:48 GMT
Server: Apache/1.3.3 (Unix)
Last-Modified: Sat, 26 Jun 2010 20:53:38 GMT
Content-Length: 4813
Content-Type: text/html
blank line
requested data
Normally, you don't see the response code. However, you may have run across bad links and seen a page that contained a response code 404 Not Found. (A successful response has status 200 OK.)
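The status line has a simple shape: the protocol version, the numeric response code, and the response message, separated by spaces. As an illustration of that format, the following sketch splits a status line by hand. The class StatusLineDemo and its methods are invented names; this is not how the library classes are implemented.

```java
public class StatusLineDemo
{
   /** Returns the numeric response code of a status line such as "HTTP/1.1 200 OK". */
   public static int code(String statusLine)
   {
      // Split into at most three parts: version, code, message
      String[] parts = statusLine.split(" ", 3);
      return Integer.parseInt(parts[1]);
   }

   /** Returns the response message, which may itself contain spaces. */
   public static String message(String statusLine)
   {
      String[] parts = statusLine.split(" ", 3);
      return parts.length == 3 ? parts[2] : "";
   }

   public static void main(String[] args)
   {
      String statusLine = "HTTP/1.1 404 Not Found";
      System.out.println(code(statusLine));
      System.out.println(message(statusLine));
   }
}
```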
To retrieve the response code, you need to cast the URLConnection
object to the HttpURLConnection
subclass. You can retrieve the response code (such as the number 200 in this example, or the code 404 if a page was not found) and response message with the getResponseCode
and getResponseMessage
methods:
HttpURLConnection httpConnection = (HttpURLConnection) connection; int code = httpConnection.getResponseCode(); // e.g., 404 String message = httpConnection.getResponseMessage(); // e.g., "Not found"
As you can see from the response example, the server sends some information about the requested data, such as the content length and the content type. You can request this information with methods from the URLConnection class:
int length = connection.getContentLength();
String type = connection.getContentType();
You need to call these methods after calling the getInputStream method.
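Here is a sketch of these two calls in context. To stay independent of any real web site, it serves a tiny page from a local test server built with com.sun.net.httpserver.HttpServer (a JDK class outside the scope of this chapter, used here purely as an assumption for the demonstration). The class name ContentInfoSketch, the method names, and the sample page are hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class ContentInfoSketch
{
   /** Starts a local test server that answers every request with the given page. */
   public static HttpServer startDemoServer(String page) throws IOException
   {
      byte[] bytes = page.getBytes(StandardCharsets.UTF_8);
      HttpServer server = HttpServer.create(
            new InetSocketAddress("127.0.0.1", 0), 0); // Port 0: any free port
      server.createContext("/", exchange ->
      {
         exchange.getResponseHeaders().set("Content-Type", "text/html");
         exchange.sendResponseHeaders(200, bytes.length); // Sets Content-Length
         exchange.getResponseBody().write(bytes);
         exchange.close();
      });
      server.start();
      return server;
   }

   /** Returns the content type and length that the server reports. */
   public static String describe(String urlString) throws IOException
   {
      URLConnection connection = new URL(urlString).openConnection();
      try (InputStream instream = connection.getInputStream())
      {
         int length = connection.getContentLength();
         String type = connection.getContentType();
         return type + ", " + length + " bytes";
      }
   }

   public static void main(String[] args) throws IOException
   {
      HttpServer server = startDemoServer("<html></html>");
      try
      {
         String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
         System.out.println(describe(url));
      }
      finally
      {
         server.stop(0);
      }
   }
}
```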
To summarize: You don't need to use sockets to communicate with a web server, and you need not master the details of the HTTP protocol. Simply use the URLConnection and HttpURLConnection classes to obtain data from a web server, to set request parameters, or to obtain response information.
The program at the end of this section puts the URLConnection class to work. The program fulfills the same purpose as the one in Section 21.3 (retrieving a web page from a server), but it works at a higher level of abstraction. There is no longer a need to issue an explicit GET command; the URLConnection class takes care of that. Similarly, the parsing of the HTTP request and response headers is handled transparently to the programmer. Our sample program takes advantage of that fact. It checks whether the server response code is 200. If not, it exits. You can try that out by testing the program with a bad URL, like http://horstmann.com/wombat.html. Then the program prints a server response, such as 404 Not Found.
This program completes our introduction to Internet programming with Java. You have seen how to use sockets to connect client and server programs. You also saw how to use the higher-level URLConnection class to obtain information from web servers.
ch21/urlget/URLGet.java
import java.io.InputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.util.Scanner;

/**
   This program demonstrates how to use an URL connection
   to communicate with a web server. Supply the URL on
   the command-line, for example
   java URLGet http://horstmann.com/index.html.
*/
public class URLGet
{
   public static void main(String[] args) throws IOException
   {
      // Get command-line arguments

      String urlString;
      if (args.length == 1)
         urlString = args[0];
      else
      {
         urlString = "http://horstmann.com/";
         System.out.println("Using " + urlString);
      }

      // Open connection

      URL u = new URL(urlString);
      URLConnection connection = u.openConnection();

      // Check if response code is HTTP_OK (200)

      HttpURLConnection httpConnection
            = (HttpURLConnection) connection;
      int code = httpConnection.getResponseCode();
      String message = httpConnection.getResponseMessage();
      System.out.println(code + " " + message);
      if (code != HttpURLConnection.HTTP_OK)
         return;

      // Read server response

      InputStream instream = connection.getInputStream();
      Scanner in = new Scanner(instream);

      while (in.hasNextLine())
      {
         String input = in.nextLine();
         System.out.println(input);
      }
   }
}
Using http://horstmann.com/
200 OK
<html>
<head><title>Cay Horstmann's Home Page</title></head>
<body>
<h1>Welcome to Cay Horstmann's Home Page</h1>
...
</body>
</html>
What happens if you use the URLGet program to request an image (such as http://horstmann.com/cay-tiny.gif)?
Describe the IP and TCP protocols.
The Internet is a worldwide collection of networks, routing equipment, and computers using a common set of protocols to define how each party will interact with each other.
TCP/IP is the abbreviation for Transmission Control Protocol over Internet Protocol, the pair of communication protocols used to establish reliable transmission of data between two computers on the Internet.
A TCP connection requires the Internet addresses and port numbers of both end points.
Describe the HTTP protocol.
HTTP, or Hypertext Transfer Protocol, is the protocol that defines communication between web browsers and web servers.
An URL, or Uniform Resource Locator, is a pointer to an information resource (such as a web page or an image) on the World Wide Web.
The Telnet program is a useful tool for establishing test connections with servers.
The HTTP GET command requests information from a web server. The web server returns the requested item, which may be a web page, an image, or other data.
Implement programs that use network sockets for reading data.
A socket is an object that encapsulates a TCP connection. To communicate with the other end point of the connection, use the input and output streams attached to the socket.
When transmission over a socket is complete, remember to close the socket.
For text protocols, turn the socket streams into scanners and writers.
Flush the writer attached to a socket at the end of every command. Then the command is sent to the server, even if the writer's buffer is not completely filled.
Implement programs that serve data over a network.
The ServerSocket class is used by server applications to listen for client connections.
Use the URLConnection class to read data from a web server.
The URLConnection class makes it easy to communicate with a web server without having to issue HTTP commands.
The URLConnection and HttpURLConnection classes can give you additional information about HTTP requests and responses.
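As a recap of the socket-related points in this summary, the following sketch runs a one-shot server and a client in the same program. The server listens with a ServerSocket, both sides turn the socket streams into a scanner and a writer, each side flushes the writer after its command, and the sockets are closed when the exchange is complete. The class SocketRecapSketch, its method names, and the toy "protocol" (the server replies with the command in upper case) are assumptions invented for this illustration.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Scanner;

public class SocketRecapSketch
{
   /** Server side: accepts one client and answers one line in upper case. */
   public static void serveOnce(ServerSocket server) throws IOException
   {
      try (Socket s = server.accept()) // Closes the socket when done
      {
         Scanner in = new Scanner(s.getInputStream());
         PrintWriter out = new PrintWriter(s.getOutputStream());
         out.println(in.nextLine().toUpperCase());
         out.flush();
      }
   }

   /** Client side: sends one command and returns the server's reply. */
   public static String sendCommand(int port, String command) throws IOException
   {
      try (Socket s = new Socket("127.0.0.1", port))
      {
         Scanner in = new Scanner(s.getInputStream());
         PrintWriter out = new PrintWriter(s.getOutputStream());
         out.println(command);
         out.flush(); // Without this, the command may stay in the buffer
         return in.nextLine();
      }
   }

   /** Runs the server in a background thread and performs one exchange. */
   public static String roundTrip(String command) throws IOException
   {
      try (ServerSocket server = new ServerSocket(0)) // 0: any free port
      {
         Thread t = new Thread(() ->
         {
            try { serveOnce(server); }
            catch (IOException e) { e.printStackTrace(); }
         });
         t.start();
         return sendCommand(server.getLocalPort(), command);
      }
   }

   public static void main(String[] args) throws IOException
   {
      System.out.println(roundTrip("hello")); // prints HELLO
   }
}
```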
R21.1 What is a server? What is a client? How many clients can connect to a server at one time?
R21.2 What is a socket? What is the difference between a Socket object and a ServerSocket object?
R21.3 Under what circumstances would an UnknownHostException be thrown?
R21.4 What happens if the Socket constructor's second parameter is not the same as the port number at which the server waits for connections?
R21.5 When a socket is created, which Internet address is used?
The address of the computer to which you want to connect
The address of your computer
The address of your ISP
R21.6 What is the purpose of the accept method of the ServerSocket class?
R21.7 After a socket establishes a connection, what mechanism will your client program use to read data from the server computer?
The Socket will fill a buffer with bytes.
You will use a Reader obtained from the Socket.
You will use an InputStream obtained from the Socket.
R21.8 Why is it not common to work directly with the InputStream and OutputStream objects obtained from a Socket object?
R21.9 When a client program communicates with a server, it sometimes needs to flush the output stream. Explain why.
R21.10 What is the difference between HTTP and HTML?
R21.11 How can you communicate with a web server without using sockets?
R21.12 What is the difference between an URL instance and an URLConnection instance?
R21.13 What is an URL? How do you create an object of class URL? How do you connect to an URL?
P21.1 Modify the WebGet program to print only the HTTP header of the returned HTML page. The HTTP header is the beginning of the response data. It consists of several lines, such as
HTTP/1.1 200 OK
Date: Tue, 15 Jun 2010 16:10:34 GMT
Server: Apache/1.3.19 (Unix)
Cache-Control: max-age=86400
Expires: Wed, 16 Jun 2010 16:10:34 GMT
Connection: close
Content-Type: text/html
P21.2 Modify the WebGet program to print only the title of the returned HTML page. An HTML page has the structure
<html><head><title> ... </title></head><body> ... </body></html>
For example, if you run the program by typing at the command line java WebGet horstmann.com/, the output should be the title of the root web page at horstmann.com, such as Cay Horstmann's Home Page.
P21.3 Modify the BankServer program so that it can be terminated more elegantly. Provide another socket on port 8889 through which an administrator can log in. Support the commands LOGIN password, STATUS, PASSWORD newPassword, LOGOUT, and SHUTDOWN. The STATUS command should display the total number of clients that have logged in since the server started.
P21.4 Modify the BankServer program to provide complete error checking. For example, the program should check that there is enough money in the account when withdrawing. Send appropriate error reports back to the client. Enhance the protocol to be similar to HTTP, in which each server response starts with a number indicating the success or failure condition, followed by a string with response data or an error description.
P21.5 Write a client application that executes an infinite loop that does the following: (a) prompts the user for a number, (b) sends that value to the server, (c) receives a number from the server, and (d) displays the new number. Also write a server that executes an infinite loop whose body accepts a client connection, reads a number from the client, computes its square root, and writes the result to the client.
P21.6 Implement a client-server program in which the client will print the date and time given by the server. Two classes should be implemented: DateClient and DateServer. The DateServer simply prints new Date().toString() whenever it accepts a connection and then closes the socket.
P21.7 Write a program to display the protocol, host, port, and file components of an URL. Hint: Look at the API documentation of the URL class.
P21.8 Write a simple web server that recognizes only the GET request (without the Host: request parameter and blank line). When a client connects to your server and sends a command, such as GET filename HTTP/1.1, then return a header
HTTP/1.1 200 OK
followed by a blank line and all lines in the file. If the file doesn't exist, return 404 Not Found instead.
Your server should listen to port 8080. Test your web server by starting up your web browser and loading a page, such as localhost:8080/c:cs1myfile.html.
P21.9 Write a chat server and client program. The chat server accepts connections from clients. Whenever one of the clients sends a chat message, it is displayed for all other clients to see. Use a protocol with three commands: LOGIN name, CHAT message, and LOGOUT.
P21.10 A query such as
http://aa.usno.navy.mil/cgi-bin/aa_moonphases.pl?year=2011
returns a page containing the moon phases in a given year. Write a program that asks the user for a year, month, and day and then prints the phase of the moon on that day.
Project 21.1 Write a program that allows several people to play a networked game. Each player connects to a game server. Each player's move is transmitted to the game server. The game server checks that the move is valid and informs all client programs of the updated game status.
You can either implement your favorite multiplayer game, or simply use Poker (see http://www.rgpfaq.com/basic-rules.html for the rules). Extra credit if your code is structured to separate the generic mechanism that is required for all games and the specific rules of a particular game.
Project 21.2 Write a program that allows a user to query the CIA World Fact Book (http://www.cia.gov/cia/publications/factbook) for facts about a country, such as the size, average income, capital city, and so on. To get the answers for user queries, connect to the web site, retrieve the web page, and extract the requested information. You will find that task simpler if you access the text version of the fact book.
An IP address is a numerical address, consisting of four or sixteen bytes. A domain name is an alphanumeric string that is associated with an IP address.
TCP is reliable but somewhat slow. When sending sounds or images in real time, it is acceptable if a small amount of the data is lost. But there is no point in transmitting data that is late.
The browser software translates your requests (typed URLs and mouse clicks on links) into HTTP commands that it sends to the appropriate web servers.
Some Telnet implementations send all keystrokes that you type to the server, including the backspace key. The server does not recognize a character sequence such as G W Backspace E T as a valid command.
The program makes a connection to the server, sends the GET request, and prints the error message that the server returns.
Socket s = new Socket("e-mail.sjsu.edu", 110);
Port 80 is the standard port for HTTP. If a web server is running on the same computer, then one can't open a server socket on an open port.
No, a server socket just waits for a connection and yields a regular Socket object when a client has connected. You use that socket object to read the data that the client sends.
The URLConnection class understands the HTTP protocol, freeing you from assembling requests and analyzing response headers.
The bytes that encode the images are displayed on the console, but they will appear to be random gibberish.