HTTP headers

Requests and responses are made up of two main parts: headers and a body. We briefly saw some HTTP headers when we used our TCP RFC downloader in Chapter 1, Network Programming and Python. Headers are the lines of protocol-specific information that appear at the beginning of the raw message that is sent over the TCP connection. The body is the rest of the message, and it is separated from the headers by a blank line. The body is optional; its presence depends on the type of request or response. Here's an example of an HTTP request:

GET / HTTP/1.1
Accept-Encoding: identity
Host: www.debian.com
Connection: close
User-Agent: Python-urllib/3.4

The first line is called the request line. It consists of the request method, which is GET in this case; the path to the resource, which is / here; and the HTTP version, 1.1. The remaining lines are request headers. Each consists of a header name followed by a colon and a header value. The request in the preceding output contains only headers; it does not have a body.
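To make this structure concrete, here is a minimal sketch that splits a raw request into its request line, headers, and body. The parse_request() function is a hypothetical helper written for this illustration, not part of urllib, and it assumes that lines end with the CRLF sequence (\r\n) used on the wire and that each header fits on a single line:

```python
def parse_request(raw):
    """Split a raw HTTP request into request line parts, headers, and body."""
    # The first blank line separates the headers from the (optional) body.
    head, _, body = raw.partition('\r\n\r\n')
    lines = head.split('\r\n')
    # Request line: method, path, and HTTP version, separated by spaces.
    method, path, version = lines[0].split(' ', 2)
    headers = {}
    for line in lines[1:]:
        # Each header is a name, a colon, and a value.
        name, _, value = line.partition(':')
        headers[name.strip()] = value.strip()
    return method, path, version, headers, body

raw_request = (
    'GET / HTTP/1.1\r\n'
    'Accept-Encoding: identity\r\n'
    'Host: www.debian.com\r\n'
    'Connection: close\r\n'
    'User-Agent: Python-urllib/3.4\r\n'
    '\r\n'
)
method, path, version, headers, body = parse_request(raw_request)
print(method, path, version)
print(headers['Host'])
```

The body comes back as an empty string here, because nothing follows the blank line.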

Headers are used for several purposes. In a request they can be used for passing extra data, such as cookies and authorization credentials, and for asking the server for preferred formats of resources.

An important example is the Host header. Many web server applications provide the ability to host more than one website on the same server using the same IP address. DNS aliases are set up for the various website domain names, so they all point to the same IP address. Effectively, the web server is given multiple hostnames, one for each website it hosts. IP and TCP (which HTTP runs on) can't be used to tell the server which hostname the client wants to connect to, because both of them operate solely on IP addresses. The HTTP protocol allows the client to supply the hostname in the HTTP request by including a Host header.
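As a rough sketch of the idea, the following builds the raw bytes of two requests that could be sent to the same IP address but are meant for different websites. The build_get_request() helper and the two example hostnames are assumptions made for this illustration. Nothing is sent over the network:

```python
def build_get_request(hostname, path='/'):
    """Return the raw bytes of a minimal GET request for the given hostname."""
    lines = [
        'GET {} HTTP/1.1'.format(path),
        'Host: {}'.format(hostname),  # tells the server which site we want
        'Connection: close',
        '',  # the blank line that ends the headers
        '',
    ]
    return '\r\n'.join(lines).encode('ascii')

# Hypothetical hostnames that we assume are aliases for one IP address.
first = build_get_request('www.example.com')
second = build_get_request('blog.example.com')
print(first.decode('ascii'))
print(second.decode('ascii'))
```

The two requests differ only in the Host line, which is the only information the server has for deciding which website should answer.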

We'll look at some more request headers in the following section.

Here's an example of a response:

HTTP/1.1 200 OK
Date: Sun, 07 Sep 2014 19:58:48 GMT
Content-Type: text/html
Content-Length: 4729
Server: Apache
Content-Language: en

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
...

The first line contains the protocol version, the status code, and the status message. Subsequent lines contain the headers, a blank line, and then the body. In the response, the server can use headers to inform the client about things such as the length of the body, the type of content the response body contains, and the cookie data that the client should store.
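The response layout can be sketched in the same spirit. The parse_response() helper below is hypothetical, and the sample response is a shortened stand-in invented for this example rather than real server output. It assumes CRLF line endings and shows how the Content-Length header describes the size of the body:

```python
def parse_response(raw):
    """Split raw response bytes into status line parts, headers, and body."""
    # The first blank line separates the headers from the body.
    head, _, body = raw.partition(b'\r\n\r\n')
    lines = head.decode('iso-8859-1').split('\r\n')
    # Status line: protocol version, status code, and status message.
    version, status, reason = lines[0].split(' ', 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        headers[name.strip()] = value.strip()
    return version, int(status), reason, headers, body

# A made-up response with a five-byte body.
raw_response = (
    b'HTTP/1.1 200 OK\r\n'
    b'Content-Type: text/html\r\n'
    b'Content-Length: 5\r\n'
    b'\r\n'
    b'hello'
)
version, status, reason, headers, body = parse_response(raw_response)
print(status, reason)
print(int(headers['Content-Length']) == len(body))
```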

Do the following to view a response object's headers:

>>> from urllib.request import urlopen
>>> response = urlopen('http://www.debian.org')
>>> response.getheaders()
[('Date', 'Sun, 07 Sep 2014 19:58:48 GMT'), ('Server', 'Apache'), ('Content-Location', 'index.en.html'), ('Vary', 'negotiate,accept-language,Accept-Encoding')...

The getheaders() method returns the headers as a list of tuples of the form (header name, header value). A complete list of HTTP 1.1 headers and their meanings can be found in RFC 7231. Let's look at how to use some headers in requests and responses.
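As a quick illustration of working with that list of tuples, here is a minimal sketch of looking up a single header by name. HTTP header names are case-insensitive, so the comparison ignores case. The find_header() function is a hypothetical helper, and the sample list reuses a few of the values shown above instead of making a live request:

```python
def find_header(headers, name):
    """Return the value of the first header matching name, or None."""
    wanted = name.lower()
    for header_name, value in headers:
        # Header names are case-insensitive, so compare in lower case.
        if header_name.lower() == wanted:
            return value
    return None

# A few (header name, header value) tuples in the form getheaders() returns.
sample_headers = [
    ('Date', 'Sun, 07 Sep 2014 19:58:48 GMT'),
    ('Server', 'Apache'),
    ('Content-Location', 'index.en.html'),
]
print(find_header(sample_headers, 'server'))
print(find_header(sample_headers, 'Content-Length'))
```

For a live response, the object's own getheader() method performs a similar single-header lookup.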
