The HTTP protocol

The Hypertext Transfer Protocol (HTTP) is built on top of TCP. When you type a URL in a browser, the browser opens a TCP channel to the server (after a DNS lookup, of course) and sends an HTTP request to the web server. The server, after receiving the request, produces a response and sends it to the client. After that, the TCP channel may be closed or kept alive for further HTTP request-response pairs.
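The exchange described above can be sketched in a few lines of Python. This is a minimal illustration, not a full client: the function names build_get_request and fetch are made up for this example, it speaks plain HTTP on port 80 (the sample later in this section was sent over HTTPS, which would additionally need TLS), and it asks the server to close the channel with Connection: close instead of keeping it alive.

```python
import socket


def build_get_request(host: str, path: str) -> bytes:
    """Serialize a minimal HTTP/1.1 GET request without a body."""
    lines = [
        f"GET {path} HTTP/1.1",   # method, requested object, protocol version
        f"Host: {host}",          # HTTP/1.1 requests must name the host
        "Connection: close",      # ask the server to close the channel afterwards
        "",                       # the two empty strings produce the empty
        "",                       # line that ends the header
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str, port: int = 80) -> bytes:
    """Open a TCP channel, send the request, and read until the server closes."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:          # empty read: the server closed the channel
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch("example.com", "/") would return the raw bytes of the response, header and body together; the host name here is only a placeholder.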

Both the request and the response contain a header and an optional (possibly zero-length) body. The header is in text format, and it is separated from the body by an empty line.

More precisely, the header and the body are separated by four bytes: 0x0D, 0x0A, 0x0D, and 0x0A, which are two consecutive CRLF line terminators. HTTP uses a carriage return followed by a line feed to terminate each line in the header, and thus an empty line is two CRLF sequences following each other.
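As a small illustration of this rule, the following Python sketch splits a raw message at the first occurrence of the four separator bytes. The function name split_message is hypothetical, and the code assumes the whole message is already in memory.

```python
from typing import Tuple

# CR LF CR LF: the end of the last header line followed by an empty line
HEADER_END = b"\x0d\x0a\x0d\x0a"


def split_message(raw: bytes) -> Tuple[bytes, bytes]:
    """Return (header, body); the body may be zero-length."""
    header, separator, body = raw.partition(HEADER_END)
    if not separator:
        # No empty line yet, so the header has not been fully received
        raise ValueError("incomplete message: no empty line found")
    return header, body
```

Only the first occurrence of the separator matters, because the body is free to contain empty lines of its own.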

The header starts with a start line (a request line in a request, a status line in a response), followed by header fields. The following is a sample HTTP request:

GET /html/rfc7230 HTTP/1.1 
Host: tools.ietf.org
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
DNT: 1
Referer: https://en.wikipedia.org/
Accept-Encoding: gzip, deflate, sdch, br
Accept-Language: en,hu;q=0.8,en-US;q=0.6,de;q=0.4,en-GB;q=0.2

The following is the response:

HTTP/1.1 200 OK 
Date: Tue, 04 Oct 2016 13:06:51 GMT
Server: Apache/2.2.22 (Debian)
Content-Location: rfc7230.html
Vary: negotiate,Accept-Encoding
TCN: choice
Last-Modified: Sun, 02 Oct 2016 07:11:54 GMT
ETag: "225d69b-418c0-53ddc8ad0a7b4;53e09bba89b1f"
Accept-Ranges: bytes
Cache-Control: max-age=604800
Expires: Tue, 11 Oct 2016 13:06:51 GMT
Content-Encoding: gzip
Strict-Transport-Security: max-age=3600
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head profile="http://dublincore.org/documents/2008/08/04/dc-html/">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="robots" content="index,follow" />

The request does not contain a body. Its first line, the request line, is as follows:

GET /html/rfc7230 HTTP/1.1

It contains the so-called method of the request, the object that is requested, and the protocol version used by the request. The rest of the request header consists of header fields in the format label: value. Some of the lines are wrapped in the printed version, but there is no line break inside a header line.
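To make the structure concrete, here is a hedged Python sketch that takes a request header apart into its first line and its fields. The function name parse_request_header is invented for this example. It stores the labels in lowercase because HTTP header field names are case-insensitive, and, as a simplification, it keeps only the last value when a label occurs more than once.

```python
from typing import Dict, Tuple


def parse_request_header(header: str) -> Tuple[str, str, str, Dict[str, str]]:
    """Split a request header into method, target, version, and fields."""
    lines = header.split("\r\n")
    # First line: method, requested object, protocol version
    method, target, version = lines[0].strip().split(" ", 2)
    fields = {}
    for line in lines[1:]:
        if not line:
            continue
        # Split at the first colon only; values such as URLs contain colons too
        label, _, value = line.partition(":")
        fields[label.strip().lower()] = value.strip()
    return method, target, version, fields
```

Applied to the first lines of the sample request, this yields the method GET, the target /html/rfc7230, the version HTTP/1.1, and a dictionary in which, for example, the key host maps to tools.ietf.org.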

The response starts with a status line that specifies the protocol version it uses (usually the same as the request), the status code, and a textual reason phrase describing the status:

HTTP/1.1 200 OK
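The first line of the response can be taken apart in the same spirit. The following Python sketch is only an illustration; the name parse_status_line is hypothetical, and the code tolerates a missing textual part after the status code.

```python
from typing import Tuple


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Split the first response line into version, numeric code, and text."""
    # At most two splits, so a text such as "Not Found" stays in one piece
    version, code, *rest = line.strip().split(" ", 2)
    reason = rest[0] if rest else ""
    return version, int(code), reason
```

For the line above, the function returns the version HTTP/1.1, the integer 200, and the text OK.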

After this, the response fields come with the same syntax as in the request. One important header is the content type:

Content-Type: text/html; charset=UTF-8

It specifies that the response body (truncated in the printout) is HTML text.
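A program that receives the response can use this field to decide how to interpret the body. The sketch below leans on the Message class of Python's standard email package, which understands this header syntax; the wrapper function parse_content_type is a name made up for the example.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(value: str) -> Tuple[str, Optional[str]]:
    """Return the media type and the charset parameter (None if absent)."""
    msg = Message()
    msg["Content-Type"] = value
    # Both accessors return their result in lowercase
    return msg.get_content_type(), msg.get_content_charset()
```

For the header value above, this gives the media type text/html and the charset utf-8. Note that in the sample response the body is also gzip-compressed and sent in chunks (see the Content-Encoding and Transfer-Encoding fields), so those encodings would have to be undone before the bytes could be decoded as text.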

The actual request was sent to the URL https://tools.ietf.org/html/rfc7230, which is one of the standards that define version 1.1 of HTTP. You can easily look into the communication yourself by starting the browser and opening the developer tools. Such a tool is built into every major browser these days. You can use it to debug program behavior at the network application level by looking at the actual HTTP requests and responses at the byte level. The following screenshot shows how the developer tool displays this communication:
