Hypertext Transfer Protocol

Data on the web is transferred using HTTP, an application layer protocol. HTTP follows a request/response model in which communication between a client and a server is governed by a set of rules. The client requests a resource from the server and receives a status code that describes the outcome of the request. If the resource is available, it is sent along with the status code. HTTP is one of the most widely used protocols for transferring the data that browsers request from servers. Most web traffic on the Internet is carried by HTTP, which runs on top of the transport layer (TCP).

How it works – request/response

Every time you visit a website, this protocol takes care of your web-browsing experience. A web server uses HTTP to serve the web pages it hosts to requesting clients. At the beginning of every HTTP session, the TCP three-way handshake takes place. It establishes a connection between the communicating hosts, over which HTTP requests and responses are exchanged while the session is active. For instance, suppose you are visiting a web server located at http://172.16.136.129 from a client at 172.16.136.1. Using our client-server infrastructure, we will try to capture the requests sent and responses received.
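
The sequence described above (a TCP connection first, then an HTTP request answered by a response that starts with a status line) can be sketched in Python. This is a minimal illustration under stated assumptions, not the setup used for the captures: it starts a throwaway local server on 127.0.0.1 rather than 172.16.136.129, and the names HomePageHandler, fetch_status_line, and demo are hypothetical.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HomePageHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the web server used in the text."""

    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"<html><body>home page</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_status_line(host, port, path="/"):
    """Open a TCP connection, send one GET request, return the status line."""
    # create_connection() performs the TCP three-way handshake.
    with socket.create_connection((host, port), timeout=5) as conn:
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        conn.sendall(request.encode("ascii"))
        response = b""
        # Read until the server closes the connection.
        while chunk := conn.recv(4096):
            response += chunk
    # The first line of the response is the status line.
    return response.split(b"\r\n", 1)[0].decode("ascii")


def demo():
    """Start a local server on a free port and fetch its home page once."""
    server = HTTPServer(("127.0.0.1", 0), HomePageHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_status_line("127.0.0.1", server.server_port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() while Wireshark captures on the loopback interface should produce the same kind of exchange discussed in this section: handshake packets followed by one HTTP request and one response.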

I will try to visit the home page located at the server mentioned earlier and will capture the traffic generated for the whole session, that is, requests sent and responses received. Follow the actions mentioned here to replicate the scenario.

Request

  • Open your browser, and type the Uniform Resource Locator (URL) of any website that you want to visit. In my case, the website is located at http://172.16.136.129. (Don't be confused by the IP address I am using to visit the web server. While studying DNS, we discussed that a domain name is just a way to locate a web server that has been assigned an IP address.) Press Enter to go to the home page. Here is the screenshot of the home page I am visiting:
  • As a result of our preceding actions, a number of packets are generated and captured by Wireshark. Let's have a look at the list pane shown in the following screenshot:

    Figure 4.13: Packets captured by Wireshark

    All these packets are generated as soon as you press Enter. As you can see, the first three packets are the TCP three-way handshake, in which our client asks the server to establish a connection. In our case, the connection was successful. However, if the server daemon weren't running, or if for any reason the server were not accepting our requests, we would have seen RST, ACK packets like the one shown here:


    Figure 4.14: RST and ACK packets, as the server is not accepting requests

    These packets indicate that the server is out of service or that it is not supposed to respond to our requests.

  • After the TCP packets, you can see the first HTTP request sent by our client. Every request comprises a couple of elements that are sent to the server:

    Figure 4.15: HTTP request

  • This is how a request looks. The first line passes three things to the server: the HTTP method, the location of the requested resource ("/", the web root directory), and the HTTP protocol version.
  • The second line specifies the Host header, which is required in HTTP/1.1 requests. Its value is the web server's address that you typed in the address bar of the browser.
  • The fourth line is the Accept header, which states what kinds of content the requesting client can accept in the response.
  • The If-Modified-Since header is sent from the client to the server and carries the date and time of the copy that the client cached from a previous request. If the content on the server has changed since then, you will receive the updated page. Otherwise, your browser will present the locally cached page, which saves bandwidth and server resources.
  • The next field is User-Agent, which describes the browser that you are using to visit the web page. The server can use this information to present you with browser-compatible content.
  • Headers such as Accept-Language and Accept-Encoding tell the server which languages and content encodings are acceptable to the client, so that the server can take them into account while preparing the response.
  • The Connection header, set to keep-alive, specifies that the client wishes to keep the connection open after this particular request has been processed.

HTTP packets are most commonly sent to the web server on port 80. Other common web server ports, such as 8080, 3132, and 8088, are also dissected as HTTP by Wireshark according to its HTTP protocol preferences.
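
To make the request elements listed above concrete, here is a minimal Python sketch that splits a raw request into its request line and headers. The sample request is an assumption modeled on the fields discussed in this section: only the Host value comes from our example, while the other header values and the names parse_request and SAMPLE_REQUEST are hypothetical.

```python
def parse_request(raw):
    """Split a raw HTTP request head into its request line and headers."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    # Request line: method, requested resource, protocol version.
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


# Hypothetical request resembling the one in the capture.
SAMPLE_REQUEST = (
    "GET / HTTP/1.1\r\n"
    "Host: 172.16.136.129\r\n"
    "User-Agent: ExampleBrowser/1.0\r\n"
    "Accept: text/html,application/xhtml+xml\r\n"
    "Accept-Language: en-US,en;q=0.5\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "If-Modified-Since: Sat, 01 Jan 2022 00:00:00 GMT\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)

if __name__ == "__main__":
    method, target, version, headers = parse_request(SAMPLE_REQUEST)
    print(method, target, version)
    for name, value in headers.items():
        print(f"{name}: {value}")
```

Comparing this output with the details pane in Wireshark is a quick way to check that you are reading each field correctly.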

Response

  • As you can see, after the fourth packet, the server acknowledges the client's request for the server's web root directory and then starts transmitting its response. The sixth packet in the list pane is what the client received: a status code followed by a short message, along with the content of the requested resource when there is any to send. Refer to Figure 4.16, which illustrates the HTTP response:

    Figure 4.16: HTTP response

  • As a part of TCP communication, the client acknowledges the data it receives from the server. The seventh packet shows the client sending an ACK for the response it received.
  • Let's dissect the response elements for packet number six. The first line consists of three elements sent in response: the HTTP protocol version in use, the status code (304 in our case, which specifies that the requested resource has not changed since the time given in the If-Modified-Since header of the request), and finally, a brief description of the status code (Not Modified in our case).
  • In the third line, the Server parameter mentions the name and version of the web server running. We can see that Apache/2.2.22 is the server that is located at 172.16.136.129.
  • The fourth and fifth lines state that the server wishes to keep the connection alive, and the duration for which it will do so is also given among these parameters. The remaining lines carry other configuration parameters.
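
The response elements described above can be pulled apart the same way. The following Python sketch assumes a response head similar to the one in Figure 4.16: the 304 Not Modified status and the Apache/2.2.22 server string come from the text, while the Date and Keep-Alive values and the function names are hypothetical.

```python
def parse_response_head(raw):
    """Split a raw HTTP response head into status-line parts and headers."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    # Status line: protocol version, status code, short description.
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers


def keep_alive_params(headers):
    """Turn a header such as 'Keep-Alive: timeout=5, max=100' into a dict of ints."""
    params = {}
    for item in headers.get("keep-alive", "").split(","):
        key, sep, value = item.partition("=")
        if sep and value.strip().isdigit():
            params[key.strip().lower()] = int(value.strip())
    return params


# Hypothetical response head; only the status and Server value come from the text.
SAMPLE_RESPONSE = (
    "HTTP/1.1 304 Not Modified\r\n"
    "Date: Sat, 01 Jan 2022 00:05:00 GMT\r\n"
    "Server: Apache/2.2.22\r\n"
    "Connection: Keep-Alive\r\n"
    "Keep-Alive: timeout=5, max=100\r\n"
    "\r\n"
)

if __name__ == "__main__":
    version, code, reason, headers = parse_response_head(SAMPLE_RESPONSE)
    print(version, code, reason)
    print(headers["server"], keep_alive_params(headers))
```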

This is a very basic example of the requests and responses exchanged between the client and the server. However, this is what actually happens every time you visit a website. As stated earlier, we receive a status code followed by a brief description in response. With every tab you open in your browser, a new socket is created between the client and the server, identified by the server's IP address and the port number on which the web server runs.
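
The 304 Not Modified exchange in this example comes down to a timestamp comparison on the server. The following Python sketch shows one way that decision could be made. It is an illustration under assumptions rather than how Apache implements it, and the function name conditional_status is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(last_modified, if_modified_since):
    """Return 304 if the resource is unchanged since the client's cached copy, else 200.

    last_modified must be a timezone-aware datetime;
    if_modified_since is the raw header value or None.
    """
    if if_modified_since is None:
        return 200  # no cached copy announced: send the full resource
    try:
        cached_at = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparsable header: ignore it
    if cached_at.tzinfo is None:
        cached_at = cached_at.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= cached_at:
        return 304
    return 200


if __name__ == "__main__":
    unchanged = datetime(2022, 1, 1, tzinfo=timezone.utc)
    print(conditional_status(unchanged, "Sat, 01 Jan 2022 00:00:00 GMT"))
```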

Unusual HTTP traffic

All the details mentioned earlier are part of a normal traffic pattern. What we are about to witness are some unusual traffic patterns that you might face while dealing with HTTP. I will try to mention some do's and don'ts, which might prove helpful to you while troubleshooting and analyzing HTTP. Most HTTP problems revolve around errors such as 404, some kind of redirection, DNS resolution problems, and server-related issues. Let me explain each scenario in detail.

For instance, suppose you are visiting a web server and looking for something that is currently not available, or the requested resource's location has changed. In such cases, you will receive a 404 status code, which denotes that the requested resource was not found on the server. Refer to the following screenshot, where I tried to request a file named abc.txt that does not exist on the web server:


Figure 4.17: HTTP 404

In the list pane, you can see that the requested resource is not available, so we get a 404 Not Found error. Such errors could be a sign of malicious activity too, for example, if someone is trying to enumerate directories and files on your web server. Giving such 404 packets a coloring rule different from the one used for normal HTTP packets will draw your attention to them quickly. As you can see, packet number eight is an HTTP packet with a different coloring scheme applied.
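
In Wireshark itself, a filter string such as http.response.code == 404 can serve as the basis for that coloring rule. Outside Wireshark, the same idea can be sketched in Python: count the 404 responses each client received and flag the clients that cross a threshold. The data format (pairs of client address and status code), the threshold, and the names below are all assumptions made for illustration.

```python
from collections import Counter


def clients_with_many_404s(responses, threshold=3):
    """Return the clients that received at least `threshold` 404 responses.

    `responses` is an iterable of (client_ip, status_code) pairs.
    """
    not_found = Counter(client for client, status in responses if status == 404)
    return {client: count for client, count in not_found.items() if count >= threshold}


# Hypothetical observations, not taken from the captures in this chapter.
SAMPLE_RESPONSES = [
    ("172.16.136.1", 200),
    ("172.16.136.1", 404),
    ("172.16.136.50", 404),
    ("172.16.136.50", 404),
    ("172.16.136.50", 404),
    ("172.16.136.50", 200),
]

if __name__ == "__main__":
    print(clients_with_many_404s(SAMPLE_RESPONSES))
```

A single 404 is usually just a mistyped or outdated link. A burst of them from one client is what deserves a closer look.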

Redirection of the user's request is often done when the location of a requested resource has changed to another address or the resource isn't available. To help you understand redirection, I have made some changes in our infrastructure, which can be seen in the diagram shown here:


Now, a request sent by the client to the original server at 192.168.1.104 will be redirected to a new server located at 192.168.1.103 without any further effort by the client. To configure redirection, you have to modify your server's configuration file. The following captured packets show the redirection taking place. Refer to the list pane in Figure 4.18:


Figure 4.18: HTTP redirection

As you can see, a TCP handshake was initiated with the old server at 192.168.1.104, followed by an HTTP GET request. The server at 192.168.1.104 responded with a 302 Found response in packet 21, which indicates a redirection. Our request was then sent to the new server located at 192.168.1.103, with which we again initiated a TCP three-way handshake (packet 31). From packet 31 onward, the destination field shows the new server's address.
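
What the browser does between the 302 response and the second handshake can be sketched as follows. A redirect response normally carries a Location header that names the new address, and the client resolves it against the URL it originally requested. The exact Location value sent by our server is an assumption here, and the function name redirect_target is hypothetical.

```python
from urllib.parse import urljoin, urlsplit

# Status codes that ask the client to repeat the request elsewhere.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(request_url, status, headers):
    """Return the URL to request next, or None if the response is not a redirect.

    `headers` is a dict with lower-cased header names.
    """
    location = headers.get("location")
    if status not in REDIRECT_CODES or not location:
        return None
    # Location may be relative, so resolve it against the original URL.
    return urljoin(request_url, location)


if __name__ == "__main__":
    # Assumed Location value pointing at the new server from the example.
    target = redirect_target(
        "http://192.168.1.104/", 302, {"location": "http://192.168.1.103/"}
    )
    print(target, urlsplit(target).hostname)
```

Because the new URL names a different host, the client has to open a fresh TCP connection, which is why a second three-way handshake appears in the capture.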

On investigating packet 21 further, we can see the content that redirected our request to the new server. Expand the Line-based text data section under the HTTP section of the details pane for packet 21. Refer to the following screenshot:


We have already discussed DNS resolution problems in the DNS protocol section. For example, if the name of the requested web server cannot be resolved by your internal DNS server or by external servers, then you won't be able to visit the website. If the DNS servers are working fine and you still cannot visit the site, then congestion can be the problem, where a server is not able to process many requests at the same time. This can result in errors such as 408 Request Timeout or 429 Too Many Requests, or even 404 Not Found. The world of HTTP is enormous, and day-to-day situations can differ from person to person. The most important thing to keep in mind is that only when your basic concepts are clear will it be easy to do the job you have been assigned. Nothing can beat common sense combined with out-of-the-box thinking.
