Data on the web is transferred using HTTP, an application layer protocol. Normal HTTP communication follows a request/response model in which the exchange between a client and a server is coordinated by a set of rules. The client requests a certain resource from the server and receives a status code that specifies the current status of the requested resource. If the resource is available, it is sent along with the status code. HTTP is one of the most widely used protocols for transferring data requested by browsers from their respective servers. Much of the Internet is driven by HTTP, which runs on top of the transport layer.
Every time you visit a website, this protocol takes care of your web-browsing experience. Web servers use HTTP to serve the web pages they contain to the requesting clients. At the beginning of every HTTP session, the TCP three-way handshake takes place. It creates a dedicated channel between the communicating hosts, over which HTTP and data packets are sent and received while the session is active. For instance, suppose you are visiting a web server located at http://172.16.136.129 from a client at 172.16.136.1. Using our client-server infrastructure, we will try to capture the requests sent and responses received.
I will try to visit the home page located at the server mentioned earlier and will capture the traffic generated for the whole session, that is, requests sent and responses received. Follow the actions mentioned here to replicate the scenario.
Type http://172.16.136.129 in the address bar of your browser. (Don't be confused by my using an IP address to visit a web server. While studying DNS, we discussed that a name is just a way to locate a web server that is assigned an IP address.) Press Enter to go to the home page. Here is the screenshot of the home page I am visiting:
All these packets get generated as soon as you press Enter. As you can see, the first three packets are the TCP three-way handshake packets, where our client asks the server to create a dedicated channel. In our case, the connection was successful. However, if the server daemon weren't running, or if for any reason the server were not accepting our requests, we would have seen RST, ACK packets like the one shown here:
This error states that the server is out of service or perhaps the server is not supposed to respond to our requests.
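The difference between a completed handshake and an RST, ACK reply can also be observed from a script. The following is a minimal Python sketch and not part of the original capture; the function name probe_tcp and the default timeout are illustrative assumptions. A refused connection is how the operating system typically reports the RST sent by a host that has no daemon listening on the port.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 2.0) -> str:
    """Attempt a TCP three-way handshake and report how it ended."""
    try:
        # create_connection() performs the SYN / SYN-ACK / ACK exchange.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer answered the SYN with a reset (RST, ACK in a capture).
        return "refused"
    except socket.timeout:
        # No answer at all, for example when a firewall drops the SYN.
        return "timeout"
    except OSError as exc:
        # Any other network failure, such as an unreachable host.
        return f"error: {exc}"
```

For example, probe_tcp("172.16.136.129", 80) should report "open" while the web server from this scenario is running.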
The Host argument is required in HTTP/1.1 requests. The value of this field is the web server's address that you typed in the address bar of the browser.
The Accept parameter states what kind of content the requesting client will accept in the response.
The If-Modified-Since parameter is sent from the client to the server and carries the date and time associated with the copy cached from your previous request. If the server contents have changed since then, you will receive the new updated page. Otherwise, your system will present you with the locally cached page, which saves some resources.
The User-Agent parameter specifies information about the browser you are using to visit the web page. The server can use this information to present you with browser-compatible content.
The Accept-Language and Accept-Encoding parameters are passed on to the server to tell it which languages and content encodings are acceptable to the client, so that the server can take them into consideration while preparing the response.
The Connection parameter, set to keep-alive, specifies that the client wishes to keep the connection open after this particular request has been processed.
HTTP packets are most commonly sent to the web server at port 80. Other common web server ports, such as 8080, 3132, and 8088, are also dissected by Wireshark as HTTP according to its HTTP protocol preferences.
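To make the layout of such a request concrete, here is a minimal Python sketch that assembles the raw bytes of a GET request carrying the fields described above. The function name build_get_request, the User-Agent string, and the header values are assumptions chosen for illustration; they are not taken from the captured packet.

```python
from typing import Dict, Optional


def build_get_request(host: str, path: str = "/",
                      extra_headers: Optional[Dict[str, str]] = None) -> bytes:
    """Assemble a raw HTTP/1.1 GET request as it would appear on the wire."""
    headers = {
        "Host": host,                        # required by HTTP/1.1
        "User-Agent": "example-client/0.1",  # hypothetical client name
        "Accept": "text/html",
        "Accept-Language": "en-US",
        "Accept-Encoding": "gzip, deflate",
        "Connection": "keep-alive",          # keep the TCP connection open
    }
    if extra_headers:
        headers.update(extra_headers)
    # Request line first, then one "Name: value" line per header.
    lines = [f"GET {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Lines end with CRLF; an empty line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# The date below is a made-up example value.
request = build_get_request(
    "172.16.136.129",
    extra_headers={"If-Modified-Since": "Sat, 01 Jan 2022 00:00:00 GMT"},
)
print(request.decode("ascii"))
```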
An ACK is then sent for the resource received.
The response's status line carries the protocol version, the status code (304 in our case, which specifies that the requested resource has not changed since the time given in the If-Modified-Since parameter), and finally, a brief description of the status code (Not Modified in our case).
The Server parameter mentions the name and version of the web server running. We can see that Apache/2.2.22 is the server located at 172.16.136.129.
This is a very basic example of the requests and responses exchanged between the client and the server. However, this is what actually happens every time you visit a website. As stated earlier, we receive a status code followed by a brief description in response. With every tab you open in your browser, a new socket is created between the client and the server, identified by the IP address and the port number on which the web server runs.
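The same fields that Wireshark shows for the response can be pulled out of the raw bytes with a few lines of code. The following Python sketch is only an illustration: parse_response_head is a hypothetical helper, and the sample response is hand-written to resemble the one discussed above rather than copied from the capture.

```python
def parse_response_head(raw: bytes):
    """Split a raw HTTP response into version, status code, reason, headers."""
    # The header section ends at the first empty line (CRLF CRLF).
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    # Status line layout: "<version> <code> <brief description>".
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers


# Hand-written sample resembling the response described in the text.
sample = (
    b"HTTP/1.1 304 Not Modified\r\n"
    b"Date: Sat, 01 Jan 2022 00:00:00 GMT\r\n"
    b"Server: Apache/2.2.22\r\n"
    b"\r\n"
)
version, code, reason, headers = parse_response_head(sample)
print(version, code, reason, headers["server"])
```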
All the details mentioned earlier are part of a normal traffic pattern. What we are about to look at are some unusual traffic patterns that you might face while dealing with HTTP. I will mention some do's and don'ts that might prove helpful while troubleshooting and analyzing HTTP. Most HTTP problems revolve around errors such as 404, some kind of redirection, DNS resolution problems, and server-related issues. Let me explain each scenario in detail.
For instance, suppose you are visiting a web server and looking for something that is currently not available, or the requested resource's location has changed. In such cases, you will receive a 404 status code, which denotes that the requested resource was not found on the server. Refer to the following screenshot, where I requested a file named abc.txt that does not exist on the web server:
In the list pane, you can see that the requested resource is not available, so we get a 404 Not Found error. Such errors could be a sign of malicious activity too, for example if someone is trying to enumerate the directories and files on your web server. Changing the coloring rule for such 404 packets to something different from the normal HTTP coloring rule will bring them to our attention quickly. As you can see, packet number eight is an HTTP packet with a different coloring scheme applied.
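In Wireshark itself, a display filter such as http.response.code == 404 isolates these packets. When the responses have been exported for scripting, a burst of 404s from one client can be flagged along the following lines. This is a rough Python sketch under stated assumptions: the (client IP, status code) tuples, the function name, and the threshold are all hypothetical and would need tuning for real traffic.

```python
from collections import Counter


def clients_with_many_404s(responses, threshold=5):
    """Return clients that received at least `threshold` 404 responses.

    `responses` is an iterable of (client_ip, status_code) pairs.
    """
    counts = Counter(client for client, status in responses if status == 404)
    return {client: n for client, n in counts.items() if n >= threshold}


# Made-up observations: one client repeatedly requests missing files.
observed = [
    ("172.16.136.1", 404),
    ("172.16.136.1", 404),
    ("172.16.136.1", 404),
    ("172.16.136.7", 200),
    ("172.16.136.7", 404),
]
print(clients_with_many_404s(observed, threshold=3))
```

A single 404 is usually harmless; it is the repeated pattern from one address that deserves a closer look.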
A user's request is often redirected when the requested resource has moved to another address or isn't available. To illustrate redirection, I have made some changes to our infrastructure, which can be seen in the diagram shown here:
Now, a request sent from the client to the original server at 192.168.1.104 will be redirected to a new server located at 192.168.1.103 without any further effort by the client. To configure redirection, you have to modify your server's configuration file. The following captured packets show the redirection taking place. Refer to the next list pane in Figure 4.18:
As you can see, a TCP handshake was initiated with the old server at 104, followed by an HTTP GET request. The server at 104 responded with a 302 Found response in packet 21, which indicates redirection. Our request was then sent to the new server located at 103, with which we again initiated the TCP three-way handshake (packet 31). After packet 31, the destination field changed to the new server's address.
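What the client does on receiving the 302 can be sketched in a few lines of Python. The sketch assumes the server names the new address in a Location header, which is the usual mechanism for a 302 response; the helper redirect_target and the header values are illustrative and are not read from the capture.

```python
from urllib.parse import urljoin

# Status codes that tell the client to repeat the request elsewhere.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(request_url, status, headers):
    """Return the URL the client should request next, or None."""
    if status not in REDIRECT_CODES:
        return None
    location = headers.get("Location")
    if location is None:
        return None
    # Location may be relative, so resolve it against the original URL.
    return urljoin(request_url, location)


# Addresses follow the scenario above; the header value is assumed.
print(redirect_target(
    "http://192.168.1.104/", 302, {"Location": "http://192.168.1.103/"}
))
```

Because the target is on a different host, the client has to open a new TCP connection, which is why a second three-way handshake appears in the capture.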
On investigating packet 21 further, we can see the content that redirected our request to the new server. Expand the Line-based text data section under the HTTP section of the details pane for packet 21. Refer to the following screenshot:
We have already discussed DNS resolution problems in the DNS protocol section. For example, if the name of the requested web server cannot be resolved by your internal DNS server or by other external servers, then you won't be able to visit the website. If the DNS servers are working fine and you still cannot visit the site, then congestion can be the problem, where a server is not able to process many requests at the same time. This can result in errors such as 408 Request Timeout, 429 Too Many Requests, or even 404 Not Found. The world of HTTP is enormous, and day-to-day situations can differ from person to person. The most important thing to keep in mind is that the job you have been assigned becomes easy only when your basic concepts are clear. Nothing can beat common sense combined with out-of-the-box thinking.