One of the most important things to consider when designing a new website or application is how users will eventually access your information. One of the cornerstones of the Web is the use of proxy servers that cache content for end users, typically run by Internet Service Providers (ISPs) to improve the user experience. Web browsers also cache content locally to speed up frequently accessed websites.
Caching is controlled by four HTTP headers that tell proxy servers and web browsers how to cache content:
Last-Modified
ETag
Expires
Cache-Control
We will go over these headers next, and how to manipulate them to improve the caching of your website or application.
Most web servers nowadays return valid content automatically for the Last-Modified and ETag headers, which user agents (web browsers and proxy servers) use to check whether their cached copy of a given URL is still valid. The following is a good example of an HTTP response that includes these two headers:
HTTP/1.1 200 OK
Date: Thu, 29 Sep 2005 18:54:26 GMT
Server: Apache/1.3.33 (Unix)
Cache-Control: max-age=3600, must-revalidate
Expires: Sat, 29 Oct 2005 18:54:26 GMT
Last-Modified: Thu, 29 Sep 2005 18:54:26 GMT
ETag: "6e93-130-3852bdde"
Content-Length: 3489
Content-Type: text/html
The Last-Modified header tells the user agent when the content was last changed. The user agent re-uses this value later, when it needs to check whether the server has a fresher version of the content than the one it has cached locally. This check is done with another HTTP header, If-Modified-Since, which is sent to the server and helps decide whether the local cache should be used. Here’s a simple example:
GET /cnn/images/1.gif HTTP/1.1
Host: i.cnn.net
Accept: image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
If-Modified-Since: Tue, 27 May 2003 19:00:10 GMT
Cache-Control: max-age=0
As you can see above, the 1.gif image was probably previously returned with a Last-Modified value of Tue, 27 May 2003 19:00:10 GMT, and that value is being sent back to the server to see whether the web browser should use the local cached copy of this image or request a new copy. Here’s the response from the server:
HTTP/1.x 304 Not Modified
Date: Sat, 03 Sep 2005 04:15:42 GMT
Content-Type: image/gif
Last-Modified: Tue, 27 May 2003 19:00:10 GMT
Age: 987
Connection: keep-alive
The 304 response code is what tells the web browser to simply re-use the local cached copy.
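The server-side decision in this exchange can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular web server implements it: the function name `conditional_status` is hypothetical, and the resource's modification time is assumed to be available as a timezone-aware `datetime`.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(if_modified_since, last_modified):
    """Return 304 if the client's cached copy is still current, else 200.

    if_modified_since: raw If-Modified-Since header value, or None.
    last_modified: timezone-aware datetime of the resource's last change.
    (Hypothetical helper, for illustration only.)
    """
    if if_modified_since is None:
        return 200  # unconditional request: send the full response
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header
    if client_time.tzinfo is None:
        # Assumption: a date without a usable zone is treated as UTC.
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304
    return 200
```

Called with the header value from the request above and a matching modification time, the function returns 304; with no header, or with an older date, it returns 200.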
The ETag header is used in a similar way to the Last-Modified one, but this time the value is a unique identifier, created by the server, that corresponds to a specific version of the content. Here’s an example HTTP request:
GET /cnn/.element/img/1.3/searchbar/bg.gif HTTP/1.1
Host: i.a.cnn.net
Accept: image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
If-Modified-Since: Wed, 09 Mar 2005 17:32:07 GMT
If-None-Match: "d0b3114c-1b1-9b-0"
Cache-Control: max-age=0
The If-None-Match header value above is what the user agent passes to the server to check whether the local cached copy of this image is still valid. If it is, the web server will return a 304 response code, as it did for the If-Modified-Since example we described before:
HTTP/1.x 304 Not Modified
Content-Type: image/gif
Last-Modified: Wed, 09 Mar 2005 17:32:07 GMT
Etag: "d0b3114c-182-9b-0"
Date: Sat, 03 Sep 2005 04:44:30 GMT
Connection: close
As you can see, the server returns the 304 response code along with the ETag of the current version, which tells the user agent that its local copy is still valid and should be used.
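The ETag comparison can be sketched the same way. In this example the identifier is derived from a hash of the response body, which is an assumption made for illustration; servers are free to build ETags however they like, for instance from file metadata. The names `make_etag` and `etag_matches` are hypothetical, and splitting the header on commas is a simplification that ignores commas inside a quoted tag.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Build a quoted ETag from a hash of the body (illustrative choice)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def etag_matches(if_none_match, current_etag):
    """True if the If-None-Match header lists the current ETag, or is '*'."""
    if if_none_match is None:
        return False
    # Simplification: assumes no commas appear inside the quoted tags.
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    if "*" in candidates:
        return True

    def strip_weak(tag):
        # If-None-Match uses weak comparison, so the W/ prefix is ignored.
        return tag[2:] if tag.startswith("W/") else tag

    return strip_weak(current_etag) in {strip_weak(t) for t in candidates}
```

When `etag_matches` returns true, the server can answer with 304 and no body; otherwise it sends a full 200 response carrying the new ETag.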
The Expires header allows you to set an expiration date for a particular piece of content on your website or application. It is most useful for static content, such as images for buttons and logos. Since these rarely change, you can set a very long expiration date on them, which improves the performance of your pages because some of their content will be delivered from caches.
The allowed value for this header is an HTTP date set to Greenwich Mean Time (GMT). That alone brings a potential problem: depending on how well synchronized the clocks on your web server and the caching user agent are, you might get stale content that is deemed fresh. Here’s an example of the use of the Expires header:
HTTP/1.x 200 OK
Date: Sat, 03 Sep 2005 05:06:24 GMT
Server: Apache
Content-Type: text/html
Last-Modified: Sat, 03 Sep 2005 05:06:17 GMT
Cache-Control: max-age=60, private
Expires: Sat, 03 Dec 2005 05:06:24 GMT
Content-Encoding: gzip
Content-Length: 13606
Connection: close
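As you can see above, the Expires header is set to three months in the future, which is fine in this specific case since it is the response for a navigation bar image that rarely changes, if at all. Keep in mind that when a response also carries Cache-Control: max-age, as this one does, HTTP/1.1 caches use max-age and ignore Expires.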
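As a rough sketch of how such a value could be generated, the following Python helper formats a future point in time as an HTTP date. The name `expires_header` is hypothetical, and the function assumes it is given a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_header(now, lifetime):
    """Format now + lifetime as an HTTP date for the Expires header.

    now must be timezone-aware; the result is always expressed in GMT.
    """
    expiry = (now + lifetime).astimezone(timezone.utc)
    return format_datetime(expiry, usegmt=True)


# Reproduces the Expires value from the response above (91 days later).
served = datetime(2005, 9, 3, 5, 6, 24, tzinfo=timezone.utc)
print(expires_header(served, timedelta(days=91)))
# prints "Sat, 03 Dec 2005 05:06:24 GMT"
```

Computing the date from the server's own clock at response time keeps Date and Expires consistent with each other, although it does not help if the cache's clock is wrong.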
The Cache-Control header was introduced with HTTP/1.1 to give publishers more control over proxy servers and overall cache behavior. There are several values that can be used for this response header:
| Header Value | Description |
|---|---|
| max-age=[seconds] | Amount of time, in seconds, for which the cached content will be deemed valid. |
| s-maxage=[seconds] | Same as max-age, but applies only to proxy caches. |
| public | Flags authenticated responses as cacheable. |
| no-cache | Instructs caches to submit the request to the web server before releasing any content. Useful in combination with authenticated requests. |
| no-store | Prevents caches from storing any content. |
| must-revalidate | Forces caches to obey your expiration rules. |
| proxy-revalidate | Similar to must-revalidate, but applies only to proxy caches. |
Here’s an example of how to mark a specific URL as being cacheable, but still force caches to check with the web server before releasing any content:
Cache-Control: public, no-cache, must-revalidate
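To see how a cache might read such a header, here is a small Python sketch that splits a Cache-Control value into its directives. The function name `parse_cache_control` is hypothetical, and the parsing is deliberately simplified: it assumes no commas appear inside quoted directive arguments.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict.

    Directives without an argument map to None. Names are lowercased
    because Cache-Control directives are case-insensitive.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


parsed = parse_cache_control("public, no-cache, must-revalidate")
# A cache would have to check with the server before reusing this response.
needs_validation = "no-cache" in parsed
```

For the header above, `parsed` contains the three directives with no arguments, and `needs_validation` is true.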