Caching is the ability to store frequently accessed data (in this context, a response) and serve it for subsequent client requests, so that the same response never has to be generated more than once until it becomes stale. Well-managed caching partially or completely eliminates some client-server interactions while still serving the client the expected response. Caching therefore brings scalability as well as performance benefits: faster response times and reduced server load.
As you can see in the next diagram, the service consumer (client) receives some responses from the cache rather than from the server itself, while other responses still come directly from the server. Caching thus partially or completely eliminates some interactions between the service consumers and the service, which improves efficiency and performance (reduced response latency):
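The cache-or-server decision described above can be sketched as a minimal in-memory response cache. This is an illustrative sketch, not part of any specific framework; the class and method names are assumptions made for the example:

```python
import time


class ResponseCache:
    """A minimal in-memory response cache keyed by URL (illustrative sketch)."""

    def __init__(self):
        self._store = {}  # url -> (response, expiry time as Unix timestamp)

    def get(self, url):
        """Return the cached response if still fresh, else None (cache miss)."""
        entry = self._store.get(url)
        if entry is None:
            return None
        response, expires_at = entry
        if time.time() >= expires_at:
            # Stale entry: evict it, forcing a round trip to the server.
            del self._store[url]
            return None
        return response

    def put(self, url, response, max_age):
        """Store a response, considered fresh for max_age seconds."""
        self._store[url] = (response, time.time() + max_age)


cache = ResponseCache()
cache.put("/orders/42", "order payload", max_age=60)
print(cache.get("/orders/42"))  # served from the cache, no server interaction
print(cache.get("/orders/99"))  # miss: the client would go to the server
```

A hit on `get` is the "response from the cache" path in the diagram; a miss (or a stale entry) is the path that still reaches the server.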
There are different caching strategies and mechanisms available, such as browser caches, proxy caches, and gateway caches (reverse proxies), and there are several ways to control cache behavior, such as through the Pragma and expiration headers. The following table gives a glimpse of the various cache-control headers one can use to fine-tune cache behavior:
| Headers | Description | Samples |
| --- | --- | --- |
| Expires | Header attribute representing the date/time after which the response is considered stale | Expires: Fri, 12 Jan 2018 18:00:09 GMT |
| Cache-Control | A header that defines various directives (for both requests and responses) that caching mechanisms must follow | Cache-Control: max-age=4500 |
| ETag | A unique identifier for the state of a server resource | ETag: "uqv2309u324klm" |
| Last-Modified | A response header that identifies the time at which the resource was last changed | Last-Modified: Fri, 12 Jan 2018 18:00:09 GMT |
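The freshness rules behind the Expires and Cache-Control headers in the table can be sketched as a small helper. The function name and the dict-based header representation are assumptions for this example; the precedence it implements (Cache-Control: max-age overrides Expires) follows the HTTP caching rules:

```python
import time
from email.utils import parsedate_to_datetime


def is_fresh(headers, fetched_at, now=None):
    """Decide whether a cached response is still fresh.

    `headers` is a dict of response headers and `fetched_at` is the Unix
    time at which the response was stored in the cache.
    """
    now = time.time() if now is None else now

    # Cache-Control: max-age takes precedence over Expires.
    for directive in headers.get("Cache-Control", "").split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):
            max_age = int(directive.split("=", 1)[1])
            return (now - fetched_at) < max_age

    # Fall back to the absolute Expires date, if present.
    expires = headers.get("Expires")
    if expires:
        return now < parsedate_to_datetime(expires).timestamp()

    # No freshness information: revalidate with the server.
    return False


print(is_fresh({"Cache-Control": "max-age=4500"}, fetched_at=time.time()))
print(is_fresh({"Expires": "Fri, 12 Jan 2018 18:00:09 GMT"}, fetched_at=0))
```

A response that fails this check need not be re-generated from scratch: a conditional request carrying If-None-Match (with the stored ETag) or If-Modified-Since (with the stored Last-Modified value) lets the server answer 304 Not Modified instead of resending the body.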
For more about cache-control directives, please refer to https://tools.ietf.org/html/rfc2616#section-14.9.