Chapter 23. Configuring Caching

What You’ll Learn in This Hour

  • How to configure cache backends

  • How to implement a per-site cache

  • How to implement a per-view cache

  • How to manage access to cached pages

One of Django’s best features is that it lets you easily generate web pages on-the-fly, which saves you a lot of development and maintenance time. However, this advantage comes at a price. Each time a request comes in, Django spends many cycles on database queries and page generation. On small to medium websites, this isn’t much of a problem. On large sites that receive several requests per second, however, the overhead can quickly become a bottleneck.

Django’s caching framework solves this problem nicely. Using the cache framework, you can cache web pages and objects so that subsequent requests for the same data can quickly be drawn from the cache rather than performing the resource-intensive query and processing again.

In this hour, we will discuss configuring caching and the different types of backends that are available. We will then discuss how to implement caching at the site, view, and object levels. Finally, we will cover how to use response headers to manage caching in upstream caches.

Configuring Caching Backends

Django’s caching system is easy to implement. The only thing you need to do is define in the settings.py file which backend you want Django to use for caching. Django includes several backends that you can use depending on your particular needs. To enable a caching backend, set the CACHE_BACKEND setting in the settings.py file. For example, to enable local memory caching, you would use the following setting in the settings.py file:

CACHE_BACKEND = 'locmem:///'

You can also configure how long Django caches data. You can add the following arguments to the CACHE_BACKEND setting:

  • timeout is the amount of time in seconds that the data is cached. The default is 300.

  • max_entries is the maximum number of entries allowed in the cache before old values are removed. The default is 300.

  • cull_frequency specifies the fraction of entries that are removed from the cache when the max_entries limit is reached. The fraction removed is 1/cull_frequency. The default is 3, meaning that one in three entries is removed. If you specify a value of 0, the entire cache is emptied.

For example, the following setting keeps data in the cache for 2 minutes and allows 200 cached entries:

CACHE_BACKEND = 'locmem:///?timeout=120&max_entries=200'
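To see how the pieces of such a setting fit together, the following sketch splits a CACHE_BACKEND-style string into its backend name, location, and arguments. The parse_cache_backend() helper is hypothetical and uses only the Python standard library; it is not the code Django itself runs.

```python
from urllib.parse import parse_qsl


def parse_cache_backend(uri):
    """Split a CACHE_BACKEND-style string into (scheme, location, params).

    Hypothetical helper for illustration; not part of Django.
    """
    scheme, rest = uri.split("://", 1)
    location, _, query = rest.partition("?")
    # Arguments arrive as strings, exactly as written in the setting.
    params = dict(parse_qsl(query))
    return scheme, location.rstrip("/"), params


print(parse_cache_backend("locmem:///?timeout=120&max_entries=200"))
```

Note that the arguments come back as strings, so a real backend would still need to convert values such as timeout to integers.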

The following sections describe each of the available backends you can configure for your website.

Database Backend

The database backend allows you to create a table in the database that can then be used to store and retrieve cached data. An advantage of using the database backend is that the cached data is persistently stored and is available even after a server reboot.

Before you can enable the database backend, you need to create a table in the database to store the cache. To do so, run the createcachetable command of the manage.py utility at the root of your project. The table can be given any valid table name as long as the database doesn’t already have a table with that name. For example, the following command creates a database backend table called mysitecache:

python manage.py createcachetable mysitecache

To enable the database backend in the settings.py file, set the CACHE_BACKEND to the db:// backend and provide the name of the cache table. For example, to enable the database backend using the table just listed, you would use the following setting:

CACHE_BACKEND = 'db://mysitecache'

By the Way

The database backend uses the same database for the cache that you will use for the rest of the website. This can adversely affect the database’s performance if your site receives a lot of hits.

File System Backend

The file system backend allows you to define a directory in which Django stores and retrieves cached data. An advantage of using the file system backend is that the cached data is stored persistently in the file system and is available even after a server reboot.

Did you Know?

Cached data is stored as individual files. The contents of the files are in Python pickled format. The file system backend uses the cache key as the filename (escaped for the purpose of security).

Before you can enable the file system backend, you need to create a directory in which to store the cached data.

Watch Out!

Django needs read and write permissions to that directory. For example, if your web server is running as the user apache, you need to grant the apache user read and write access to the directory.

To enable the file system backend in the settings.py file, set the CACHE_BACKEND to the file:// backend and provide the full path to the directory. For example, if you create a directory called /var/temp/mysitecache on a Linux system, you would use the following setting in the settings.py file:

CACHE_BACKEND = 'file:///var/temp/mysitecache'

As another example, if you create a directory called c:\temp\mysitecache on a Windows system, you would use the following setting in the settings.py file:

CACHE_BACKEND = 'file://c:/temp/mysitecache'
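The idea of one pickled file per cache key can be sketched with the standard library alone. The function names, the choice of urllib.parse.quote() for escaping, and the layout of the pickled tuple are all assumptions made for illustration; Django’s own file backend may differ in every one of these details.

```python
import os
import pickle
import tempfile
import time
from urllib.parse import quote


def file_cache_set(cache_dir, key, value, timeout=300):
    """Store value in a file named after the escaped key (illustrative only)."""
    path = os.path.join(cache_dir, quote(key, safe=""))
    with open(path, "wb") as fh:
        # Keep the expiry time next to the value so reads can honor it.
        pickle.dump((time.time() + timeout, value), fh)


def file_cache_get(cache_dir, key, default=None):
    """Return the cached value, or default if it is missing or expired."""
    path = os.path.join(cache_dir, quote(key, safe=""))
    try:
        with open(path, "rb") as fh:
            expires_at, value = pickle.load(fh)
    except FileNotFoundError:
        return default
    if expires_at < time.time():
        os.remove(path)  # expired entries are cleaned up lazily
        return default
    return value


cache_dir = tempfile.mkdtemp()  # stands in for /var/temp/mysitecache
file_cache_set(cache_dir, "Blog_List", ["first post", "second post"], 60)
print(file_cache_get(cache_dir, "Blog_List"))
```

Because the data lives in ordinary files, it survives a restart of the Python process, which is the persistence advantage described in this section.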

Local Memory Backend

The local memory backend uses the system memory to store and retrieve cached data. An advantage of using the local memory backend is that the cached data is stored in memory, which is extremely quick to access. The local memory backend uses locking to ensure that it is multiprocess and thread-safe.

Watch Out!

Cached data that is stored in the local memory backend is lost if the server crashes. You should not rely on items in the local memory cache as any kind of data storage.

To enable the local memory backend in the settings.py file, set the CACHE_BACKEND to the locmem:/// backend:

CACHE_BACKEND = 'locmem:///'
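The following sketch shows roughly how an in-memory cache with a timeout, an entry limit, and culling can behave. TinyMemoryCache and its cull_every argument are hypothetical names; the class omits the locking that the real backend uses and is not Django’s implementation.

```python
import time


class TinyMemoryCache:
    """Hypothetical in-process cache; a teaching sketch, not Django's code."""

    def __init__(self, timeout=300, max_entries=300, cull_every=3):
        self.timeout = timeout
        self.max_entries = max_entries
        self.cull_every = cull_every  # remove one in every N entries when full
        self._data = {}  # key -> (expires_at, value)

    def set(self, key, value, timeout=None):
        # Make room before adding a brand-new key to a full cache.
        if key not in self._data and len(self._data) >= self.max_entries:
            self._cull()
        seconds = self.timeout if timeout is None else timeout
        self._data[key] = (time.time() + seconds, value)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if expires_at <= time.time():
            del self._data[key]  # expired entries are dropped on access
            return default
        return value

    def _cull(self):
        if self.cull_every == 0:
            self._data.clear()  # 0 means empty the whole cache
            return
        keys = list(self._data)
        for index, key in enumerate(keys):
            if index % self.cull_every == 0:
                del self._data[key]
```

Everything lives in a plain dictionary inside the server process, which is why the data disappears when the process stops.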

Simple Backend

The simple backend caches data in memory for a single process. This is useful when you are developing the website and for testing purposes. However, you should not use it in production.

To enable the simple backend in the settings.py file, set the CACHE_BACKEND to the simple:/// backend:

CACHE_BACKEND = 'simple:///'

Dummy Backend

The dummy backend does not cache any data, but it enables the cache interface. The dummy backend should be used only in development or test websites.

To enable the dummy backend in the settings.py file, set the CACHE_BACKEND to the dummy:/// backend:

CACHE_BACKEND = 'dummy:///'

Did you Know?

The dummy backend is useful if you need to test or debug a website that has a heavy amount of caching. All you need to do is modify the CACHE_BACKEND setting for the test environment.

Memcached Backend

The fastest and most efficient backend available for Django is the Memcached backend. It runs as a daemon that stores and retrieves data into a memory cache.

The Memcached backend is not distributed with Django; you must obtain it from www.danga.com/memcached/. Before you can enable Memcached, you must install it, along with the Memcached Python bindings. The Memcached Python bindings are in the Python module, memcache.py, which you can find at www.djangoproject.com/thirdparty/python-memcached/.

To enable the Memcached backend in the settings.py file, you should set the CACHE_BACKEND to the memcached:// backend and provide the IP address and port that the Memcached daemon is running on. For example, if the Memcached daemon is running on the local host (127.0.0.1) using port 12221, you would use the following setting:

CACHE_BACKEND = 'memcached://127.0.0.1:12221'

One of the best features of Memcached is that you can distribute the cache over multiple servers by running the Memcached daemon on multiple machines. Memcached treats the servers as a single cache.
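A cache spread over several daemons works because the client decides, for each key, which server holds it. The following sketch shows one simple way to do that by hashing the key. The pick_server() function and the server addresses are assumptions for illustration; real Memcached client libraries use more elaborate schemes.

```python
import zlib


def pick_server(key, servers):
    """Map a cache key to one server by hashing it (illustrative only)."""
    # crc32 is stable across runs, so the same key always lands on the same server.
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


# Hypothetical addresses of two machines running the Memcached daemon.
servers = ["192.168.0.10:12221", "192.168.0.11:12221"]
print(pick_server("Blog_List", servers))
```

Because every web server computes the same mapping, they all look for a given key on the same machine, and the group of daemons behaves like one large cache.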

Implementing a Per-Site Cache

After you have configured a caching backend, you can implement caching on the website. The easiest way to implement caching is at the site level. Django provides the django.middleware.cache.CacheMiddleware middleware framework to cache the entire site. Add the following entry to the MIDDLEWARE_CLASSES setting in the settings.py file to enable caching for the entire website:

'django.middleware.cache.CacheMiddleware',

Watch Out!

The CacheMiddleware framework does not cache pages that have GET or POST parameters. When you design your website, make certain that pages that need to be cached do not require URLs that contain query strings.

After you enable the CacheMiddleware framework, you need to add the following required settings to the settings.py file:

  • CACHE_MIDDLEWARE_SECONDS: Defines the number of seconds that each page should be kept in the cache.

  • CACHE_MIDDLEWARE_KEY_PREFIX: If you are using the same cache for multiple websites, you can use a unique string for this setting to designate which site the object is being cached from to prevent collisions. If you are not worried about collisions, you can use an empty string.
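As a minimal sketch, the two settings might look like this in the settings.py file. The values shown (ten minutes and the prefix mysite) are only assumptions chosen for the example.

```python
# settings.py (fragment); both values are illustrative assumptions
CACHE_MIDDLEWARE_SECONDS = 600          # keep each cached page for 10 minutes
CACHE_MIDDLEWARE_KEY_PREFIX = 'mysite'  # use '' if no other site shares the cache
```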

By the Way

You can enable the same cache for multiple sites that reside on the same Django installation. Just add the middleware to the settings.py file for each site.

The CacheMiddleware framework also allows you to restrict caching to requests made by anonymous users. If you set CACHE_MIDDLEWARE_ANONYMOUS_ONLY in the settings.py file to True, requests that come from logged-in users are not cached.

Watch Out!

If you use the CACHE_MIDDLEWARE_ANONYMOUS_ONLY option, make certain that AuthenticationMiddleware is enabled and is listed before CacheMiddleware in the MIDDLEWARE_CLASSES setting.

Did you Know?

The CacheMiddleware framework automatically sets the value of some headers in each HttpResponse. The Last-Modified header is set to the current date and time when a fresh version of the page is requested. The Expires header is set to the current date and time plus the value defined in CACHE_MIDDLEWARE_SECONDS. The max-age directive of the Cache-Control header is set to the value defined in CACHE_MIDDLEWARE_SECONDS.
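To make those three headers concrete, the following sketch computes values of the same shape for a given number of seconds. The cache_headers() function is a hypothetical stand-in written with the standard library; it only imitates the kind of values described above and is not the middleware’s code.

```python
import time
from email.utils import formatdate


def cache_headers(seconds, now=None):
    """Build example header values for a page cached for `seconds`."""
    now = time.time() if now is None else now
    return {
        # HTTP dates are always expressed in GMT.
        "Last-Modified": formatdate(now, usegmt=True),
        "Expires": formatdate(now + seconds, usegmt=True),
        "Cache-Control": "max-age=%d" % seconds,
    }


for name, value in cache_headers(600).items():
    print("%s: %s" % (name, value))
```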

Implementing a Per-View Cache

Django’s caching makes it possible to implement the cache at the view level as well. Instead of caching every page in the website, you might want to cache only a few specific views.

Use the django.views.decorators.cache.cache_page decorator function to implement caching for a specific view. The cache_page decorator function caches the web page generated by a view function. The cache_page decorator accepts one argument that specifies how many seconds to keep the web page cached.

The following code shows an example of implementing the cache_page decorator function to cache the web page generated by myView for 3 minutes:

from django.views.decorators.cache import cache_page

@cache_page(180)
def myView(request):
    . . .

By the Way

The cache_page decorator keys off the URL, just like the CacheMiddleware framework. Different URLs that point to the same view are cached as different entries.
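The effect of keying on the URL can be sketched with a small stand-in decorator. Everything here (cache_page_sketch, FakeRequest, and the module-level dictionary) is hypothetical and much simpler than Django’s cache_page; it only shows why two URLs that reach the same view produce two cache entries.

```python
import functools
import time

_page_cache = {}  # URL path -> (expires_at, response)


def cache_page_sketch(seconds):
    """Simplified stand-in for a per-view cache keyed on the request path."""
    def decorator(view):
        @functools.wraps(view)
        def wrapper(request, *args, **kwargs):
            entry = _page_cache.get(request.path)
            if entry is not None and entry[0] > time.time():
                return entry[1]  # cache hit: the view is not called
            response = view(request, *args, **kwargs)
            _page_cache[request.path] = (time.time() + seconds, response)
            return response
        return wrapper
    return decorator


class FakeRequest:
    """Hypothetical request object carrying only a URL path."""
    def __init__(self, path):
        self.path = path


@cache_page_sketch(180)
def my_view(request):
    return "page generated for %s" % request.path


print(my_view(FakeRequest("/blogs/1/")))
print(my_view(FakeRequest("/blogs/2/")))
```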

Implementing a Per-Object Cache

Django provides a low-level cache API that allows you to access the cache from your Python code. Instead of caching entire pages, you may want to cache only specific data that will be used to render the display.

The django.core.cache.cache.set(key, value, timeout_seconds) function allows you to store any Python object that can be pickled in the cache. The set() function accepts three arguments—key, value, and timeout_seconds. The key argument is a string used to reference the object. The value argument is the object to be cached. The timeout_seconds argument specifies the number of seconds to cache the object.

The following code stores a list of Blog objects in the cache for 25 seconds:

from django.core.cache import cache
blogs = Blog.objects.all()
cache.set('Blog_List', blogs, 25)

The django.core.cache.cache.get(key) function accesses the cache and returns the value of the entry in the cache. If the entry is not found, None is returned. For example, the following code accesses the Blog list stored in the cache using the preceding example:

blogs = cache.get('Blog_List')

By the Way

The get() function can also accept a second argument that specifies a value to be returned instead of None if no entry is found:

blogs = cache.get('Blog_List', [])
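A common way to combine get() and set() is to try the cache first and fall back to the expensive query only on a miss. The following sketch shows that pattern. The get_or_load() helper and the DictCache stand-in are hypothetical; DictCache ignores the timeout and exists only so that the example runs without a Django project.

```python
def get_or_load(cache, key, loader, timeout_seconds):
    """Return the cached value for key, calling loader() only on a miss."""
    value = cache.get(key)
    if value is None:
        value = loader()
        cache.set(key, value, timeout_seconds)
    return value


class DictCache:
    """Hypothetical stand-in with the same get()/set() shape as the cache API."""
    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout_seconds):
        self._data[key] = value  # expiry is ignored in this stand-in


cache = DictCache()
blogs = get_or_load(cache, 'Blog_List', lambda: ['first post'], 25)
print(blogs)
```

In a real view you would pass Django’s cache object and a loader that runs the query, such as a function returning Blog.objects.all(). Because None signals a miss, this pattern cannot distinguish a cached None from an empty cache.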

The django.core.cache.cache.get_many(key_list) function accesses the cache and returns the values of multiple cache entries. The get_many() function accepts a list of keys as its only argument. It returns a dictionary containing the keys from the argument and their corresponding values in the cache. If an entry is not found or is expired, it is not included in the dictionary.

For example, the following code returns a dictionary containing the Date and User entries in the cache:

from datetime import datetime
from django.core.cache import cache
cache.set('User', request.user, 60)
cache.set('Date', datetime.now(), 60)
. . .
cache.get_many(['User', 'Date'])

Did you Know?

The cache API is key-based, so you can store an object in one view function and retrieve it in another.

The django.core.cache.cache.delete(key) function deletes the entry specified by the key argument in the cache. The delete() function has no return value and does not raise an error if the key is not found in the cache. The following example deletes the Blog_List entry from the cache:

cache.delete('Blog_List')
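The behavior described for looking up several keys at once and for deleting a key can be sketched over a plain dictionary. These two functions are hypothetical stand-ins that mimic the described semantics; they are not the cache API itself.

```python
import time


def get_many(store, keys, now=None):
    """Return only the keys that exist in store and have not expired."""
    now = time.time() if now is None else now
    found = {}
    for key in keys:
        entry = store.get(key)  # entry is (expires_at, value)
        if entry is not None and entry[0] > now:
            found[key] = entry[1]
    return found


def delete(store, key):
    """Remove key if present; a missing key is silently ignored."""
    store.pop(key, None)


store = {'User': (time.time() + 60, 'alice')}
print(get_many(store, ['User', 'Date']))  # 'Date' was never stored
delete(store, 'Blog_List')                # no error for a missing key
```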

Managing Upstream Caching

So far in this hour we have discussed how to implement caching on your own website. Web pages are also cached upstream from your website by ISPs, proxies, and even web browsers. Upstream caching provides a major boost to the Internet’s efficiency, but it can also pose a couple of problems and security holes. For example, a home page that contains personal data about a user may be cached. A subsequent request to that home page would display that user’s information in another user’s browser.

The HTTP protocol solves these types of problems using Vary and Cache-Control headers. They allow websites to define some behavior and access requirements before cached pages are distributed. The following sections discuss how to implement these headers in your view functions.

Allowing Cached Pages to Vary Based on Headers

The Vary header allows you to name the request headers that an upstream cache engine checks when building its cache key. The cached page is then served only if the values of those headers in the incoming request match the values stored with the cached copy.
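One way to picture this is a cache key built from the URL plus the value of every header named in Vary. The upstream_cache_key() function below is a hypothetical sketch of that idea; real proxies and browsers have their own key formats.

```python
def upstream_cache_key(url, request_headers, vary):
    """Combine the URL with the request headers listed in a Vary value."""
    names = [name.strip().lower() for name in vary.split(",") if name.strip()]
    # Header names are case-insensitive, so compare them in lowercase.
    lowered = {k.lower(): v for k, v in request_headers.items()}
    parts = [url] + ["%s=%s" % (n, lowered.get(n, "")) for n in sorted(names)]
    return "|".join(parts)


firefox = upstream_cache_key("/home/", {"User-Agent": "Firefox"}, "User-Agent")
safari = upstream_cache_key("/home/", {"User-Agent": "Safari"}, "User-Agent")
print(firefox != safari)  # different browsers get separate cache entries
```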

The Vary header can be set in several different ways in the view function. The simplest way is to set the header manually in the HttpResponse object using the following syntax:

from django.http import HttpResponse

def myView(request):
    . . .
    response = HttpResponse()
    response['Vary'] = 'User-Agent'
    return response

Setting the Vary header manually in this way can potentially overwrite items that are already there. Django provides the django.views.decorators.vary.vary_on_headers() decorator function so that you can easily add headers to the Vary header for the view function.

The vary_on_headers() decorator function adds headers to the Vary header instead of overwriting headers that are already there. The vary_on_headers() decorator function can accept multiple headers as arguments. For example, the following code adds both the User-Agent and Content-Language headers to the Vary header:

from django.views.decorators.vary import vary_on_headers
@vary_on_headers('User-Agent', 'Content-Language')
def myView(request):
    . . .

Another useful function to modify the Vary header is the django.utils.cache.patch_vary_headers(response, [headers]) function. The patch_vary_headers() function requires a response object as the first argument and a list of headers as the second. All headers listed in the second argument are added to the Vary header of the response object. For example, the following code adds the User-Agent and Content-Language headers to the Vary header inside the view function:

from django.http import HttpResponse
from django.utils.cache import patch_vary_headers

def myView(request):
    . . .
    response = HttpResponse()
    patch_vary_headers(response, ['User-Agent', 'Content-Language'])
    return response

One of the biggest advantages of using the patch_vary_headers() function is that you can selectively set which headers to add using code inside the view function. For example, you might want to add the Cookie header only if your view function actually sets a cookie.

By the Way

The values that get passed to vary_on_headers() and patch_vary_headers() are not case-sensitive. For example, the header user-agent is the same as User-Agent.
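The difference between adding to the Vary header and overwriting it can be sketched with a plain dictionary in place of a response object. The add_vary() function is a hypothetical stand-in that mimics the add-without-duplicates behavior described above, including the case-insensitive comparison; it is not Django’s implementation.

```python
def add_vary(headers, new_names):
    """Append header names to headers['Vary'] without creating duplicates."""
    existing = [h.strip() for h in headers.get("Vary", "").split(",") if h.strip()]
    seen = {h.lower() for h in existing}
    for name in new_names:
        if name.lower() not in seen:  # user-agent and User-Agent are the same
            existing.append(name)
            seen.add(name.lower())
    headers["Vary"] = ", ".join(existing)


headers = {"Vary": "User-Agent"}
add_vary(headers, ["user-agent", "Cookie"])
print(headers["Vary"])
```

The existing User-Agent entry is kept, the lowercase duplicate is ignored, and only Cookie is added.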

One of the most common headers that you will want to add to the Vary header is the Cookie header. For that reason, Django provides the django.views.decorators.vary.vary_on_cookie() decorator function to add just the Cookie header to the Vary header. The vary_on_cookie() decorator function does not accept any parameters and simply adds the Cookie header to Vary:

from django.views.decorators.vary import vary_on_cookie
@vary_on_cookie
def myView(request):
    . . .

Controlling Cached Pages Using the Cache-Control Header

One of the biggest problems with caching is keeping data that should remain private, private. Users basically use two types of caches—the private cache stored in the user’s web browser, and the public cache stored by ISPs or other upstream caches. Private data, such as credit card numbers and account numbers, should only be stored in the private cache.

HTTP handles the issue of keeping data private using the Cache-Control header. The Cache-Control header allows you to define directives that caching engines will use to determine if data is public or private and if it should even be cached.

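The following are the currently valid directives for the Cache-Control header, shown in the keyword-argument form that you pass to Django (underscores take the place of the hyphens used in the header itself):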

  • public=True

  • private=True

  • no_cache=True

  • no_store=True

  • no_transform=True

  • must_revalidate=True

  • proxy_revalidate=True

  • max_age=num_seconds

  • s_maxage=num_seconds

Django provides the django.views.decorators.cache.cache_control() decorator function to configure the directives in the Cache-Control header. The cache_control() decorator function accepts any valid Cache-Control directive as an argument. For example, the following code sets the private and max_age directives in the Cache-Control header for a view function:

from django.views.decorators.cache import cache_control
@cache_control(private=True, max_age=600)
def myView(request):
    . . .

By the Way

The max_age directive in the Cache-Control header is set to the value of CACHE_MIDDLEWARE_SECONDS if it is specified in the settings.py file. The value you add to the max_age directive in the cache_control() decorator function takes precedence.
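The keyword arguments map onto the text of the header in a simple way: underscores become hyphens, and an argument set to True becomes a bare directive. The build_cache_control() function below is a hypothetical sketch of that mapping, not the code Django uses.

```python
def build_cache_control(**directives):
    """Turn keyword arguments into a Cache-Control header value."""
    parts = []
    for name, value in directives.items():
        token = name.replace("_", "-")  # max_age -> max-age
        if value is True:
            parts.append(token)  # flags such as private carry no value
        else:
            parts.append("%s=%s" % (token, value))
    return ", ".join(parts)


print(build_cache_control(private=True, max_age=600))
```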

Summary

In this hour, we discussed how to configure caching for your website using different types of backends. You also learned that you can implement caching at the site level using the CacheMiddleware framework. You can implement caching at the view level using the cache_page() decorator function. You also can implement caching at the object level using a low-level API that allows you to get, set, and delete items in the cache.

We also discussed how to use the Vary and Cache-Control headers to manage how upstream caches cache web pages.

Q&A

Q.

Does Django’s CacheMiddleware framework support the Vary and Cache-Control headers?

A.

Yes. The CacheMiddleware framework conforms to the Vary and Cache-Control specifications.

Q.

Where can I go to better understand the Cache-Control and Vary headers?

A.

The header definitions can be found at www.w3.org/Protocols/rfc2616/rfc2616-sec14.html.

Workshop

The workshop consists of a set of questions and answers designed to solidify your understanding of the material covered in this hour. Try answering the questions before looking at the answers.

Quiz

1.

What types of caching backends could you use if you wanted the cache to be stored in a persistent state?

2.

What type of cache should you use if you want web pages from only two specific views cached?

3.

How can you cache an instance of a Django model?

4.

What function would you use inside a view function to add the Content-Language header to the Vary header?

Quiz Answers

1.

The db and file backends.

2.

The view-level cache using the cache_page() decorator function.

3.

Use the django.core.cache.cache.set() function.

4.

The django.utils.cache.patch_vary_headers() function.

Exercises

1.

Use the caching low-level API to modify the index() view function in the iFriends/People/views.py file to use the PersonList key to store and retrieve the current list of Person objects from the cache. That way, the home_view() and index() functions can access the same list from the cache.

2.

Use the cache_control() decorator function to add directives to the Cache-Control headers in the response of a view function.

3.

Use the vary_on_headers() decorator function to add headers to the Vary header in the response of a view function.
