Retrieval patterns

This section contains design patterns that deal with accessing model properties or performing queries on them.

Pattern – property field

Problem: Models have attributes that are implemented as methods. However, these attributes should not be persisted to the database.

Solution: Use the property decorator on such methods.

Problem details

Model fields store per-instance attributes, such as first name, last name, birthday, and so on. They are also stored in the database. However, we also need to access some derived attributes, such as full name or age.

They can be easily calculated from the database fields, hence need not be stored separately. In some cases, they can just be a conditional check such as eligibility for offers based on age, membership points, and active status.

A straightforward way to implement this is to define functions, such as get_age similar to the following:

import datetime

from django.db import models

class BaseProfile(models.Model):
    birthdate = models.DateField()
    #...
    def get_age(self):
        today = datetime.date.today()
        # Subtract one if this year's birthday has not occurred yet
        return (today.year - self.birthdate.year) - int(
            (today.month, today.day) <
            (self.birthdate.month, self.birthdate.day))

Calling profile.get_age() returns the user's age: the difference in years, reduced by one if this year's birthday (month and day) has not yet occurred.

However, it is much more readable (and Pythonic) to call it profile.age.

Solution details

Python classes can treat a function as an attribute using the property decorator. Django models can use it as well. In the previous example, replace the function definition line with:

    @property
    def age(self):

Now, we can access the user's age with profile.age. Notice that the function's name is shortened as well.
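As a minimal, framework-free sketch of the same idea, the following plain Python class stands in for the model. The Profile class and the age_on helper are hypothetical names introduced here so that the calculation can be checked against a fixed date; they are not part of Django.

```python
import datetime


def age_on(birthdate, today):
    """Return the number of completed years between birthdate and today."""
    # Subtract one if this year's birthday has not happened yet.
    had_birthday = (today.month, today.day) >= (birthdate.month, birthdate.day)
    return today.year - birthdate.year - (0 if had_birthday else 1)


class Profile:
    """Plain-Python stand-in for the model; not a Django class."""

    def __init__(self, birthdate):
        self.birthdate = birthdate

    @property
    def age(self):
        # Read like an attribute, but computed on every access.
        return age_on(self.birthdate, datetime.date.today())


profile = Profile(datetime.date(1990, 6, 15))
print(profile.age)  # no parentheses needed
```

Keeping the arithmetic in a separate function that takes the current date as an argument is a design choice of this sketch; it makes the edge case around the birthday easy to test.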

An important shortcoming of a property is that it is invisible to the ORM, just like model methods are. You cannot use it in a QuerySet object. For example, Profile.objects.exclude(age__lt=18) will not work.

It might also be a good idea to define a property to hide the details of internal classes. This is formally known as the Law of Demeter. Simply put, the law states that you should only access your own direct members or "use only one dot".

For example, rather than accessing profile.birthdate.year, it is better to define a profile.birthyear property. It helps you hide the underlying structure of the birthdate field this way.
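A small sketch of such a wrapper property, again using a plain Python class as a hypothetical stand-in for the model:

```python
import datetime


class Profile:
    """Plain-Python stand-in for the model; not a Django class."""

    def __init__(self, birthdate):
        self.birthdate = birthdate

    @property
    def birthyear(self):
        # Callers no longer need to know that birthdate is a date object.
        return self.birthdate.year


profile = Profile(datetime.date(1990, 6, 15))
print(profile.birthyear)  # one dot instead of profile.birthdate.year
```

If birthdate were later stored differently, only the body of birthyear would need to change.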

Tip

Best Practice

Follow the law of Demeter, and use only one dot when accessing a property.

An undesirable side effect of this law is that it leads to the creation of several wrapper properties in the model. This could bloat up models and make them hard to maintain. Use the law to improve your model's API and reduce coupling wherever it makes sense.

Cached properties

Each time we call a property, we are recalculating a function. If it is an expensive calculation, we might want to cache the result. This way, the next time the property is accessed, the cached value is returned.

from django.utils.functional import cached_property
    #...
    @cached_property
    def full_name(self):
        # Expensive operation e.g. external service call
        return "{0} {1}".format(self.firstname, self.lastname)

The cached value will be saved as a part of the Python instance. As long as the instance exists, the same value will be returned.

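As a failsafe mechanism, you might want to force the execution of the expensive operation to ensure that stale values are not returned. Since cached_property itself accepts no arguments, one option is to delete the attribute from the instance (del profile.full_name), which makes the next access recompute the value. Another is to expose a regular method that takes a keyword argument such as cached=False to bypass the cached value.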
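The caching behaviour can be sketched without Django by using the standard library's functools.cached_property (Python 3.8+) as a stand-in; Django's cached_property likewise stores the result on the instance. The Profile class and its calls counter are hypothetical, added only to make the caching visible. Deleting the attribute discards the cached value, so the next access recomputes it.

```python
from functools import cached_property


class Profile:
    """Plain-Python stand-in for the model; not a Django class."""

    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname
        self.calls = 0  # counts how often the "expensive" body runs

    @cached_property
    def full_name(self):
        self.calls += 1
        return "{0} {1}".format(self.firstname, self.lastname)


profile = Profile("Ada", "Lovelace")
profile.full_name
profile.full_name
print(profile.calls)      # the body ran once; the second access hit the cache

profile.lastname = "King"
print(profile.full_name)  # still the cached (now stale) value
del profile.full_name     # discard the cached value
print(profile.full_name)  # recomputed from the current fields
```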

Pattern – custom model managers

Problem: Certain queries on models are defined and accessed repeatedly throughout the code violating the DRY principle.

Solution: Define custom managers to give meaningful names to common queries.

Problem details

Every Django model has a default manager called objects. Invoking objects.all() returns all the entries for that model in the database. Usually, we are interested in only a subset of all entries.

We apply various filters to find out the set of entries we need. The criterion to select them is often our core business logic. For example, we can find the posts accessible to the public by the following code:

public = Post.objects.filter(privacy="public")

This criterion might change in the future. Say, we might want to also check whether the post was marked for editing. This change might look like this:

public = Post.objects.filter(privacy=POST_PRIVACY.Public,
         draft=False)

However, this change needs to be made everywhere a public post is needed. This can get very frustrating. There needs to be only one place to define such commonly used queries without 'repeating oneself'.

Solution details

QuerySets are an extremely powerful abstraction. They are lazily evaluated only when needed. Hence, building longer QuerySets by method-chaining (a form of fluent interface) does not affect the performance.

In fact, as more filtering is applied, the result dataset shrinks. This usually reduces the memory consumption of the result.
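Lazy evaluation can be illustrated with a toy class. LazyQuery below is a hypothetical stand-in, not Django's implementation: it only records filters and does no work until it is iterated. The analogy is loose, because a real QuerySet compiles its filters into a single SQL query instead of looping in Python.

```python
class LazyQuery:
    """Toy stand-in for a QuerySet: records filters, runs them on iteration."""

    def __init__(self, rows, predicates=()):
        self._rows = rows
        self._predicates = tuple(predicates)

    def filter(self, predicate):
        # Return a new object; nothing is evaluated yet.
        return LazyQuery(self._rows, self._predicates + (predicate,))

    def __iter__(self):
        # Work happens only here, when the results are actually needed.
        for row in self._rows:
            if all(p(row) for p in self._predicates):
                yield row


rows = [
    {"privacy": "public", "draft": False},
    {"privacy": "public", "draft": True},
    {"privacy": "private", "draft": False},
]
query = LazyQuery(rows).filter(lambda r: r["privacy"] == "public")
query = query.filter(lambda r: not r["draft"])  # still unevaluated
print(list(query))  # evaluation happens here
```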

A model manager is a convenient interface for a model to get its QuerySet object. In other words, managers help you use Django's ORM to access the underlying database. In fact, managers are implemented as very thin wrappers around a QuerySet object. Notice the identical interface:

>>> Post.objects.filter(posted_by__username="a")
[<Post: a: Hello World>, <Post: a: This is Private!>]

>>> Post.objects.get_queryset().filter(posted_by__username="a")
[<Post: a: Hello World>, <Post: a: This is Private!>]

The default manager created by Django, objects, has several methods, such as all, filter, or exclude that return QuerySets. However, they only form a low-level API to your database.

Custom managers are used to create a domain-specific, higher-level API. This is not only more readable but less affected by implementation details. Thus, you are able to work at a higher level of abstraction closely modeled to your domain.

Our previous example for public posts can be easily converted into a custom manager as follows:

# managers.py
from django.db.models.query import QuerySet

class PostQuerySet(QuerySet):
    def public_posts(self):
        return self.filter(privacy="public")

PostManager = PostQuerySet.as_manager

This convenient shortcut for creating a custom manager from a QuerySet object appeared in Django 1.7. Unlike earlier approaches, this PostManager object is chainable like the default objects manager.

It sometimes makes sense to replace the default objects manager with our custom manager, as shown in the following code:

from .managers import PostManager
class Post(Postable):
    ...
    objects = PostManager()

By doing this, to access public_posts our code gets considerably simplified to the following:

public = Post.objects.public_posts()

Since the returned value is a QuerySet, it can be filtered further:

public_apology = Post.objects.public_posts().filter(
                  message__startswith="Sorry")
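The main benefit is that the definition of "public" lives in one method. The sketch below shows this with a hypothetical in-memory stand-in, PostList, which is backed by a plain list of dicts and is not a Django class. Adding the draft check touches only public_posts, and every caller picks it up.

```python
class PostList(list):
    """Toy stand-in for a QuerySet, backed by a plain list of dicts."""

    def filter(self, **conditions):
        # Keep rows whose fields equal every given condition.
        return PostList(
            row for row in self
            if all(row[key] == value for key, value in conditions.items())
        )

    def public_posts(self):
        # The one place where "public" is defined.
        return self.filter(privacy="public", draft=False)


posts = PostList([
    {"message": "Hello World", "privacy": "public", "draft": False},
    {"message": "Work in progress", "privacy": "public", "draft": True},
    {"message": "This is Private!", "privacy": "private", "draft": False},
])
print(posts.public_posts())
# The result is still a PostList, so it can be narrowed further.
print(posts.public_posts().filter(message="Hello World"))
```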

QuerySets have several interesting properties. In the next few sections, we will take a look at some common patterns that involve combining QuerySets.

Set operations on QuerySets

True to their name (or the latter half of their name), QuerySets support a lot of (mathematical) set operations. For the sake of illustration, consider two QuerySets that contain the user objects:

>>> q1 = User.objects.filter(username__in=["a", "b", "c"])
>>> q1
[<User: a>, <User: b>, <User: c>]
>>> q2 = User.objects.filter(username__in=["c", "d"])
>>> q2
[<User: c>, <User: d>]

Some set operations that you can perform on them are as follows:

  • Union: This combines and removes duplicates. Use q1 | q2 to get [<User: a>, <User: b>, <User: c>, <User: d>]
  • Intersection: This finds common items. Use q1 & q2 to get [<User: c>]
  • Difference: This removes elements in second set from first. There is no logical operator for this. Instead use q1.exclude(pk__in=q2) to get [<User: a>, <User: b>]
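For comparison, here is how the three operations look on Python's built-in set type, with plain sets of usernames standing in for the two QuerySets. Note that sets support the - operator for difference, whereas QuerySets need the exclude() form shown above.

```python
# Plain sets of usernames standing in for the two QuerySets.
q1 = {"a", "b", "c"}
q2 = {"c", "d"}

print(sorted(q1 | q2))  # union
print(sorted(q1 & q2))  # intersection
print(sorted(q1 - q2))  # difference
```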

The same operations can be done using the Q objects:

from django.db.models import Q

# Union
>>> User.objects.filter(Q(username__in=["a", "b", "c"]) | Q(username__in=["c", "d"]))
[<User: a>, <User: b>, <User: c>, <User: d>]

# Intersection
>>> User.objects.filter(Q(username__in=["a", "b", "c"]) & Q(username__in=["c", "d"]))
[<User: c>]

# Difference
>>> User.objects.filter(Q(username__in=["a", "b", "c"]) & ~Q(username__in=["c", "d"]))
[<User: a>, <User: b>]

Note that the difference is implemented using & (AND) and ~ (Negation). The Q objects are very powerful and can be used to build very complex queries.

However, the Set analogy is not perfect. QuerySets, unlike mathematical sets, are ordered. So, they are closer to Python's list data structure in that respect.

Chaining multiple QuerySets

So far, we have been combining QuerySets of the same type belonging to the same base class. However, we might need to combine QuerySets from different models and perform operations on them.

For example, a user's activity timeline contains all their posts and comments in reverse chronological order. The previous methods of combining QuerySets won't work. A naïve solution would be to convert them to lists, concatenate, and sort them, like this:

>>> recent = list(posts) + list(comments)
>>> sorted(recent, key=lambda e: e.modified, reverse=True)[:3]
[<Post: user: Post1>, <Comment: user: Comment1>, <Post: user: Post0>]

Unfortunately, this operation forces both lazy QuerySets to be evaluated. The combined memory usage of the two lists can be overwhelming. Besides, it can be quite slow to convert large QuerySets into lists.

A much better solution uses iterators to reduce the memory consumption. Use the itertools.chain function to combine multiple QuerySets as follows:

>>> from itertools import chain
>>> recent = chain(posts, comments)
>>> sorted(recent, key=lambda e: e.modified, reverse=True)[:3]
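Since sorted() still builds a full list internally, one possible refinement when only the first few entries are needed is heapq.nlargest, which keeps just the top items while consuming the chained iterator. This is a sketch and a swapped-in technique, not the approach described above; the Entry records are hypothetical stand-ins for model instances with a modified attribute.

```python
import datetime
from collections import namedtuple
from heapq import nlargest
from itertools import chain

# Hypothetical stand-in for Post and Comment instances.
Entry = namedtuple("Entry", "title modified")

posts = [
    Entry("Post0", datetime.date(2015, 1, 1)),
    Entry("Post1", datetime.date(2015, 1, 5)),
]
comments = [
    Entry("Comment0", datetime.date(2014, 12, 30)),
    Entry("Comment1", datetime.date(2015, 1, 3)),
]

# chain() yields items from each iterable in turn without building a
# combined list; nlargest() keeps only the top three while consuming it.
recent = nlargest(3, chain(posts, comments), key=lambda e: e.modified)
print([e.title for e in recent])
```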

Once you evaluate a QuerySet, the cost of hitting the database can be quite high. So, it is important to delay it as long as possible by performing only operations that will return QuerySets unevaluated.

Tip

Keep QuerySets unevaluated as long as possible.
