Designing URLs

Django has one of the most flexible URL schemes among web frameworks. Basically, there is no implied URL scheme. You can explicitly define any URL scheme you like using appropriate regular expressions.

However, as superheroes love to say—"With great power comes great responsibility." You cannot get away with a sloppy URL design any more.

URLs used to be ugly because they were considered to be ignored by users. Back in the 90s when portals used to be popular, the common assumption was that your users will come through the front door, that is, the home page. They will navigate to the other pages of the site by clicking on links.

Search engines have changed all that. According to a 2013 research report, nearly half (47 percent) of all visits originate from a search engine. This means that any page in your website, depending on the search relevance and popularity can be the first page your user sees. Any URL can be the front door.

More importantly, Browsing 101 taught us security. Don't click on a blue link in the wild, we warn beginners. Read the URL first. Is it really your bank's URL or a site trying to phish your login details?

Today, URLs have become part of the user interface. They are seen, copied, shared, and even edited. Make them look good and understandable from a glance. No more eye sores such as:

http://example.com/gallery/default.asp?sid=9DF4BC0280DF12D3ACB60090271E26A8&command=commntform

Short and meaningful URLs are not only appreciated by users but also by search engines. URLs that are long and have less relevance to the content adversely affect your site's search engine rankings.

Finally, as implied by the maxim "Cool URIs don't change," you should try to maintain your URL structure over time. Even if your website is completely redesigned, your old links should still work. Django makes it easy to ensure that this is so.

Before we delve into the details of designing URLs, we need to understand the structure of a URL.

URL anatomy

Technically, URLs belong to a more general family of identifiers called Uniform Resource Identifiers (URIs). Hence, a URL has the same structure as a URI.

A URI is composed of several parts:

URI = Scheme + Net Location + Path + Query + Fragment

For example, a URI (http://dev.example.com:80/gallery/videos?id=217#comments) can be deconstructed in Python using the urlparse function:

>>> from urllib.parse import urlparse
>>> urlparse("http://dev.example.com:80/gallery/videos?id=217#comments")
ParseResult(scheme='http', netloc='dev.example.com:80', path='/gallery/videos', params='', query='id=217', fragment='comments')

The URI parts can be depicted graphically as follows:

URL anatomy

Even though Django documentation prefers to use the term URLs, it might more technically correct to say that you are working with URIs most of the time. We will use the terms interchangeably in this book.

Django URL patterns are mostly concerned about the 'Path' part of the URI. All other parts are tucked away.

What happens in urls.py?

It is often helpful to consider urls.py as the entry point of your project. It is usually the first file I open when I study a Django project. Essentially, urls.py contains the root URL configuration or URLConf of the entire project.

It would be a Python list returned from patterns assigned to a global variable called urlpatterns. Each incoming URL is matched with each pattern from top to bottom in a sequence. In the first match, the search stops, and the request is sent to the corresponding view.

Here, in considerably simplified form, is an excerpt of urls.py from Python.org, which was recently rewritten in Django:

urlpatterns = patterns(
    '',
    # Homepage
    url(r'^$', views.IndexView.as_view(), name='home'),
    # About
    url(r'^about/$',
        TemplateView.as_view(template_name="python/about.html"),
        name='about'),
    # Blog URLs
    url(r'^blogs/', include('blogs.urls', namespace='blog')),
    # Job archive
    url(r'^jobs/(?P<pk>d+)/$',
        views.JobArchive.as_view(),
        name='job_archive'),
    # Admin
    url(r'^admin/', include(admin.site.urls)),
)

Some interesting things to note here are as follows:

  • The first argument of the patterns function is the prefix. It is usually blank for the root URLConf. The remaining arguments are all URL patterns.
  • Each URL pattern is created using the url function, which takes five arguments. Most patterns have three arguments: the regular expression pattern, view callable, and name of the view.
  • The about pattern defines the view by directly instantiating TemplateView. Some hate this style since it mentions the implementation, thereby violating separation of concerns.
  • Blog URLs are mentioned elsewhere, specifically in urls.py inside the blogs app. In general, separating an app's URL pattern into its own file is good practice.
  • The jobs pattern is the only example here of a named regular expression.

In future versions of Django, urlpatterns should be a plain list of URL pattern objects rather than arguments to the patterns function. This is great for sites with lots of patterns, since urlpatterns being a function can accept only a maximum of 255 arguments.

If you are new to Python regular expressions, you might find the pattern syntax to be slightly cryptic. Let's try to demystify it.

The URL pattern syntax

URL regular expression patterns can sometimes look like a confusing mass of punctuation marks. However, like most things in Django, it is just regular Python.

It can be easily understood by knowing that URL patterns serve two functions: to match URLs appearing in a certain form, and to extract the interesting bits from a URL.

The first part is easy. If you need to match a path such as /jobs/1234, then just use the "^jobs/d+" pattern (here d stands for a single digit from 0 to 9). Ignore the leading slash, as it gets eaten up.

The second part is interesting because, in our example, there are two ways of extracting the job ID (that is, 1234), which is required by the view.

The simplest way is to put a parenthesis around every group of values to be captured. Each of the values will be passed as a positional argument to the view. For example, the "^jobs/(d+)" pattern will send the value "1234" as the second argument (the first being the request) to the view.

The problem with positional arguments is that it is very easy to mix up the order. Hence, we have name-based arguments, where each captured value can be named. Our example will now look like "^jobs/(?P<pk>d+)/" . This means that the view will be called with a keyword argument pk being equal to "1234".

If you have a class-based view, you can access your positional arguments in self.args and name-based arguments in self.kwargs. Many generic views expect their arguments solely as name-based arguments, for example, self.kwargs["slug"].

Mnemonic – parents question pink action-figures

I admit that the syntax for name-based arguments is quite difficult to remember. Often, I use a simple mnemonic as a memory aid. The phrase "Parents Question Pink Action-figures" stands for the first letters of Parenthesis, Question mark, (the letter) P, and Angle brackets.

Put them together and you get (?P< . You can enter the name of the pattern and figure out the rest yourself.

It is a handy trick and really easy to remember. Just imagine a furious parent holding a pink-colored hulk action figure.

Another tip is to use an online regular expression generator such as http://pythex.org/ or https://www.debuggex.com/ to craft and test your regular expressions.

Names and namespaces

Always name your patterns. It helps in decoupling your code from the exact URL paths. For instance, in the previous URLConf, if you want to redirect to the about page, it might be tempting to use redirect("/about"). Instead, use redirect("about"), as it uses the name rather than the path.

Here are some more examples of reverse lookups:

>>> from django.core.urlresolvers import reverse
>>> print(reverse("home"))
"/"
>>> print(reverse("job_archive", kwargs={"pk":"1234"}))
"jobs/1234/"

Names must be unique. If two patterns have the same name, they will not work. So, some Django packages used to add prefixes to the pattern name. For example, an application named blog might have to call its edit view as 'blog-edit' since 'edit' is a common name and might cause conflict with another application.

Namespaces were created to solve such problems. Pattern names used in a namespace have to be only unique within that namespace and not the entire project. It is recommended that you give every app its own namespace. For example, we can create a 'blog' namespace with only the blog's URLs by including this line in the root URLconf:

url(r'^blog/', include('blog.urls', namespace='blog')),

Now the blog app can use pattern names, such as 'edit' or anything else as long as they are unique within that app. While referring to a name within a namespace, you will need to mention the namespace, followed by a ':' before the name. It would be "blog:edit" in our example.

As Zen of Python says—"Namespaces are one honking great idea—let's do more of those." You can create nested namespaces if it makes your pattern names cleaner, such as "blog:comment:edit". I highly recommend that you use namespaces in your projects.

Pattern order

Order your patterns to take advantage of how Django processes them, that is, top-down. A good rule of thumb is to keep all the special cases at the top. Broader patterns can be mentioned further down. The broadest—a catch-all—if present, can go at the very end.

For example, the path to your blog posts might be any valid set of characters, but you might want to handle the About page separately. The right sequence of patterns should be as follows:

urlpatterns = patterns(
    '',
    url(r'^about/$', AboutView.as_view(), name='about'),
    url(r'^(?P<slug>w+)/$', ArticleView.as_view(), name='article'),
)  

If we reverse the order, then the special case, the AboutView, will never get called.

URL pattern styles

Designing URLs of a site consistently can be easily overlooked. Well-designed URLs can not only logically organize your site but also make it easy for users to guess paths. Poorly designed ones can even be a security risk: say, using a database ID (which occurs in a monotonic increasing sequence of integers) in a URL pattern can increase the risk of information theft or site ripping.

Let's examine some common styles followed in designing URLs.

Departmental store URLs

Some sites are laid out like Departmental stores. There is a section for Food, inside which there would be an aisle for Fruits, within which a section with different varieties of Apples would be arranged together.

In the case of URLs, this means that you will find these pages arranged hierarchically as follows:

http://site.com/ <section> / <sub-section> / <item>

The beauty of this layout is that it is so easy to climb up to the parent section. Once you remove the tail end after the slash, you are one level up.

For example, you can create a similar structure for the articles section, as shown here:

# project's main urls.py
urlpatterns = patterns(
    '',
    url(r'^articles/$', include(articles.urls), namespace="articles"),
)

# articles/urls.py
urlpatterns = patterns(
    '',
    url(r'^$', ArticlesIndex.as_view(), name='index'),
    url(r'^(?P<slug>w+)/$', ArticleView.as_view(), name='article'),
)

Notice the 'index' pattern that will show an article index in case a user climbs up from a particular article.

RESTful URLs

In 2000, Roy Fielding introduced the term Representational state transfer (REST) in his doctoral dissertation. Reading his thesis (http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm) is highly recommended to better understand the architecture of the web itself. It can help you write better web applications that do not violate the core constraints of the architecture.

One of the key insights is that a URI is an identifier to a resource. A resource can be anything, such as an article, a user, or a collection of resources, such as events. Generally speaking, resources are nouns.

The web provides you with some fundamental HTTP verbs to manipulate resources: GET, POST, PUT, PATCH, and DELETE. Note that these are not part of the URL itself. Hence, if you use a verb in the URL to manipulate a resource, it is a bad practice.

For example, the following URL is considered bad:

http://site.com/articles/submit/

Instead, you should remove the verb and use the POST action to this URL:

http://site.com/articles/

Tip

Best Practice

Keep verbs out of your URLs if HTTP verbs can be used instead.

Note that it is not wrong to use verbs in a URL. The search URL for your site can have the verb 'search' as follows, since it is not associated with one resource as per REST:

http://site.com/search/?q=needle

RESTful URLs are very useful for designing CRUD interfaces. There is almost a one-to-one mapping between the Create, Read, Update, and Delete database operations and the HTTP verbs.

Note that the RESTful URL style is complimentary to the departmental store URL style. Most sites mix both the styles. They are separated for clarity and better understanding.

Tip

Downloading the example code

You can download the example code fies for all Packt books you have purchasedfrom your account at http://www.packtpub.com. If you purchased this bookelsewhere, you can visit http://www.packtpub.com/support and register tohave the fies e-mailed directly to you. Pull requests and bug reports to the SuperBook project can be sent to https://github.com/DjangoPatternsBook/superbook.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.140.185.170