Status page

Our status page needs to show the user the status of their complete infrastructure in one view. To do so, a table feels like the appropriate design component. As the most important information to the user is going to be the latest status of their servers, our status page will need to show only one table row per node and list only the latest values for each different data type that we have in the database for that node.

For the sample data that I added to my database, we would ideally want a table similar to this on the status page:

[Figure: Status page]

As you can see, we only mention each node once, and group the different data types so that all information about a node is shown in one place and the user doesn't have to search through the table to find what they are looking for. As a bonus, we also show the last updated time in a nice way instead of just showing the time of the latest data point.

If you are thinking that showing our data points in a nice and consolidated manner like this will not be simple, I'm afraid you are right. We could get all the data points from the database in one query using DataPoint.objects.all() and then group them in our Python code, but that would become inefficient as the number of data points in our database grows. For server monitoring solutions, it isn't uncommon to have a couple of million data points. We can't fetch and group millions of data points every time the user wants to see the status page. That would make loading the page unbearably slow.

Luckily for us, SQL—the language used to query data from databases—provides us with some very powerful constructs that we can use to get just the information we want, without having to go through all the rows of data that we could have in our data points table. Let's think about what we need.

For starters, we'd like to know the different node names in our database. For each node name, we also need to know the types of data that are available for it. In our example, while both web01 and web02 have the load and disk_usage data type available, the dbmaster node only has data for the disk_usage data type (or metric). For cases like this, the SQL language provides us with a DISTINCT query clause. Adding DISTINCT to our query instructs the database to return unique rows only. That is, all duplicate rows are only returned once. This way, we can get a list of all the different nodes and data types in our database without having to go over each record.
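To make the idea concrete, here is a minimal sketch of a DISTINCT query run directly against an in-memory SQLite database using Python's built-in sqlite3 module. The table name data_point and its columns are assumptions made for this illustration; they mirror the chapter's sample data but are not necessarily the table that Django creates for the model.

```python
import sqlite3

# Hypothetical table that mirrors the chapter's sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_point ("
    "id INTEGER PRIMARY KEY, node_name TEXT, data_type TEXT, data_value REAL)"
)
sample_rows = [
    ("web01", "load", 5.0), ("web01", "load", 1.0), ("web01", "load", 1.5),
    ("web02", "load", 7.0), ("web02", "load", 9.0),
    ("dbmaster", "disk_usage", 0.8), ("dbmaster", "disk_usage", 0.95),
    ("web01", "disk_usage", 0.5), ("web02", "disk_usage", 0.85),
]
conn.executemany(
    "INSERT INTO data_point (node_name, data_type, data_value) VALUES (?, ?, ?)",
    sample_rows,
)

# DISTINCT collapses duplicate (node_name, data_type) rows into one row each.
distinct_pairs = conn.execute(
    "SELECT DISTINCT node_name, data_type FROM data_point "
    "ORDER BY node_name, data_type"
).fetchall()

for pair in distinct_pairs:
    print(pair)
```

Nine rows go in, but only five unique pairs come out, and there is no (dbmaster, load) pair because no such data exists.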

We need to experiment a bit to figure out how to translate the SQL query into something we can use with the Django ORM. We could write our view code and then keep changing it to figure out the right way to get the data we want, but that is very cumbersome. Instead, Django provides us with a very convenient shell for this kind of experimentation.

If you remember, at the start of this chapter, I showed you why you couldn't just start a Python shell and import the model. Django complained about not having been set up properly before being used. Instead, Django has its own way of launching a Python shell, one that makes sure that all the dependencies of setting up Django are met before you start using the shell. To start this shell, type the following:

> python manage.py shell

Like before, this will put you in a Python shell, which you can tell by the changed prompt. Now, let's try importing our DataPoint model:

>>> from data_collector.models import DataPoint

This time you shouldn't get any errors. Now type the following:

>>> DataPoint.objects.all()
[<DataPoint: DataPoint for web01. load = 5.0>, <DataPoint: DataPoint for web01. load = 1.0>, <DataPoint: DataPoint for web01. load = 1.5>, <DataPoint: DataPoint for web02. load = 7.0>, <DataPoint: DataPoint for web02. load = 9.0>, <DataPoint: DataPoint for dbmaster. disk_usage = 0.8>, <DataPoint: DataPoint for dbmaster. disk_usage = 0.95>, <DataPoint: DataPoint for web01. disk_usage = 0.5>, <DataPoint: DataPoint for web02. disk_usage = 0.85>]

As you can see, you can query the model and see the output of the query immediately. The Django shell is one of the most useful components in Django, and you will often find yourself experimenting in the shell to figure out the correct way to do something before you write the final code in your views.

So, back to our problem of getting distinct node names and data types from our database. If you search the Django documentation for the distinct keyword, you should see this link in the results:

https://docs.djangoproject.com/en/stable/ref/models/querysets/#distinct

If you read what the documentation says, you should figure out that this is exactly what we need in order to use the DISTINCT clause. But how do we use it? Let's try it out in the shell:

>>> DataPoint.objects.all().distinct()
[<DataPoint: DataPoint for web01. load = 5.0>, <DataPoint: DataPoint for web01. load = 1.0>, <DataPoint: DataPoint for web01. load = 1.5>, <DataPoint: DataPoint for web02. load = 7.0>, <DataPoint: DataPoint for web02. load = 9.0>, <DataPoint: DataPoint for dbmaster. disk_usage = 0.8>, <DataPoint: DataPoint for dbmaster. disk_usage = 0.95>, <DataPoint: DataPoint for web01. disk_usage = 0.5>, <DataPoint: DataPoint for web02. disk_usage = 0.85>]

Hmm? That didn't change anything. Why not? Let's think about what is happening here. We asked Django to query the database for all the data points and then return only one row for each set of duplicates. If you are familiar with SQL, you will know that the distinct clause works by comparing every field in the rows of data you selected. However, because Django selects all the fields of a database table by default when querying a model, the data that the SQL query sees also includes the primary key, which is by definition unique for each row. This is why we still see all of our data, even though we have used the distinct clause.
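The effect of the primary key can be reproduced outside Django. The following sketch uses Python's built-in sqlite3 module with a hypothetical data_point table (the table and column names are assumptions for illustration). It runs the same DISTINCT query twice, once with the id column selected and once without it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; id plays the role of the primary key Django adds to a model.
conn.execute(
    "CREATE TABLE data_point ("
    "id INTEGER PRIMARY KEY, node_name TEXT, data_type TEXT)"
)
conn.executemany(
    "INSERT INTO data_point (node_name, data_type) VALUES (?, ?)",
    [("web01", "load"), ("web01", "load"), ("web02", "load")],
)

# With the primary key selected, every row is unique, so nothing is removed.
with_pk = conn.execute(
    "SELECT DISTINCT id, node_name, data_type FROM data_point"
).fetchall()

# Without the primary key, the two identical web01/load rows collapse into one.
without_pk = conn.execute(
    "SELECT DISTINCT node_name, data_type FROM data_point"
).fetchall()

print(len(with_pk), len(without_pk))
```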

In order to make use of the distinct clause, we need to limit the fields in the data that we are asking the database to return to us. For our particular use case, we only need to know the unique pairs of node name and data type. The Django ORM provides another method, values, that we can use to limit the fields that Django selects. Let's try it out first without the distinct clause to see what data it returns:

>>> DataPoint.objects.all().values('node_name', 'data_type')
[{'data_type': u'load', 'node_name': u'web01'}, {'data_type': u'load', 'node_name': u'web01'}, {'data_type': u'load', 'node_name': u'web01'}, {'data_type': u'load', 'node_name': u'web02'}, {'data_type': u'load', 'node_name': u'web02'}, {'data_type': u'disk_usage', 'node_name': u'dbmaster'}, {'data_type': u'disk_usage', 'node_name': u'dbmaster'}, {'data_type': u'disk_usage', 'node_name': u'web01'}, {'data_type': u'disk_usage', 'node_name': u'web02'}]

That seems to do the trick. Now our data only includes the two fields that we wanted to run the distinct query on. Let's add the distinct clause as well and see what we get:

>>> DataPoint.objects.all().values('node_name', 'data_type').distinct()
[{'data_type': u'load', 'node_name': u'web01'}, {'data_type': u'load', 'node_name': u'web02'}, {'data_type': u'disk_usage', 'node_name': u'dbmaster'}, {'data_type': u'disk_usage', 'node_name': u'web01'}, {'data_type': u'disk_usage', 'node_name': u'web02'}]

Voila! That seems to have done the trick. Now our Django ORM query only returns unique pairs of node names and data types, which is exactly what we needed.

One important thing to note is that after we added the values method to the ORM query, the returned data was no longer our DataPoint model class. Instead, it was dictionaries with just the field values that we asked for. Thus, any functions that you have defined on the model are not accessible on these dictionaries. If you think about it, this is obvious because without the complete fields, Django has no way to populate the model objects. It also won't matter if you listed all of your model fields in the values method arguments. It will still only return dictionaries, not the model objects.

Now that we have figured out how to get the data in the format we want, without having to loop over each row of data in our database, let's create the template, view, and URL configuration for our status page. Starting with the view code, change data_collector/views.py to have these contents:

from django.views.generic import TemplateView

from data_collector.models import DataPoint


class StatusView(TemplateView):
    template_name = 'status.html'

    def get_context_data(self, **kwargs):
        ctx = super(StatusView, self).get_context_data(**kwargs)

        nodes_and_data_types = DataPoint.objects.all().values('node_name', 'data_type').distinct()

        status_data_dict = dict()
        for node_and_data_type_pair in nodes_and_data_types:
            node_name = node_and_data_type_pair['node_name']
            data_type = node_and_data_type_pair['data_type']

            data_point_map = status_data_dict.setdefault(node_name, dict())
            data_point_map[data_type] = DataPoint.objects.filter(
                node_name=node_name, data_type=data_type
            ).latest('datetime')

        ctx['status_data_dict'] = status_data_dict

        return ctx

It's a little complicated, so let's break it up into parts. First, we get a list of node name and data type pairs, using the query we came up with before. The result of the query, which we store in nodes_and_data_types, is similar to the following:

[{'data_type': u'load', 'node_name': u'web01'}, {'data_type': u'load', 'node_name': u'web02'}, {'data_type': u'disk_usage', 'node_name': u'dbmaster'}, {'data_type': u'disk_usage', 'node_name': u'web01'}, {'data_type': u'disk_usage', 'node_name': u'web02'}]

As we've seen before, this is a list of all unique node name and data type pairs in our database. So, as our dbmaster node doesn't have any data for the load data type, you won't find that pair in this list. I'll explain in a while why running the distinct query helps us reduce the amount of load we need to put on our database.

Next, we loop over each of these pairs; that's the for loop you can see in the code. For each node name and data type pair, we run a query to get us the latest data point. First, we filter down to only the data points that we are interested in—those that match the node name and data type we specified. Then, we call the latest method and get the most recently updated data point.

The latest method takes the name of a field, orders the query using this field, and then returns the last row of data as per that ordering. It should be noted that latest can work with any field type that can be ordered, including numbers, not just date time fields.
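Conceptually, filtering and then calling latest amounts to ordering the matching rows by the given field in descending order and taking the first one. The sketch below shows that idea in plain SQL through Python's built-in sqlite3 module. The table name, the timestamps, and the exact SQL are assumptions for illustration; the SQL that Django generates may differ in its details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and made-up timestamps, for illustration only.
conn.execute(
    "CREATE TABLE data_point ("
    "node_name TEXT, data_type TEXT, data_value REAL, datetime TEXT)"
)
conn.executemany(
    "INSERT INTO data_point VALUES (?, ?, ?, ?)",
    [
        ("web01", "load", 5.0, "2016-01-01 10:00:00"),
        ("web01", "load", 1.0, "2016-01-01 10:01:00"),
        ("web01", "load", 1.5, "2016-01-01 10:02:00"),
        ("web02", "load", 9.0, "2016-01-01 10:03:00"),
    ],
)

# Filter to one node and data type, order newest first, and keep a single row.
latest_row = conn.execute(
    "SELECT node_name, data_type, data_value, datetime FROM data_point "
    "WHERE node_name = ? AND data_type = ? "
    "ORDER BY datetime DESC LIMIT 1",
    ("web01", "load"),
).fetchone()

print(latest_row)
```

Only one row is transferred from the database, no matter how many data points exist for that node and data type.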

I would like to point out the use of setdefault here. Calling setdefault on a dictionary makes sure that if the key provided doesn't exist in the dictionary already, the value passed as the second parameter is set for that key. This is a pretty useful pattern that I and a lot of Python programmers use when we are creating a dictionary in which all the keys need to have values of the same type—a dictionary in this case.

This allows us to ignore the scenario in which the key does not previously exist in the dictionary. Without using setdefault, we would first have to check whether the key exists. If it did, we would modify that. If it didn't, we would create a new dictionary, modify that, and then assign it to status_data_dict.

The setdefault method returns the value of the given key as well, whether it had to set it to the default value or not. We save that in the data_point_map variable in our code.
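As a small standalone illustration of this pattern, the following sketch builds the same nested dictionary twice: once with an explicit key check and once with setdefault. The pairs list and the placeholder string "latest" are made up for the example; in the view, the stored values are DataPoint objects.

```python
pairs = [("web01", "load"), ("web01", "disk_usage"), ("dbmaster", "disk_usage")]

# Without setdefault: check for the key before using the inner dictionary.
explicit = {}
for node_name, data_type in pairs:
    if node_name not in explicit:
        explicit[node_name] = {}
    explicit[node_name][data_type] = "latest"

# With setdefault: the inner dictionary is created on first use and returned
# on every call, so no explicit check is needed.
with_setdefault = {}
for node_name, data_type in pairs:
    data_point_map = with_setdefault.setdefault(node_name, {})
    data_point_map[data_type] = "latest"

print(with_setdefault)
```

Both loops produce the same dictionary; the setdefault version is just shorter.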

Finally, we add the status_data_dict dictionary to the context and return it. We'll see in our template how we go over this data and display it to the user.

I said earlier that I would explain how the distinct query helped us reduce the load on the database. Let's look at an example scenario. Assume that we have the same three nodes in our infrastructure that we saw in our sample data: web01, web02, and dbmaster. Let's say that we have had monitoring running for one whole day, collecting stats for both load and disk usage on all three nodes every minute. Doing the math, we should have the following:

Number of nodes x number of data types x number of hours x 60:

3 x 2 x 24 x 60 = 8640

Thus, our database has 8,640 data point objects. Now, with the code that we have in our view, we only need to retrieve six data point objects from the database (one for each node and data type pair), plus the results of the one distinct query, to show the user an updated status page. If we had to get all the data points, we would have to transfer the data for all those 8,640 data points from the database, and then use only six of them.
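The arithmetic for this scenario can be checked with a few lines of Python. The variable names are chosen for this illustration only.

```python
nodes = 3              # web01, web02, dbmaster
data_types = 2         # load and disk_usage
hours = 24             # one whole day of monitoring
samples_per_hour = 60  # one data point per minute

total_data_points = nodes * data_types * hours * samples_per_hour

# The view runs one distinct query plus one latest() query per pair.
pairs = nodes * data_types
queries_run = 1 + pairs
objects_fetched = pairs

print(total_data_points, queries_run, objects_fetched)
```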

For the template, create a folder called templates in the data_collector directory. Then, create a status.html file in the templates folder and give it the following content:

{% extends "base.html" %}

{% load humanize %}

{% block content %}
<h1>Status</h1>

<table>
    <tbody>
        <tr>
            <th>Node Name</th>
            <th>Metric</th>
            <th>Value</th>
            <th>Last Updated</th>
        </tr>

        {% for node_name, data_type_to_data_point_map in status_data_dict.items %}
            {% for data_type, data_point in data_type_to_data_point_map.items %}
            <tr>
                <td>{% if forloop.first %}{{ node_name }}{% endif %}</td>
                <td>{{ data_type }}</td>
                <td>{{ data_point.data_value }}</td>
                <td>{{ data_point.datetime|naturaltime }}</td>
            </tr>
            {% endfor %}
        {% endfor %}
    </tbody>
</table>
{% endblock %}

There shouldn't be many surprises here. Ignoring the load humanize line, our template simply creates a table using the data dictionary that we generated in our view earlier. The two nested for loops might look a bit complicated, but looking at the data that we are looping over should make things clear:

{u'dbmaster': {u'disk_usage': <DataPoint: DataPoint for dbmaster. disk_usage = 0.95>},
 u'web01': {u'disk_usage': <DataPoint: DataPoint for web01. disk_usage = 0.5>,
            u'load': <DataPoint: DataPoint for web01. load = 1.5>},
 u'web02': {u'disk_usage': <DataPoint: DataPoint for web02. disk_usage = 0.85>,
            u'load': <DataPoint: DataPoint for web02. load = 9.0>}}

The first for loop gets the node name and dictionary mapping data type to the latest data point. The inner for loop then iterates over the data type and the latest data point for the type and generates the table rows. We use the forloop.first flag to print the node name only if the inner loop is running for the first time. Django provides a few other useful flags related to for loops in templates. Take a look at the documentation at https://docs.djangoproject.com/en/stable/ref/templates/builtins/#for.
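If the template syntax obscures the logic, the same nested iteration can be written in plain Python. This is only a sketch of what the two template loops do: the dictionary holds plain numbers instead of DataPoint objects, and a first flag computed with enumerate stands in for forloop.first. It assumes Python 3.7 or later, where dictionaries keep their insertion order.

```python
# Simplified stand-in for status_data_dict, with numbers instead of DataPoint objects.
status_data_dict = {
    "dbmaster": {"disk_usage": 0.95},
    "web01": {"disk_usage": 0.5, "load": 1.5},
}

rows = []
for node_name, data_type_to_value in status_data_dict.items():
    for index, (data_type, value) in enumerate(data_type_to_value.items()):
        first = index == 0  # plays the role of forloop.first in the inner loop
        # Show the node name only on the first row for each node.
        rows.append((node_name if first else "", data_type, value))

for row in rows:
    print(row)
```

The second row for web01 has an empty first cell, which is what groups the metrics visually under one node name in the table.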

When we print the datetime field for a data point, we use the naturaltime filter. This filter is part of the humanize template tags provided as part of Django, which is why we needed to use the load humanize line at the start of the template. The naturaltime template filter outputs a date time value in a format that is easy for humans to understand, for example, two seconds ago, one hour ago, 20 minutes ago, and so on. You need to add django.contrib.humanize to the list of INSTALLED_APPS in djagios/settings.py before you can load the humanize template tags.
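To get a feel for what such a filter does, here is a deliberately simplified stand-in written in plain Python. It is not Django's implementation: the function name rough_naturaltime is hypothetical, it only handles times in the past, and the real filter's wording differs in places.

```python
from datetime import datetime, timedelta


def rough_naturaltime(value, now):
    """Rough approximation of the idea behind naturaltime (not Django's code)."""
    seconds = int((now - value).total_seconds())
    if seconds < 60:
        count, unit = seconds, "second"
    elif seconds < 3600:
        count, unit = seconds // 60, "minute"
    elif seconds < 86400:
        count, unit = seconds // 3600, "hour"
    else:
        count, unit = seconds // 86400, "day"
    suffix = "" if count == 1 else "s"
    return "%d %s%s ago" % (count, unit, suffix)


# A fixed reference time keeps the example reproducible.
now = datetime(2016, 1, 1, 12, 0, 0)
print(rough_naturaltime(now - timedelta(minutes=20), now))
```

In the template, none of this is needed; the naturaltime filter does the work once django.contrib.humanize is installed and loaded.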

The final step in completing our status page is to add it to the URL configuration. As the status page is what the user wants to see most often from a monitoring system, let's make it the home page. Make the URL configuration file at djagios/urls.py contain the following content:

from django.conf.urls import url

from data_collector.views import StatusView


urlpatterns = [
    url(r'^$', StatusView.as_view(), name='status'),
]

That's it. Run the development server:

> python manage.py runserver

Access the status page at http://127.0.0.1:8000. If you have followed the steps so far, you should see a status page similar to the following one. Of course, your page will show data from your database:

[Figure: Status page]