Chapter 9: Best Practices to Improve Your Web Applications

From Chapter 1, Introduction to Sanic and Async Frameworks, through Chapter 8, Running a Sanic Server, we learned how to build a web application from conception through deployment. Pat yourself on the back and give yourself a round of applause. Building and deploying a web application is not a simple feat. So, what have we learned? We, of course, spent time learning about all of the fundamental tools that Sanic provides: route handlers, blueprints, middleware, signals, listeners, decorators, exception handlers, and so on. More importantly, however, we spent some time thinking about how HTTP works and how we can use these tools to design and build applications that are secure, scalable, maintainable, and easily deployable.

There have been a lot of specific patterns in this book for you to use, but also, quite intentionally, I have left a lot of ambiguity. You have continually read statements such as "it depends upon your application's needs." After all, one of the goals of the Sanic project is to remain unopinionated.

That's all well and good, and flexibility is great. But what if you are a developer who has not yet determined which patterns work and which do not? The difference between writing a Hello, world application and a production-ready, real-world application is huge. If you only have limited experience in writing applications, then you have also had only limited experience in making mistakes. It is through those mistakes (whether made by yourself or from lessons learned by others who have made them) that I truly believe we become better developers. Like so many other things in life, failure leads to success.

The purpose of this chapter, therefore, is to include several examples and preferences that I have learned from my 25+ years of building web applications. That means for every best practice you will learn in this chapter, there is probably some mistake that I made to go along with it. These are a set of base-level best practices that I think are critical for any professional-grade application to include from the beginning.

In this chapter, we are going to look at the following:

  • Implementing practical real-world exception handlers
  • Setting up a testable application
  • Gaining insight from logging and tracing
  • Managing database connections

Technical requirements

There are no new technical requirements that you have not already seen. By this point, you should hopefully have a nice environment available for building Sanic, along with all the tools, such as Docker, Git, Kubernetes, and cURL, that we have been using all along. You can follow along with the code examples on the GitHub repository: https://github.com/PacktPublishing/Python-Web-Development-with-Sanic/tree/main/Chapter09.

Implementing practical real-world exception handlers

Exception handling is not a new concept at this point. We explored the topic in the Implementing proper exception handling section in Chapter 6, Operating Outside the Response Handler. I emphasized the importance of creating our own set of exceptions that include default status messages and response codes. This useful pattern was meant to get you up and running very quickly to be able to send useful messages to your users.

For example, imagine we are building an application for travel agents to book airline tickets for customers. You can imagine one of the steps of the operation might be to assist in matching flights through connecting airports.

Well, what if the customer selected two flights where the time between the flights was too short? You might do something like this:

from sanic.exceptions import SanicException

class InsufficientConnection(SanicException):
    status_code = 400
    message = "Selected flights do not leave enough time for connection to be made."

I love this pattern because it makes it super easy for us to repeatedly raise an InsufficientConnection exception and have a known, consistent response for the user. But responding properly to the user is only half of the battle. When something goes wrong in our applications in the real world, we want to know about it. Our applications need to be able to report back so that if there is indeed a problem, we can fix it.

So, how do we go about solving this problem? Logging is, of course, essential (we will look at that in the Gaining insight from logging and tracing section later). Having a reliable way to get to your system logs is an absolute must for a lot of reasons. But do you want to monitor your logs all day long, every day, looking for a traceback? Of course not!

Somehow, in some way, you need to set up alerts to notify you that an exception happened. Creating proper notifications is an important part of maintaining a web application since they tell you when something is not operating as you intended. However, receiving a notification for every issue can become very noisy and overwhelming. If there are too many notifications, or if they are difficult to consume, it is easy to become lost and stop paying attention to them. Luckily, not all exceptions are created equal, and only sometimes will you actually want to be notified. Some errors are fine to simply display to the user while you remain ignorant of their existence. If a customer forgets to input valid data, you do not need your mobile phone waking you up at 3 a.m. while you are on call. While setting up system monitoring and alerting tools is outside the scope of this book, the point that I am trying to make is that your application should be proactive about warning you when certain things happen and silent about the issues that you do not care about. Sometimes bad things will happen, and you want to make sure that you are able to sift through the noise and not miss the issues that really matter. A simple form of this might be to send an email when something particularly bad happens.
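A minimal sketch of that filtering idea might look like the following. The names here are illustrative, not part of Sanic: we keep a tuple of exception classes that warrant an alert and check each incoming exception against it:

```python
# Exception classes considered serious enough to page a human;
# everything else is merely logged and otherwise ignored.
ALERT_ON = (RuntimeError, ConnectionError)

def should_alert(exception: Exception) -> bool:
    # Only trigger a notification for the classes flagged above
    return isinstance(exception, ALERT_ON)
```

The same check could just as easily consult a set of custom exception types, such as the SanicException subclasses we define in this chapter.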

Knowing what you do about Sanic so far, if I came to you and asked you to build a system that sent me an email whenever PinkElephantError is raised, how would you do it?

I hope this is not your answer:

if there_is_a_pink_elephant():
    await send_adam_an_email()
    raise PinkElephantError

"Why?" you might ask. For starters, what if this needs to be implemented in a few locations, and then we need to change the notification from send_adam_an_email() to build_a_fire_and_send_a_smoke_signal()? You now need to go searching through all of your code to make sure it is done consistently and hope you did not miss anything.

What else could you do? How can you simply write the following code in your application and have it know that it needs to send me an email?

if there_is_a_pink_elephant():
    raise PinkElephantError

Let's learn that next.

Catching errors with middleware

Adding the notification mechanism right next to where we raise the exception, as in the preceding example, would work, but it is not the best solution. The goal is to run send_adam_an_email() at the same time that we raise PinkElephantError. One solution would be to catch the exception with response middleware and send out the alert from there. The problem with this is that the response is not likely to have an easily parseable exception. If PinkElephantError results in a 400 response, how would you be able to distinguish it from any other 400 response? You could, of course, have JSON formatting and check the exception type, or try to read the exception message. But that will only work in DEBUG mode because in PRODUCTION mode, you may not have that information available.

One creative solution I have seen is to attach an arbitrary exception code and rewrite it in the middleware as follows:

class PinkElephantError(SanicException):
    status_code = 4000
    message = "There is a pink elephant in the room"

@app.on_response
async def exception_middleware(request: Request, response: HTTPResponse):
    if response.status == 4000:
        response.status = 400
        await send_adam_an_email()

This solution will likely become very tedious to maintain and it is not at all obvious to anyone (including your future self) what is happening. It reminds me of the old-school style of error coding. I am talking about those errors where you need a lookup table to translate a number to an error description, which undoubtedly will be incomprehensible because of a lack of standardization or documentation. Just thinking about seeing E19 on my coffee machine as I race around to find the owner's manual to look up what that means is enough to raise my stress levels. What I am trying to say is: Save yourself the hassle and try to find a nicer solution for identifying exceptions than attaching some otherwise hard-to-understand error codes that you later need to translate. We need a better solution.

Catching errors with signals

Remember our old friend signals from way back in the Leveraging signals for intra-worker communication section in Chapter 6, Operating Outside the Response Handler? If you recall, Sanic can dispatch event signals when certain things occur. One of them is when an exception is raised. Better yet, the signal context includes the exception instance, making it much easier to identify which exception occurred.

A cleaner and more maintainable solution to the aforementioned code would look like this:

@app.signal("http.lifecycle.exception")
async def exception_signal(request: Request, exception: Exception):
    if isinstance(exception, PinkElephantError):
        await send_adam_an_email()

I think you can already see that this is a much cleaner and more fitting solution. For a lot of use cases, this might very well be the best solution for you. Therefore, I suggest you commit this simple four-line pattern to memory. Now, when we need to change send_adam_an_email() to build_a_fire_and_send_a_smoke_signal(), it will be a super-simple change to our code.

Long-time builders of Sanic applications may be looking at this example and wondering whether we can just use app.exception. This is certainly an acceptable pattern, but not without its potential pitfalls. Let's look at that next.

Catching the error and responding manually

When an exception is raised, Sanic stops the regular route handling process and moves it over to an ErrorHandler instance. This is a single object that exists throughout the lifespan of your application instance. Its purpose is to act as a sort of mini-router to take incoming exceptions and make sure they are passed off to the proper exception handler. If there is none, then it uses the default exception handler. As we have seen already, the default handler is what we can modify by using the error_format argument.

Here is a quick example of what an exception handler looks like in Sanic:

@app.exception(PinkElephantError)
async def handle_pink_elephants(request: Request, exception: Exception):
    ...

The problem with this pattern is that because you took over the actual handling of the exception, it is now your job to respond appropriately. If you build an application with 10, 20, or even more of these exception handlers, keeping their responses consistent becomes a chore.

It is for this reason that I genuinely try to avoid custom exception handling unless I need it. In my experience, I get much better results by controlling formatting, as discussed in the Fallback handling section in Chapter 6, Operating Outside the Response Handler. I try to avoid one-off response customizations that only target a single use case. While building an application, we likely need to build error handlers for many types of exceptions, not just PinkElephantError. Therefore, I tend to disfavor using exception handlers when I need to do something with the error, such as sending an email, rather than just control how it is presented to the user.

Okay, okay, I give in. I will let you in on a secret: you can still use the app.exception pattern to intercept the error and still use the built-in error formatting. You can even still trigger some action with it—such as sending an email, lighting a smoke signal, or triggering your coffee machine—so you can get to work fixing the bug. If you like the exception handler pattern better than the signal, then it is possible to use it without my concern of formatting too many custom error responses.

Let's see how we can take an action with the error handler and still retain a consistent error formatting experience:

  1. First, let's make a simple endpoint to throw our error and report back in text format:

    class PinkElephantError(SanicException):
        status_code = 400
        message = "There is a pink elephant in the room"
        quiet = True

    @app.get("/", error_format="text")
    async def handler(request: Request):
        raise PinkElephantError

I have added quiet = True to the exception because that will suppress the traceback from being logged. This is a helpful technique when the traceback is not important to you and it just gets in the way.

  2. Next, create an exception handler to send the email, but still use the default error response:

    async def send_adam_an_email():
        print("EMAIL ADAM")

    @app.exception(PinkElephantError)
    async def handle_pink_elephants(request: Request, exception: Exception):
        await send_adam_an_email()
        return request.app.error_handler.default(request, exception)

We can access the default ErrorHandler instance using our application instance, as shown in the preceding code.

I would like you to hit that endpoint using curl so you can see that this works as expected. You should get the default text response and see in the logs that a mock email was sent to me.

As you can also see, we are using the error_handler object that exists application-wide. In our next section, we will look at modifying that object.

Modifying ErrorHandler

When Sanic starts up, one of the first things that it does is create an ErrorHandler instance. We saw in the previous example that we can access it from the application instance. Its purpose is to make sure that when you define an exception handler, the request is responded to from the proper location.

One of the other benefits of this object is that it is easily customizable and is triggered on every single exception. Therefore, in the days before Sanic introduced signals, it was the easiest way to get some arbitrary code to run on every exception, such as our error-reporting utility.

Modifying the default ErrorHandler instance might have looked something like this:

  1. Create ErrorHandler and inject the reporting code:

    from sanic.handlers import ErrorHandler

    class CustomErrorHandler(ErrorHandler):
        def default(self, request: Request, exception: Exception):
            ...

  2. Instantiate your application using your new handler:

    from sanic import Sanic

    app = Sanic(..., error_handler=CustomErrorHandler())

That's it. Personally, I would almost always go for the signals solution when dealing with alerting or other error reporting. Signals have the benefit of being a much more succinct and targeted solution. It does not require me to subclass or monkey patch any objects. However, it is helpful to know how to create a custom ErrorHandler instance, as we have just seen, as you will see it out there in the wild.

For example, you will see them in third-party error-reporting services. These services are platforms that you can subscribe to that will aggregate and track exceptions in your application. They can be incredibly helpful in identifying and debugging problems in production applications. Usually, they work by hooking into your normal exception handling process. Since overriding ErrorHandler used to be the best method for low-level access to all exceptions in Sanic, many of these providers will provide sample code or libraries that implement this strategy.

Whether you use a custom ErrorHandler or signals is still a matter of personal taste. The biggest benefit, however, of signals is that they are run in a separate asyncio task. This means that Sanic will efficiently manage the concurrent response to the user with the reporting (provided you do not introduce other blocking code).
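The intuition behind that benefit can be sketched with plain asyncio (the coroutine names here are illustrative, not Sanic internals): the reporting coroutine is scheduled as its own task, so building the response does not wait on it:

```python
import asyncio

async def send_report() -> str:
    await asyncio.sleep(0.1)  # simulate a slow email/alerting call
    return "reported"

async def respond_to_user() -> str:
    return "response"

async def main() -> tuple:
    # Schedule the report as a background task; the response proceeds
    # concurrently instead of blocking behind the slow call
    task = asyncio.create_task(send_report())
    response = await respond_to_user()
    report_result = await task
    return response, report_result

result = asyncio.run(main())
```

The event loop interleaves the two coroutines, which is essentially what Sanic does for you when it dispatches a signal handler in its own task.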

Does this mean that subclassing ErrorHandler is not a worthwhile effort? Of course not. In fact, if you are unhappy with the default error formats that Sanic uses, I would recommend that you change it using the previous example with CustomErrorHandler.

With this in mind, you now have the ability to format all of your errors as needed. An alternative strategy to this would be to manage this with exception handlers like in the app.exception pattern. The problem with that method is that you potentially lose out on Sanic's built-in auto-formatting logic. As a reminder, one of the great benefits of the default ErrorHandler is that it will attempt to respond with an appropriate format, such as HTML, JSON, or plain text, depending upon the circumstances.

Exception handling is an incredibly important component of any professional-grade web application. Make sure to put some thought into your application needs when designing a strategy. You very well may find that you need a mixture of signals, exception handlers, and a custom ErrorHandler.

We'll now turn our attention to another important aspect of professional-grade application development that may also not be exciting for some people to build: testing.

Setting up a testable application

Imagine this scenario: inspiration strikes you and you have a great application idea. Your excitement and creative juices are flowing as you start formulating ideas in your head about what to build. Of course, you do not rush straight into building it because you have read all the earlier chapters in this book. You take some time to plan it out, and in a caffeine-induced marathon, you start hacking away. Slowly, you start to see the application take shape and it is working beautifully. Hours go by, maybe it's days or weeks—you are not sure because you are in the zone. Finally, after all that work, you have a minimum viable product (MVP). You deploy it and go for some much-deserved sleep.

The problem is that you never set up testing. Undoubtedly, when you now come online and check out the error-handling system that you set up with advice from the previous section, you notice that it is swamped with errors. Uh oh! Users are doing things in your application that you did not anticipate. Data is not behaving as you thought it might. Your application is broken.

I would venture to guess that most people who have developed a web application or done any software development can relate to this story. We have all been there before. For many newcomers and experienced developers alike, testing is not fun. Maybe you are one of those rare breeds of engineers who completely love setting up a testing environment. If so, with all honesty, I tip my hat to you. For the rest of us, suffice it to say that if you want to build a professional application, you need to find the patience to develop a test suite.

Testing is a huge field, and I will not cover it here. There are plenty of testing strategies out there, including the often-celebrated test-driven design (TDD). If you know what this is and it works for you, great! If not, I will not judge you. If you are unfamiliar with it, I do suggest that you take some time and do some internet research on the topic. TDD is a fundamental part of many professional development workflows and many companies have adopted it.

Similarly, there are a lot of testing terms, such as unit testing and integration testing. We will use my simplified definitions of these terms: unit testing is when you test a single component or endpoint on its own, and integration testing is when you test the component or endpoint interacting with another system (such as a database).

What we care about in this book is how you can test your Sanic application in both unit and integration tests. Therefore, while I hope the general idea and approaches here are useful, to truly have a well-tested application, you will need to go beyond the pages of this book.

The last ground rule that we need to get out of the way is that the tests here will all assume that you are using pytest. It is one of the most widely used testing frameworks with many plugins and resources.

Getting started with sanic-testing

The Sanic Community Organization (the community of developers that maintain the project) also maintains a testing library for Sanic. Although it is primarily used by the Sanic project itself to achieve a high level of test coverage, it has nonetheless found a home among developers working with Sanic. We will use it extensively because it provides a convenient interface for interacting with Sanic.

To start, we will need to install it in your virtual environment. While we are at it, we will install pytest too:

$ pip install sanic-testing pytest

So, what does sanic-testing do? It provides an HTTP client that you can use to reach your endpoints.

A typical barebones implementation would look like this:

  1. First, you will have your application defined in some module or factory. For now, it will be a global-scoped variable, but later in the chapter, in the Testing a full application section, we will start working with factory pattern applications where the application instance is defined inside of a function:

    # server.py
    from sanic import Request, Sanic, text

    app = Sanic(__name__)

    @app.get("/")
    async def handler(request: Request):
        return text("...")

  2. Then, in your testing environment, you initialize a test client. Since we are using pytest, let's set that up in a conftest.py file as a fixture so we can easily access it:

    # conftest.py
    import pytest
    from sanic_testing.testing import SanicTestClient
    from server import app

    @pytest.fixture
    def test_client():
        return SanicTestClient(app)

  3. You will now have access to the HTTP client in your unit tests:

    # test_sample.py
    def test_sample(test_client):
        request, response = test_client.get("/")
        assert response.status == 200

  4. Running your tests now is a matter of executing the pytest command. It should look something like this:

    $ pytest
    ================= test session starts =================
    platform linux -- Python 3.9.7, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
    rootdir: /path/to/testing0
    plugins: anyio-3.3.4
    collected 1 item

    test_sample.py . [100%]

    ================= 1 passed in 0.09s ===================

So, what just happened here? What happened is that the test client took your application instance and actually ran it locally on your operating system. It initiated the Sanic server, binding it to a host and port address on your operating system, and ran whatever event listeners were attached to your application. Then, once the server was running, it used httpx as an interface to send an actual HTTP request to the server. It then bundled up both the Request and HTTPResponse objects and provided them as the return value.

The code for this example can be found in the GitHub repository: https://github.com/PacktPublishing/Python-Web-Development-with-Sanic/tree/main/Chapter09/testing0.

This is something that I cannot stress enough. Just about every time someone has come to me with a question or problem about sanic-testing, it is because they failed to understand that the test client actually runs your application. This happens on every single call.

For example, consider the following:

request, response = test_client.get("/foo")
request, response = test_client.post("/bar")

When you run this, it will first start up the application and send a GET request to /foo. The server then goes through the full shutdown. Next, it stands up the application again and sends a POST request to /bar.

For most test cases, this starting and stopping of the server is preferred. It will make sure that your application runs in a clean environment every time. It happens very quickly, and you can still whip through a bunch of unit tests without feeling this as a performance penalty.

There are some other options that we will explore later in the following sections.

A more practical test client implementation

Now that you have seen how the test client works, I am going to let you in on a little secret: you do not actually need to instantiate the test client. In fact, other than the previous example, I have never used sanic-testing like this in a real application.

The Sanic application instance has a built-in property that can set up the test client for you if sanic-testing has been installed. Since we already installed the package, we can just go ahead and start using it. All that you need is access to your application instance.

Setting up an application fixture

Before going further, we will revisit pytest fixtures. If you are unfamiliar with them, they might seem somewhat magical. In brief, they are a pattern in pytest for declaring a function that returns a value. That value can then be injected as an object into your individual tests.

So, for example, in our last use case, we defined a fixture in a special file called conftest.py. Any fixtures that are defined there will be available anywhere in your testing environment. That is why we were able to inject test_client as an argument in our test case.

I find it almost always beneficial to do this with the application instance. Whether you are using a globally defined instance or a factory pattern, you will make testing much easier with fixtures.

Therefore, I will always do something like this in my conftest.py:

import pytest
from server import app as application_instance

@pytest.fixture
def app():
    return application_instance

I now have access to my application instance everywhere in the test environment without importing it:

def test_sample(app):
    ...

Tip

There is one more quick trick you should know about fixtures. You can use the yield syntax here to help you inject code before and after your test. This is particularly helpful with an application if you need to do any sort of cleanup after the test runs. To achieve this, do the following:

@pytest.fixture
def app():
    print("Running before the test")
    yield application_instance
    print("Running after the test")

With access to our app instance using fixtures, we can now rewrite the previous unit test like this:

def test_sample(app: Sanic):
    request, response = app.test_client.get("/")
    assert response.status == 200

To make our lives a little simpler, I added the type annotation for the fixture so that my integrated development environment (IDE) knows that it is a Sanic instance. Even though the main purpose of type hinting is to catch mistakes early, I also like to use it in cases like this to just make my IDE experience nicer.

This example shows that access to the test client is simply a matter of using the app.test_client property. By doing that, Sanic will automatically instantiate the client for you as long as the package is installed. This makes it super simple to write unit tests like this.

Testing blueprints

Sometimes, you may run across a scenario where you want to test some functionality that exists on a blueprint alone. In this case, we are assuming that any application-wide middleware or listeners that run before the blueprint are not relevant to our test. This means that we are testing some functionality that is entirely contained within the boundaries of the blueprint.

I love situations like this and actively seek them out. The reason is that they are super easy to test, as we will see in a minute. These types of testing patterns are probably best understood as they contrast to what we will do in the Testing a full application section. The main difference is that in these tests, our endpoints do not rely upon the existence of a third-party system, such as a database. Perhaps more accurately, I should say that they do not rely upon the impacts that a third-party system might have. The functionality and business logic are self-contained, and therefore very conducive to unit testing.

When I find a situation like this, the first thing that I do is add a new fixture to my conftest.py file. It will act as a dummy application that I can use for testing. Each unit test I create can use this dummy application with my target blueprint attached and nothing else. This allows my unit test to be more narrowly focused on my single example. Let's see how that looks next:

  1. Here, we will create a new fixture that creates a new application instance:

    # conftest.py
    import pytest
    from sanic import Sanic

    @pytest.fixture
    def dummy_app():
        return Sanic("DummyApp")

  2. We can now stub out a test in our blueprint tests:

    # test_some_blueprint.py
    import pytest
    from path.to.some_blueprint import bp

    @pytest.fixture
    def app_with_bp(dummy_app):
        dummy_app.blueprint(bp)
        return dummy_app

    def test_some_blueprint_foobar(app_with_bp):
        ...

In this example, we see that I created a fixture that is localized to this one module. The point of this is to create a reusable application instance that has my target blueprint attached to it.

A simple use case for this kind of testing might be input validation. Let's add a blueprint that does some input validation. The blueprint will have a simple POST handler that looks at the incoming JSON body and just checks that the key exists and the type matches the expectation:

  1. First, we will create a schema that will be the keys and the value type that we expect our endpoint to be able to test:

    from typing import NamedTuple

    class ExpectedTypes(NamedTuple):
        a_string: str
        an_int: int

  2. Second, we will make a simple type checker that responds with one of three values depending upon whether the value exists and whether it is of the expected type:

    from typing import Any, Type

    def _check(
        exists: bool,
        value: Any,
        expected: Type[object],
    ) -> str:
        if not exists:
            return "MISSING"
        return "OK" if type(value) is expected else "WRONG"

  3. Finally, we will create our endpoint that will take the request JSON and respond with a dictionary about whether the passed data was valid:

    from sanic import Blueprint, Request, json

    bp = Blueprint("Something", url_prefix="/some")

    @bp.post("/validation")
    async def check_types(request: Request):
        valid = {
            field_name: _check(
                field_name in request.json,
                request.json.get(field_name),
                field_type,
            )
            for field_name, field_type in ExpectedTypes.__annotations__.items()
        }
        expected_length = len(ExpectedTypes.__annotations__)
        status = (
            200
            if all(value == "OK" for value in valid.values())
            and len(request.json) == expected_length
            else 400
        )
        return json(valid, status=status)

As you can see, we have now created a very simplistic data checker. We loop over the definitions in the schema and check each to see whether it is as expected. All of the values should be "OK" and the request data should be the same length as the schema.
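Stripped of the Sanic plumbing, the core of this checker can be exercised as a plain function. The following is a sketch that mirrors the logic above (with the status strings normalized to uppercase; validate() is a name introduced here for illustration):

```python
from typing import Any, NamedTuple, Type

class ExpectedTypes(NamedTuple):
    a_string: str
    an_int: int

def _check(exists: bool, value: Any, expected: Type[object]) -> str:
    if not exists:
        return "MISSING"
    return "OK" if type(value) is expected else "WRONG"

def validate(payload: dict) -> tuple:
    # Compare every schema field against the incoming payload
    valid = {
        name: _check(name in payload, payload.get(name), annotation)
        for name, annotation in ExpectedTypes.__annotations__.items()
    }
    # Valid only if every field checks out and nothing extra was sent
    ok = (
        all(value == "OK" for value in valid.values())
        and len(payload) == len(valid)
    )
    return valid, 200 if ok else 400
```

Keeping the business logic callable like this is exactly what makes the endpoint so conducive to unit testing.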

We can now test this out in our test suite. The first thing that we could test is to make sure that all the required fields are present. There are three potential scenarios here: the input has missing fields, the input has only the correct fields, and the input has extra fields. Let's take a look at these scenarios and create some tests for them:

  1. First, we will create a test to check that there are no missing fields:

    def test_some_blueprint_no_missing(app_with_bp):
        _, response = app_with_bp.test_client.post(
            "/some/validation",
            json={
                "a_string": "hello",
                "an_int": "999",
            },
        )
        assert not any(
            value == "MISSING"
            for value in response.json.values()
        )
        assert len(response.json) == 2

In this test, we sent some bad data. Notice how the an_int value is actually a string. But we do not care about that right now. What this is meant to test is that all the proper fields were sent.

  2. Next up is a test that should contain all of the inputs, of the correct types, but nothing more:

    def test_some_blueprint_correct_data(app_with_bp):

        _, response = app_with_bp.test_client.post(

            "/some/validation",

            json={

                "a_string": "hello",

                "an_int": 999,

            },

        )

        assert response.status == 200

Here, all we need to assert is that the response is a 200 since we know that it will be a 400 if it is bad data.

  3. Lastly, we will create a test that checks that extraneous information is not sent:

    def test_some_blueprint_bad_data(app_with_bp):

        _, response = app_with_bp.test_client.post(

            "/some/validation",

            json={

                "a_string": "hello",

                "an_int": 999,

                "a_bool": True,

            },

        )

        assert response.status == 400

In this final test, we are sending known bad data since it contains the exact same payload as the previous test, except for the additional "a_bool": True. Therefore, we should assert that the response will be 400.

Looking at these tests, it seems very repetitive. While the general rule of don't repeat yourself (DRY) is often cited as a reason to abstract logic, be careful with this in testing. I would prefer to see repetitive testing code over some highly abstracted, beautiful, shiny factory pattern. In my experience—yes, I have been burned by this many times in the past—adding fancy abstraction layers to testing code is a recipe for disaster. Some abstraction might be helpful (creating the dummy_app fixture is an example of good abstraction), but too much will become difficult to maintain and update as your application needs to change. Testing code should be simple to read and easy to edit. This is certainly one of those areas where development straddles the line between science and art. Creating a powerful testing suite with a proper balance of repetition and abstraction will take some practice and is highly subjective.

With that warning out of the way, there is an abstraction layer that I do really like. It makes use of pytest.mark.parametrize. This is a super-helpful feature that allows you to create a test and run it against multiple inputs. We are not abstracting our tests, per se, but instead are testing the same code with a variety of inputs.

Using pytest.mark.parametrize, we can actually condense those three tests into a single test:

  1. We create a decorator that has two arguments: a string containing a comma-delimited list of argument names and an iterable that contains values to be injected into the test:

    @pytest.mark.parametrize(

    "input,has_missing,expected_status",

    (

        (

            {

                "a_string": "hello",

            }, True, 400,

        ),

        (

            {

                "a_string": "hello",

                "an_int": "999",

            }, False, 400,

        ),

        (

            {

                "a_string": "hello",

                "an_int": 999,

            }, False, 200,

        ),

        (

            {

                "a_string": "hello",

                "an_int": 999,

                "a_bool": True,

            }, False, 400,

        ),

    ),

    )

We have three values that we are going to inject into our test: input, has_missing, and expected_status. The test is going to run multiple times, and each time it will pull one of the tuples of arguments to inject into the test function.

  2. Our test function can now be abstracted to use these arguments:

    def test_some_blueprint_data_validation(

        app_with_bp,

        input,

        has_missing,

        expected_status,

    ):

        _, response = app_with_bp.test_client.post(

            "/some/validation",

            json=input,

        )

        assert any(

            value == "MISSING"

            for value in response.json.values()

        ) is has_missing

        assert response.status == expected_status

In this way, it is much easier for us to write multiple unit tests across different use cases. You may have noticed that I actually just created a fourth test. Since it was so simple to add more tests using this method, I included one use case that we had not previously tested. I hope you see the huge benefit that this creates and come to learn to love testing with @pytest.mark.parametrize.

In this example, we are defining the inputs and what our expected outcome should be. By parametrizing the single test, it actually turns this into multiple tests inside pytest.

The code for these examples can be found in the GitHub repository: https://github.com/PacktPublishing/Python-Web-Development-with-Sanic/tree/main/Chapter09/testing2.

Mocking out services

The sample blueprint that we were testing against is obviously not something we would ever use in real life. In that example, we were not actually doing anything with the data. The oversimplified example removed the need to worry about how to handle interactions with services such as a database access layer. What if we are testing a real endpoint? And, by a real endpoint, I mean one that is meant to interface with a database. For example, how about a registration endpoint? How can we test that the registration endpoint actually does what it is supposed to do and injects data as expected? Mocking is the answer. We will look at how we can use Python's mocking utilities to pretend we have a real database layer. We will also still use the dummy_app pattern for testing. Let's see what that will look like now:

  1. First, we will need to refactor our blueprint so that it looks like something you might actually encounter in the wild:

    @bp.post("/")

    async def check_types(request: Request):

        _validate(request.json, RegistrationSchema)

        connection: FakeDBConnection = request.app.ctx.db

        service = RegistrationService(connection)

        await service.register_user(request.json["username"], request.json["email"])

        return json(True, status=201)

We are still doing the input validation. However, instead of simply storing the registration details in memory, we will send them off to a database for writing to disk. You can check out the full code at https://github.com/PacktPublishing/Python-Web-Development-with-Sanic/tree/main/Chapter09/testing3 to see the input validation. The important thing to note here is that we have a RegistrationService, which calls a register_user method.

  2. Since we still have not looked at the usage of object relationship mapping (ORM), our database storage function will ultimately just call some raw SQL queries. We will look at ORM in more detail in the Managing database connections section later in the chapter, but for now, let's create the registration service:

    from .some_db_connection import FakeDBConnection

    class RegistrationService:

        def __init__(self, connection: FakeDBConnection) -> None:

            self.connection = connection

        async def register_user(

            self, username: str, email: str

        ) -> None:

            query = "INSERT INTO users VALUES ($1, $2);"

            await self.connection.execute(query, username, email)

  3. The registration service calls into our database to execute some SQL. We will also need a connection to our database. For the sake of the example, I am using a fake class, but this would (and should) be the actual object that your application uses to connect to the database. Therefore, imagine that this is a proper database client:

    from typing import Any

    class FakeDBConnection:

        async def execute(self, query: str, *params: Any):

            ...

  4. With this in place, we can now create a new fixture that will take the place of our data access layer. Normally, you would create something like this to instantiate the client:

    from sanic import Sanic

    from .some_db_connection import FakeDBConnection

    app = Sanic.get_app()

    @app.before_server_start

    async def setup_db_connection(app, _):

        app.ctx.db = FakeDBConnection()

Imagine that this code exists in our actual application. It initiates the database connection and allows us to access the client within our endpoints via the application's ctx object, as shown in the preceding code. Since our unit tests will not have access to a database, we need to create a mock database instead and attach it to our dummy application.

  5. To do that, we will create our dummy_app and then import the actual listener used by the real application to instantiate the fake client:

    from importlib import import_module

    @pytest.fixture

    def dummy_app():

        app = Sanic("DummyApp")

        import_module("testing3.path.to.some_startup")

        return app

  6. To force our client to use a mocked method instead of actually sending a network request to a database, we are going to monkeypatch the database client using a feature of pytest. Set up a fixture like this:

    from unittest.mock import AsyncMock

    @pytest.fixture

    def mocked_execute(monkeypatch):

        execute = AsyncMock()

        monkeypatch.setattr(

            testing3.path.to.some_db_connection.FakeDBConnection, "execute", execute

        )

        return execute

We now have a mock object in place of the real execute method, and we can proceed to build out a test on our registration blueprint. One of the great benefits of using the unittest.mock library is that it allows us to assert that the database client would have been called. We will see what that looks like next.
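If AsyncMock is new to you, here is a minimal, self-contained sketch of the pattern with no Sanic or pytest involved: swap the real coroutine method for a mock, await the code under test, then assert on exactly how the mock was awaited. The class and value names mirror the earlier example and are illustrative:

```python
import asyncio
from unittest.mock import AsyncMock

# Minimal stand-ins mirroring the earlier example
class FakeDBConnection:
    async def execute(self, query: str, *params) -> None:
        ...

class RegistrationService:
    def __init__(self, connection: FakeDBConnection) -> None:
        self.connection = connection

    async def register_user(self, username: str, email: str) -> None:
        query = "INSERT INTO users VALUES ($1, $2);"
        await self.connection.execute(query, username, email)

# Replace the real coroutine method with an AsyncMock, which is
# what the monkeypatch fixture does for us inside the test suite
FakeDBConnection.execute = AsyncMock()

async def main() -> None:
    service = RegistrationService(FakeDBConnection())
    await service.register_user("Alice", "alice@example.com")

asyncio.run(main())

# The mock records every await, so we can assert on the exact call
FakeDBConnection.execute.assert_awaited_with(
    "INSERT INTO users VALUES ($1, $2);", "Alice", "alice@example.com"
)
```

If the assertion does not hold, assert_awaited_with raises an AssertionError describing the actual call, which is exactly the failure message you want in a test.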

  7. Here, we create a test with some assertions that help us to know that the correct data will make its way to the data access layer:

    @pytest.mark.parametrize(

        "input,expected_status",

        (

            (

                {

                    "username": "Alice",

                    "email": "alice@example.com",

                },

                201,

            ),

        ),

    )

    def test_some_blueprint_data_validation(

        app_with_bp,

        mocked_execute,

        input,

        expected_status,

    ):

        _, response = app_with_bp.test_client.post(

            "/registration",

            json=input,

        )

        assert response.status == expected_status

        if expected_status == 201:

            mocked_execute.assert_awaited_with(

                "INSERT INTO users VALUES ($1, $2);", input["username"], input["email"]

            )

Just like before, we are using parametrize so that we can run multiple tests with different inputs. The key takeaway is the usage of the mocked execute method. We can ask pytest to provide that mocked object to us so that our test can make assertions upon it and we know that it was executed as expected.

This is certainly helpful for testing isolated issues, but what if there needs to be application-wide testing? We will look at that next.

Testing a full application

As an application progresses from its infancy, there is likely to be a network of middleware, listeners, and signals that process requests beyond just the route handler. In addition, there are likely to be connections to other services (such as databases) that complicate the entire process. A typical web application cannot be run in a vacuum. When it starts up, it needs to connect to other services. These connections are critical to the proper performance of the application, and therefore if they do not exist, then the application cannot start. Testing these can be very troublesome. Do not just throw your hands up and give up. Resist the temptation. In the previous tests, there was a glimpse of how this can be achieved quite simply. We did in fact successfully test against our database. But what if that is not enough?

Sometimes testing against dummy_app is not sufficient.

This is why I really like applications that are created by a factory pattern. The GitHub repository for this chapter provides an example of a factory pattern that I use a lot. It has some very helpful features in it. Essentially, the end result is a function that returns a Sanic instance with everything attached to it. Through the implementation of the Sanic standard library, the function crawls through your source code looking for things to attach to it (routes, blueprints, middleware, signals, listeners, and much more) and is set up to avoid circular import issues. We talked about factory patterns and their benefits back in Chapter 2, Organizing a Project.

What is particularly important right now is that the factory in the GitHub repository can selectively choose what to instantiate. This means we can use our actual application with targeted functionality. Let me provide an example.

Once, I was building an application. It was critical to know exactly how it was performing in the real world. Therefore, I created middleware that would calculate some performance metrics and then send them off to a vendor for analysis. Performance was critical—which was part of my decision to use Sanic to begin with. When I tried to do some testing, I realized that I could not run the application in my test suite if it did not connect to the vendor. Yes, I could have mocked it out. However, a better strategy was to just skip the operation altogether. Sometimes, there really is no need to test every bit of functionality.

To make this concrete, here is a real quick explanation of what I am talking about. Here is a middleware code snippet that calculates runtime at the beginning and end of the request and sends it off:

from time import time

from sanic import Request, Sanic

app = Sanic.get_app()

@app.on_request

async def start_timer(request: Request) -> None:

    request.ctx.start_time = time()

@app.on_response

async def stop_timer(request: Request, _) -> None:

    end_time = time()

    total = end_time - request.ctx.start_time

    await send_the_value_somewhere(total)
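One cautious aside on the snippet above: time.time() follows the wall clock, which can jump if the system clock is adjusted mid-request, so time.perf_counter() is generally the safer choice for measuring durations. A drop-in variant of the timing math might look like this:

```python
from time import perf_counter, sleep

start = perf_counter()
sleep(0.01)  # stand-in for the work done while handling the request
total = perf_counter() - start

# perf_counter is monotonic, so a duration can never come out negative,
# a guarantee that time()-based arithmetic cannot make
assert total > 0
```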

One solution to my problem of contrasting testing versus production behavior could be to change the application code to only run in production:

if app.config.ENVIRONMENT == "PRODUCTION":

    ...

But in my opinion, a better solution is to skip this middleware altogether. Using the factory pattern shown in the repo, I could do this:

from importlib import import_module

from typing import Optional, Sequence

from sanic import Sanic

DEFAULT = ("path.to.some_middleware",)  # a dotted module path, not a file name

def create_app(modules: Optional[Sequence[str]] = None) -> Sanic:

    app = Sanic("MyApp")

    if modules is None:

        modules = DEFAULT

    for module in modules:

        import_module(module)

    return app

In this factory, we are creating a new application instance and looping through a list of known modules to import them. In normal usage, we would create an application by calling create_app(), and the factory would import the DEFAULT known modules. By importing them, they will attach to our application instance. More importantly, however, this factory allows us to send an arbitrary list of modules to load. This allows us the flexibility to create a fixture in our tests that uses the actual factory pattern for our application but has the control to pick and choose what to load.

In our use case, we decided that we do not want to test the performance middleware. We can skip it by creating a test fixture that simply ignores that module:

from path.to.factory import create_app

@pytest.fixture

def dummy_app():

    return create_app(modules=[])

As you can see, this opens up the ability for me to create tests that are specifically targeting parts of my actual application, and not just a dummy application. Using a factory through the use of inclusion and exclusion, I can create unit tests with only the functionality that I need and avoid the unneeded functionality.
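To make the selective-loading idea concrete without touching the import system, here is a sketch of the same factory shape in which plain callables stand in for importable modules. All names here are illustrative, not taken from the chapter's repository:

```python
from typing import Callable, Optional, Sequence

App = dict  # stand-in for a Sanic instance

def timing_middleware(app: App) -> None:
    app["middleware"].append("timing")

def auth_middleware(app: App) -> None:
    app["middleware"].append("auth")

# In the real factory, this is a tuple of dotted module paths
DEFAULT: Sequence[Callable[[App], None]] = (timing_middleware, auth_middleware)

def create_app(modules: Optional[Sequence[Callable[[App], None]]] = None) -> App:
    app: App = {"middleware": []}
    if modules is None:
        modules = DEFAULT
    for module in modules:
        module(app)  # the real factory calls import_module(module) here
    return app

assert create_app()["middleware"] == ["timing", "auth"]         # production
assert create_app(modules=[])["middleware"] == []               # skip everything
assert create_app([auth_middleware])["middleware"] == ["auth"]  # targeted subset
```

The three assertions show the three modes: the production default, the empty test fixture, and a targeted subset for a focused unit test.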

I hope your mind is now racing with the possibilities that this opens up for you. Testing becomes so much easier when the application itself is composable: an easily composable application becomes an easily testable application, and a well-tested application puts you truly on your way to becoming a next-level developer.

If you have not already begun, I highly suggest that you use a factory like mine. Go ahead and copy it. Just promise me that you will use it to create some unit tests.

Using ReusableClient for testing

Up until this point, we have been using a test client that starts and stops a server on every call to it. The sanic-testing package also ships with a test client that can be manually started and stopped. Therefore, it is possible to reuse it between calls, or even between tests. In the next subsection, we will learn about this reusable test client.

Running a single test server per test

You may sometimes need to have multiple calls to your API running on the same instance. For example, this could be useful if you were storing some temporary state in between calls in memory. This is obviously not a good solution in most use cases because storing the state in memory makes horizontal scaling difficult. Leaving that issue aside, let's take a quick look at how you might implement this:

  1. We will first create an endpoint that just spits out a counter:

    from sanic import Sanic, Request, json

    from itertools import count

    app = Sanic("test")

    @app.before_server_start

    def setup(app, _):

        app.ctx.counter = count()

    @app.get("")

    async def handler(request: Request):

        return json(next(request.app.ctx.counter))

In this simplified example, every time that you hit the endpoint, it will increment a number.
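itertools.count is doing the state-keeping here: it is an infinite iterator, so each next() call yields the next integer, which makes it a minimal stand-in for real in-memory state:

```python
from itertools import count

counter = count()  # starts at 0 and never ends

assert next(counter) == 0
assert next(counter) == 1
assert next(counter) == 2  # the state lives in the iterator between calls
```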

  2. We can test this endpoint that maintains an internal state by using a ReusableClient instance, as follows:

    from sanic_testing.reusable import ReusableClient

    def test_reusable_context(app):

        client = ReusableClient(app, host="localhost", port=9999)

        with client:

            _, response = client.get("/")

            assert response.json == 0

            _, response = client.get("/")

            assert response.json == 1

            _, response = client.get("/")

            assert response.json == 2

As long as you are using the client inside that with context manager, then you will be hitting the exact same instance of your application in each call.

  3. We can simplify the preceding code by using fixtures:

    from sanic_testing.reusable import ReusableClient

    import pytest

    @pytest.fixture

    def test_client(app):

        client = ReusableClient(app, host="localhost", port=9999)

        client.run()

        yield client

        client.stop()

Now, when you set up a unit test, it will keep the server running for as long as the test function is executing.

  4. This unit test could be written as follows:

    def test_reusable_fixture(test_client):

        _, response = test_client.get("/")

        assert response.json == 0

        _, response = test_client.get("/")

        assert response.json == 1

        _, response = test_client.get("/")

        assert response.json == 2

As you can see, this is a potentially powerful strategy if you want to run only a single server for the duration of your test function.

What if you want to keep the instance running for the entire duration of your testing? The simplest way would be to change the scope of the fixture to session:

@pytest.fixture(scope="session")

def test_client():

    # app must now be imported at module scope, since a session-scoped
    # fixture cannot depend on the function-scoped app fixture used earlier
    client = ReusableClient(app, host="localhost", port=9999)

    client.run()

    yield client

    client.stop()

With this setup, no matter where you are running tests in pytest, it will be using the same application. While I personally have never felt the need for this pattern, I can definitely see its utility.

The code for this example can be found in the GitHub repository: https://github.com/PacktPublishing/Python-Web-Development-with-Sanic/tree/main/Chapter09/testing4.

With both proper exception management and testing out of the way, the next critical addition to any true professional application is logging.

Gaining insight from logging and tracing

When it comes to logging, I think that most Python developers fall into three main categories:

  • People that always use print statements
  • People that have extremely strong opinions and absurdly complex logging setups
  • People that know they should not use print but do not have the time or energy to understand Python's logging module

If you fall into the second category, you might as well skip this section. There is nothing in it for you except if you want to criticize my solutions and tell me there is a better way.

If you fall into the first category, then you really need to learn to change your habits. Don't get me wrong, print is fantastic. However, it does not have a place in professional-grade web applications because it does not provide the flexibility that the logging module offers. "Wait a minute!" I hear the first-category people shouting already. "If I deploy my application with containers and Kubernetes, it can pick up my print output and redirect it from there." If you are dead set against using logging, then I suppose I might not be able to change your mind. But I am still going to try.

I used to fall into the third category and taking the time to learn about the logging module changed the way I develop. If you are like me, then I hope to finally convince you to make the switch as we break down the mystery of Python logging.

Leaving aside the configuration complexity, consider that the logging module provides a rich API to send messages at different levels and with meta context. If you want to take a giant leap forward from an amateur to a professional, then I suggest that you change from print to logging.

Let's examine the standard Sanic access logs. The message that the access logger sends out is actually blank. Take a look for yourself in the Sanic codebase if you want. The access log is this:

access_logger.info("")

The message is an empty string. What you actually see is something more like this:

[2021-10-21 09:39:14 +0300] - (sanic.access)[INFO][127.0.0.1:58388]: GET http://localhost:9999/  200 13

How does the logged message contain all of that data when it was given an empty string? Embedded in that line is a bunch of metadata that is both machine-friendly and human-readable, thanks to the logging module. In fact, you can attach arbitrary data to your logs, which some logging configurations will store for you, something like this:

log.info("Some message", extra={"arbitrary": "data"})
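Under the hood, the extra mapping becomes attributes on the LogRecord, so a formatter can interpolate it. Here is a self-contained sketch; the arbitrary field name is just an example, and note that every record passing through this formatter must then supply it:

```python
import logging

demo_logger = logging.getLogger("extra-demo")
handler = logging.StreamHandler()
# %(arbitrary)s is filled from the extra mapping on each record
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s [%(arbitrary)s]"))
demo_logger.addHandler(handler)
demo_logger.setLevel(logging.INFO)

demo_logger.info("Some message", extra={"arbitrary": "data"})
# prints: INFO Some message [data]
```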

If I have convinced you and you want to learn more about how to use logging in Sanic, let's continue.

Types of Sanic loggers

Sanic ships with three loggers. You can access all of them in the log module:

from sanic.log import access_logger, error_logger, logger

Feel free to use these in your own applications. Especially in smaller projects, I will often use the Sanic logger object for convenience. These are, of course, actually intended for use by Sanic itself, but nothing is stopping you from using them. In fact, it might be convenient as you know that all of your logs are formatted consistently. My only word of caution is that it's best to leave the access_logger object alone since it has a highly specific job.

Why would you want to use both error_logger and a regular logger? I think the answer depends upon what you want to happen to your logs. There are many options to choose from. The simplest form is obviously just to output to the console. This is not a great idea for error logs, however, since you have no way to persist the message and review them when something bad happens. Therefore, you might take the next step and output your error_logger to a file. This, of course, could become cumbersome, so you might decide instead to use a third-party system to ship off your logs to another application to store and make them accessible. Whatever setup you desire, using multiple loggers may play a particular role in how the logging messages are handled and distributed.
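As a rough sketch of that separation (the logger names and file path are illustrative, not from the chapter), the general logger can stay on the console while the error logger persists its messages to a file:

```python
import logging

general = logging.getLogger("myapp.general")
errors = logging.getLogger("myapp.errors")

general.setLevel(logging.INFO)
general.addHandler(logging.StreamHandler())           # console only

errors.setLevel(logging.ERROR)
errors.addHandler(logging.FileHandler("errors.log"))  # persisted to disk

general.info("normal traffic")          # shows up in the console
errors.error("something bad happened")  # lands in errors.log for later review
```

In production, the FileHandler would typically be swapped for a rotating handler or a third-party shipper, but the shape stays the same.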

Creating your own loggers, my first step in application development

When I approach a new project, one of the things I ask myself is what will happen with my production logs? This is, of course, a question highly dependent upon your application, and you will need to decide this for yourself. Asking the question, though, highlights a very important point: there is a distinction between development logs and production logs. More often than not, I have no clue what I want to do with them in production yet. We can defer that question for another day.

Before I even begin writing my application, I will create a logging framework. I know that the goal is to have two sets of configurations, so I begin with my development logs.

I want to emphasize this again: the very first step in building an application is to make a super-simple framework for standing up an application with logging. So, let's go through that setup process now:

  1. The first thing we are going to do is make a super-basic scaffold following the patterns that we established in Chapter 2, Organizing a Project:

    .

    ├── Dockerfile

    ├── myapp

    │   ├── common

    │   │   ├── __init__.py

    │   │   └── log.py

    │   ├── __init__.py

    │   └── server.py

    └── tests

This is the application structure that I like to work with because it is very easy to develop on. Using this structure, we can easily create a development environment focused upon running the application locally, testing the application, logging, and building images. Here, we are obviously concerned with running the application locally with logging.

  2. The next thing I like to create is my application factory with a dummy route on it that I will remove later. This is how we can begin server.py. We will continue to add to it:

    from sanic import Sanic, text

    from myapp.common.log import setup_logging, app_logger

    def create_app():

        app = Sanic(__name__)

        setup_logging()

        @app.route("")

        async def dummy(_):

            app_logger.debug("This is a DEBUG message")

            app_logger.info("This is a INFO message")

            app_logger.warning("This is a WARNING message")

            app_logger.error("This is a ERROR message")

            app_logger.critical("This is a CRITICAL message")

            return text("")

        return app

There is a very important reason that I call setup_logging after creating my app instance. I want to be able to use the configuration logic from Sanic to load environment variables that may be used in creating my logging setup.

Here's a quick aside that I want to point out before continuing. There are two different camps when it comes to creating a Python logger object. One side says that it is best practice to create a new logger in every module. In this scenario, you would put the following code at the top of every single Python file:

from logging import getLogger

logger = getLogger(__name__)

The benefit of this approach is having the module name of where it was created closely related to the logger name. This is certainly helpful in tracking down where a log came from. The other camp, however, says that it should be a single global variable that is imported and reused since that may be easier to configure and control. Besides, we can specifically target filenames and line numbers quickly with proper log formatting, so it is unnecessary to include the module name in the logger name. While I do not discredit the localized, per-module approach, I too prefer the simplicity of importing a single instance like this:

from logging import getLogger

logger = getLogger("myapplogger")

If you dive really deep into logging, this also provides you with a much greater ability to control how different logger instances operate. Similar to the conversation about exception handlers, I would rather limit the number of instances I need to control. In the example that I just showed for server.py, I chose the second option to use a single global logging instance. This is a personal choice and there is no wrong answer in my opinion. There are benefits and detriments of both strategies, so choose which makes sense to you.
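A detail worth knowing with either approach: logging.getLogger is a registry keyed by name, so repeated calls with the same name hand back the very same logger object. This is what makes the single-global-instance style safe to use across modules:

```python
from logging import getLogger

a = getLogger("myapplogger")
b = getLogger("myapplogger")

# Both names point at one shared object, so handlers configured
# through either reference apply everywhere the name is used
assert a is b
```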

  3. The next step is to create the basic log.py. For now, let's keep it super simple, and we will build from there:

    import logging

    app_logger = logging.getLogger("myapplogger")

    def setup_logging():

        ...

  4. With this in place, we are ready to run the application and test it out. But wait! Where is the app that we pass to our sanic command?

We previously used this to run our application:

$ sanic src.server:app -p 7777 --debug --workers=2

Instead, we will tell the Sanic CLI the location of the create_app function and let it run that for us. Change your startup to this:

$ sanic myapp.server:create_app --factory -p 7777 --debug --workers=2

You should now be able to hit your endpoint and see some basic messages output to your terminal. You likely will not see the DEBUG and INFO messages since, without further configuration, the logger still only handles WARNING and above. You should see something basic like this:

This is a WARNING message

This is a ERROR message

This is a CRITICAL message

Awesome, we now have the basics of logging down. Next, we will look to see how we can inject some more helpful information into our logs.

Configuring logging

The preceding logging messages are exactly what using print could provide. The next thing that we need to add is some configuration that will output some metadata and format the messages. It is important to keep in mind that some logging details may need to be customized to suit the production environment:

  1. We, therefore, will start by creating a simple configuration:

    DEFAULT_LOGGING_FORMAT = "[%(asctime)s] [%(levelname)s] [%(filename)s:%(lineno)s] %(message)s"

    def setup_logging(app: Sanic):

        formatter = logging.Formatter(

            fmt=app.config.get("LOGGING_FORMAT", DEFAULT_LOGGING_FORMAT),

            datefmt="%Y-%m-%d %H:%M:%S %z",

        )

        handler = logging.StreamHandler()

        handler.setFormatter(formatter)

        app_logger.addHandler(handler)

Make sure to note that we changed the function signature of setup_logging so that it now takes the application instance as an argument. Be sure to go back and update your server.py file to reflect this change.

As a side note, sometimes you might want to simplify your logging to force Sanic to use the same handlers. While you can certainly go through the process of updating the Sanic logger configuration (see https://sanic.dev/en/guide/best-practices/logging.html#changing-sanic-loggers), I find that to be much too tedious. A simpler approach is to set up the logging handlers and then simply apply them to the Sanic loggers, as follows:

from sanic.log import logger, error_logger

def setup_logging(app: Sanic):

    ...

    logger.handlers = app_logger.handlers

    error_logger.handlers = app_logger.handlers

It is good practice to always have a StreamHandler. This will be used to output your logs to the console. But what if we want to add some additional logging utilities for production? Since we are not 100% sure yet what our production requirements will be, we will set up logging to a file for now. This can always be swapped out at another time.

  2. Change your log.py to look like this:

    def setup_logging(app: Sanic):
        formatter = logging.Formatter(
            fmt=app.config.get("LOGGING_FORMAT", DEFAULT_LOGGING_FORMAT),
            datefmt="%Y-%m-%d %H:%M:%S %z",
        )
        handler = logging.StreamHandler()
        handler.setFormatter(formatter)
        app_logger.addHandler(handler)

        if app.config.get("ENVIRONMENT", "local") == "production":
            file_handler = logging.FileHandler("output.log")
            file_handler.setFormatter(formatter)
            app_logger.addHandler(file_handler)

You can easily see how this could be configured with a different kind of logging handler or formatting that might more closely match your needs in different environments.

All of the configurations shown so far control the logging instances programmatically. One of the great flexibilities of the logging library is that all of this can instead be driven by a single dict configuration object. You will therefore find it a very common practice to keep YAML files containing logging configurations. These files are easy to update and swap in and out of build environments to control production settings.
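As a hedged sketch of what such a dict looks like, here is a declarative equivalent of the handler and formatter setup from earlier. The logger name myapp is my assumption; in practice, this dict would typically be loaded from a per-environment YAML file:

```python
import logging
import logging.config

# Declarative equivalent of the programmatic setup above.
# The logger name "myapp" is an assumption for illustration.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "default": {
            "format": "[%(asctime)s] [%(levelname)s] [%(filename)s:%(lineno)s] %(message)s",
            "datefmt": "%Y-%m-%d %H:%M:%S %z",
        },
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "default",
        },
    },
    "loggers": {
        "myapp": {"handlers": ["console"], "level": "DEBUG"},
    },
}

logging.config.dictConfig(LOGGING_CONFIG)
logger = logging.getLogger("myapp")
```

Loading a different dict per environment then becomes a deployment concern rather than a code change.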

Adding color context

The preceding setup is entirely functional, and you could stop there. However, to me, this is not enough. When I am developing a web application, I always have my terminal open spitting out logs. In a sea of messages, it might be hard to sift through all of the text. How can we make this better? We will achieve this through the appropriate use of color.

Because I generally do not need to add color to my production output, we will go through adding color formatting in my local environment only:

  1. We will begin by setting up a custom logging formatter that will add colors based upon the logging level. Any debug messages are blue, warnings are yellow, errors are red, and a critical message will be red with a white background to help them stand out (in a dark-colored terminal):

    class ColorFormatter(logging.Formatter):
        COLORS = {
            "DEBUG": "\033[34m",
            "WARNING": "\033[01;33m",
            "ERROR": "\033[01;31m",
            "CRITICAL": "\033[02;47m\033[01;31m",
        }

        def format(self, record) -> str:
            prefix = self.COLORS.get(record.levelname)
            message = super().format(record)
            if prefix:
                message = f"{prefix}{message}\033[0m"
            return message

We are using the standard color escape codes that most terminals understand to apply the colors. This will color the entire message. You, of course, could get much fancier by coloring only parts of your messages, and if that interests you, I suggest you play around with this formatter to see what you can achieve.

  2. After we create this, we will make a quick internal function to decide which formatter to use:

    import sys

    def _get_formatter(is_local, fmt, datefmt):
        formatter_type = logging.Formatter
        if is_local and sys.stdout.isatty():
            formatter_type = ColorFormatter
        return formatter_type(
            fmt=fmt,
            datefmt=datefmt,
        )

If we are in a local environment and attached to a TTY terminal, then we use our color formatter.

  3. We need to change the start of our setup_logging function to account for these changes. We will also abstract some more details to our configuration for easy access to change them per environment:

    DEFAULT_LOGGING_FORMAT = "[%(asctime)s] [%(levelname)s] [%(filename)s:%(lineno)s] %(message)s"
    DEFAULT_LOGGING_DATEFORMAT = "%Y-%m-%d %H:%M:%S %z"

    def setup_logging(app: Sanic):
        environment = app.config.get("ENVIRONMENT", "local")
        logging_level = app.config.get(
            "LOGGING_LEVEL", logging.DEBUG if environment == "local" else logging.INFO
        )
        fmt = app.config.get("LOGGING_FORMAT", DEFAULT_LOGGING_FORMAT)
        datefmt = app.config.get("LOGGING_DATEFORMAT", DEFAULT_LOGGING_DATEFORMAT)
        formatter = _get_formatter(environment == "local", fmt, datefmt)
        ...

Besides dynamically getting a formatter, this example adds another new piece to the puzzle. It is using a configuration value to determine the logging level of your logger.
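To see why that level matters, here is a quick standalone demonstration: records below the logger's configured level are discarded before they ever reach a handler, which is exactly what keeps debug chatter out of production logs.

```python
import io
import logging

# Capture log output in a string buffer so we can inspect it
stream = io.StringIO()
logger = logging.getLogger("level-demo")
logger.setLevel(logging.INFO)  # the level production would use
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))
logger.addHandler(handler)

logger.debug("This is a DEBUG message")  # dropped: DEBUG < INFO
logger.info("This is a INFO message")    # emitted

output = stream.getvalue()
print(output.strip())  # [INFO] This is a INFO message
```

Flipping the level to DEBUG locally brings the chattier messages back without touching any handler code.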

Adding some basic tracing with request IDs

A common problem with logs is that they can become noisy. It might be tough to correlate a specific log with a specific request. For example, you might be handling multiple requests at the same time. If there is an error, and you want to look back at earlier messages, how do you know which logs should be grouped together?

There are entire third-party applications that add what is known as tracing. This is particularly helpful if you are building out a system of inter-related microservices that work together to respond to incoming requests. While we're not necessarily diving into microservice architecture, it is worth mentioning here that tracing is an important concept that should be added to your application. This is true regardless of whether your application architecture uses microservices or not.

For our purpose, what we want to achieve is to add a request identifier to every single request. Whenever that request attempts to log something, that identifier will automatically be injected into our request format. In order to accomplish this goal, we do the following:

  1. First, we need a mechanism to inject the request object into every logging operation.
  2. Second, we need a way to show the identifier whether it exists or ignore it if it does not.

Before we get to the code implementation, I would like to point out that the second part could be handled in a couple of ways. The simplest might be to create a specific logger that will only be used inside of a request context. This means that you would have one logger that is used in startup and shutdown operations, and another that is used only for requests. I have seen this approach used well.

The problem is that we are again using multiple loggers. To be entirely honest, I really do prefer the simplicity of having just a single instance that works for all of my use cases. This way, I do not need to bother thinking about which logger I should reach for. Therefore, I will show you here how to build option two: an omni-logger that can be used anywhere in your application. If you instead prefer the more targeted types, then I challenge you to take my concepts here and build out two loggers instead of one.

We will get started by tackling the issue of passing the request context. Remember, because Sanic operates asynchronously, there is no way to guarantee which request will be handled in what order. Luckily, the Python standard library has a utility that works great with asyncio. It is the contextvars module. What we will do to start is create a listener that sets up a context that we can use to share our request object and pass it to the logging framework:

  1. Create a file called ./middleware/request_context.py. It should look like this:

    from contextvars import ContextVar

    from sanic import Request, Sanic

    app = Sanic.get_app()

    @app.after_server_start
    async def setup_request_context(app: Sanic, _):
        app.ctx.request = ContextVar("request")

    @app.on_request
    async def attach_request(request: Request):
        request.app.ctx.request.set(request)

What is happening here is that we are creating a context object that can be accessed from anywhere that has access to our app. Then, on every single request, we attach the current request object to that context variable so that it can be retrieved from anywhere the application instance is accessible.
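If contextvars is new to you, this minimal sketch (independent of Sanic; the names are illustrative only) shows the property we are relying on: each asyncio task sees its own copy of the context, so concurrently handled requests cannot clobber each other's value:

```python
import asyncio
from contextvars import ContextVar

# Stand-in for the per-request context variable
current_request = ContextVar("request")

async def handle(request_id: str) -> str:
    current_request.set(request_id)  # like our on_request middleware
    await asyncio.sleep(0.01)        # yield so the two tasks interleave
    return current_request.get()     # still this task's own value

async def main():
    # Two "requests" handled concurrently
    return await asyncio.gather(handle("req-1"), handle("req-2"))

results = asyncio.run(main())
print(results)  # ['req-1', 'req-2']
```

Each task spawned by gather runs in a copy of the current context, which is why the two values never bleed into each other despite the interleaving.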

  2. The next thing that needs to happen is to create a logging filter that will grab the request (if it exists) and add it to our logging record. In order to do this, we will actually override Python's function that creates logging records in our log.py file:

    old_factory = logging.getLogRecordFactory()

    def _record_factory(*args, app, **kwargs):
        record = old_factory(*args, **kwargs)
        record.request_info = ""
        if hasattr(app.ctx, "request"):
            request = app.ctx.request.get(None)
            if request:
                display = " ".join([str(request.id), request.method, request.path])
                record.request_info = f"[{display}] "
        return record

Notice that we stash the default record factory because we still want to make use of it. Then, when our replacement runs, it checks whether there is a current request by looking inside the request context we just set up.

  3. We also need to update our format to use this new bit of information. Make sure to update this value:

    DEFAULT_LOGGING_FORMAT = "[%(asctime)s] [%(levelname)s] [%(filename)s:%(lineno)s] %(request_info)s%(message)s"

  4. Finally, we can inject the new factory as shown:

    from functools import partial

    def setup_logging(app: Sanic):
        ...
        logging.setLogRecordFactory(partial(_record_factory, app=app))

Feel free to check this book's GitHub repository to make sure that your log.py looks like mine: https://github.com/PacktPublishing/Python-Web-Development-with-Sanic/tree/main/Chapter09/tracing/myapp.

  5. With all of this in place, it is time to hit our endpoint. You should now see some nice, pretty colors in your terminal, and some request information inserted:

    [2021-10-21 12:22:48 +0300] [DEBUG] [server.py:12] [b5e7da51-68b0-4add-a850-9855c0a16814 GET /] This is a DEBUG message
    [2021-10-21 12:22:48 +0300] [INFO] [server.py:13] [b5e7da51-68b0-4add-a850-9855c0a16814 GET /] This is a INFO message
    [2021-10-21 12:22:48 +0300] [WARNING] [server.py:14] [b5e7da51-68b0-4add-a850-9855c0a16814 GET /] This is a WARNING message
    [2021-10-21 12:22:48 +0300] [ERROR] [server.py:15] [b5e7da51-68b0-4add-a850-9855c0a16814 GET /] This is a ERROR message
    [2021-10-21 12:22:48 +0300] [CRITICAL] [server.py:16] [b5e7da51-68b0-4add-a850-9855c0a16814 GET /] This is a CRITICAL message

After running through these examples, one thing you might have noticed and not seen before is request.id. What is this and where does it come from?

Using X-Request-ID

It is a common practice to use Universally Unique Identifiers (UUIDs) to track requests. This makes it very easy for client applications to track requests as well and to correlate their activity with the server's. This is why you will often hear them called correlation IDs; the two terms mean the same thing.

As a part of the practice of correlating requests, many client applications will send an X-Request-ID header. If Sanic sees that header in an incoming request, then it will grab that ID and use it to identify the request. If not, then it will automatically generate a UUID for you. Therefore, you should be able to send the following request to our logging application and see that ID populated in the logs:

$ curl localhost:7777 -H 'x-request-id: abc123'

For the sake of simplicity, I am not using a UUID.

Your logs should now reflect this:

[2021-10-21 12:36:00 +0300] [DEBUG] [server.py:12] [abc123 GET /] This is a DEBUG message
[2021-10-21 12:36:00 +0300] [INFO] [server.py:13] [abc123 GET /] This is a INFO message
[2021-10-21 12:36:00 +0300] [WARNING] [server.py:14] [abc123 GET /] This is a WARNING message
[2021-10-21 12:36:00 +0300] [ERROR] [server.py:15] [abc123 GET /] This is a ERROR message
[2021-10-21 12:36:00 +0300] [CRITICAL] [server.py:16] [abc123 GET /] This is a CRITICAL message
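The header-or-generate behavior can be sketched roughly like this. To be clear, this is a simplification for illustration only; the real logic lives on Sanic's Request object:

```python
import uuid

# Illustrative sketch: reuse the client's correlation ID if one was
# sent, otherwise mint a fresh UUID for this request
def resolve_request_id(headers: dict) -> str:
    supplied = headers.get("x-request-id")
    if supplied:
        return supplied
    return str(uuid.uuid4())

print(resolve_request_id({"x-request-id": "abc123"}))  # abc123
print(len(resolve_request_id({})))  # 36, the length of a UUID string
```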

Logging is a critical component of professional-grade web applications. It really does not need to be that complicated. I have seen super lengthy and overly verbose configurations that quite honestly scared me away. With a little bit of attention to detail, however, you can make a truly fantastic logging experience without much effort. I encourage you to grab the source code for this and hack it until it meets your needs.

We'll next turn our attention to another critical component of web applications: database management.

Managing database connections

Above all else, this book hopes to give you the confidence to build applications your way. This means we are actively looking to stomp out copy/paste development. You know what I mean: you go to Stack Overflow or some other website, copy code, paste it, and then move on with your day without thinking twice about it.

This sort of copy/paste mentality is perhaps most prevalent when it comes to database connections. Time for a challenge. Go start up a new Sanic app and connect it to a database. Some developers might approach this challenge by heading to some other codebase (from another project, an article, documentation, or a help website), copying some basic connection functions, changing the credentials, and calling it a day. They may never have put much thought into what it means to connect to a database: if it works, then it must be okay. I know I certainly did that for a long time.

This is not what we are doing here. Instead, we will consider a couple of common scenarios, think through our concerns, and develop a solution around them.

To ORM or not to ORM, that is the question

For the benefit of anyone who does not know what an ORM is, here is a quick definition.

An ORM (object-relational mapping) framework is used to build Python-native objects. Those objects map directly to a database schema and are also used to build the queries that fetch data from the database when constructing the Python objects. In other words, an ORM is a data access layer capable of two-way translation between Python and the database. When people talk about an ORM, they are typically referring to one intended for use with an SQL-based database.

The question of whether or not to use an ORM invites strong opinions. In some contexts, people might think you are living in the Stone Age if you hand-write your SQL queries instead of using one. On the other hand, some people consider an ORM a nuisance that leads to queries that are at once overly simplistic and grotesquely complicated and inefficient. I suppose, to an extent, both groups are correct.

Realistically, I cannot tell you what you should or should not do. The implementation details and the use case are highly relevant to any decision. In my projects, I tend to shy away from it. I like to use the databases project (https://github.com/encode/databases) to build custom SQL queries, and then map the results to dataclass objects. After handcrafting my SQL, I use some utilities to hydrate the results from raw, unstructured values into schema-defined Python objects. I have also, in the past, made extensive use of ORM tools such as peewee (https://github.com/coleifer/peewee) and SQLAlchemy (https://github.com/sqlalchemy/sqlalchemy). And, of course, since I developed in Django for many years, I have done a lot of work in its internal ORM tool.
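To make that hand-rolled approach concrete, mapping raw query results into dataclasses can be sketched like this. The Trail model and the row shape are invented for illustration; the real project hydrates rows returned by the databases package in much the same way:

```python
from dataclasses import dataclass

# Hypothetical model for illustration only
@dataclass
class Trail:
    name: str
    distance_km: float

def hydrate(rows):
    # Each row is a mapping of column name to value, as a database
    # driver would return it
    return [Trail(**row) for row in rows]

rows = [
    {"name": "Skyline Loop", "distance_km": 8.9},
    {"name": "Wonderland", "distance_km": 150.0},
]
trails = hydrate(rows)
print(trails[0])  # Trail(name='Skyline Loop', distance_km=8.9)
```

The appeal is that the SQL stays fully under your control while the rest of the application works with typed, schema-defined objects.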

When should you use ORM? First and foremost, for most projects, using ORM should probably be the default option. ORM tools are great at adding the required safety and security to make sure that you do not accidentally introduce a security bug. By enforcing types, they can be extremely beneficial in maintaining data integrity. And, of course, there is the benefit of abstracting away a lot of the database knowledge. Where ORM falls short, perhaps, is in its ability to handle complexity. As a project grows in the number of tables and interconnected relationships, it may be more difficult to continue using ORM. There also are a lot of more advanced options in SQL languages, such as PostgreSQL, that you simply cannot accomplish by using an ORM tool to build your queries. I find them to really shine in more simplistic create/read/update/delete (CRUD) applications, but actually get in the way of more complex database schemas.

Another potential downside to an ORM is that it makes it super easy to sabotage your own project. A little mistake in building an inefficient query could be the difference between absurdly long response times and super-fast responses. Speaking from experience as someone who was bitten by this bug, I find that applications built with ORM tools tend to over-fetch data and inefficiently run more network calls than are needed. If you feel comfortable with SQL and know that your data will become fairly complicated, then perhaps you are better off writing your own SQL queries. The biggest benefit of hand-crafted SQL is that it overcomes the complexity-scaling issue of an ORM.

Even though this book is not about SQL, after much consideration, I think the best use of our time is to build a custom data layer and not use an off-the-shelf ORM tool. This option will force us into making good choices about maintaining our connection pools and developing secure and practical SQL queries. Moreover, anything that is discussed here in regard to implementation can easily be swapped out to a more fully featured ORM tool. If you are more familiar and comfortable with SQLAlchemy (which now has async support), then feel free to swap out my code accordingly.

Creating a custom data access layer in Sanic

When deciding upon which strategy to use for this book, I explored a lot of the options out there. I looked at all of the popular ORM tools that I see people using with Sanic. Some options, such as SQLAlchemy, have so much material out there that I could not possibly do it justice. Other options encouraged lower-quality patterns. Therefore, we turn to one of my favorites, the databases package, using asyncpg to connect to Postgres (my relational database of choice). The goal will be to implement good connection management, provide a simple and intuitive pattern to query the data, and output a set of models that will make building applications easier and more consistent.

I highly encourage you to look at the code in the repository at https://github.com/PacktPublishing/Python-Web-Development-with-Sanic/tree/main/Chapter09/hikingapp. This is one of the first times that we have created a complete application. By that, I mean an example of an application that goes out to fetch some data. Going back to the discussion from Chapter 2, Organizing a Project, about project layout, you will see an example of how we might structure a real-world application. There is also a lot going on in there that is somewhat outside of the scope of the discussion here (which is much more narrowly focused on database connections), so we will not dive too deeply into it right now. But do not worry, we will come back to the application's patterns again in Chapter 11, A Complete Real-World Example, when we build out a full application.

In the meantime, it might be a good opportunity for you to review that source code now. Try to understand how the project is structured, run it, and then test out some of the endpoints. Instructions are in the repository: https://github.com/PacktPublishing/Python-Web-Development-with-Sanic/blob/main/Chapter09/hikingapp/README.md.

I also would like to point out that since our applications are growing with the addition of another service, I am going to start running the application using docker-compose and Docker containers locally. All the build materials are in the GitHub repository for you to copy for your own needs. But, of course, you would not dare to just copy and paste the code without actually understanding it, so let's make sure that you do.

The application we are talking about is a web API for storing details about hiking. It connects its database of known hiking trails to users who can keep track of the total distance they have hiked and when they hiked certain trails. When you spin up the database, there should be some information prepopulated for you.

The first thing that we must do is make sure that our connection details are coming from environment variables. Never store them in the project files. Besides the security concerns associated with this, it is super helpful to make changes by redeploying your application with different values if you need to change the size of your connection pool or rotate your passwords. Let's begin:

  1. Store your connection settings using docker-compose, Kubernetes, or any other tool you are using to run your containers. If you are not running Sanic in a container (for example, you plan to deploy to a PaaS solution that offers environment variables for you through a graphical user interface (GUI)), an option that I like to use for local development is dotenv (https://github.com/theskumar/python-dotenv).

The config values that we care about right now are the data source name (DSN) and the connection pool settings. If you are not familiar with a DSN, it is a string that contains all of the information needed to connect to a database in a form that might look familiar to you as a URL.
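For instance, a Postgres DSN packs the credentials, host, port, and database name into a single URL-shaped string, and the standard library can pick it apart. The credentials and hostname below are placeholders:

```python
from urllib.parse import urlparse

# Placeholder DSN; in practice this value comes from an environment
# variable, never from the project files
dsn = "postgres://dbuser:secret@db.example.com:5432/hiking"
parts = urlparse(dsn)

print(parts.scheme)            # postgres
print(parts.hostname)          # db.example.com
print(parts.port)              # 5432
print(parts.path.lstrip("/"))  # hiking
```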

A connection pool is an object that holds open a set number of connections to your database, and then allows your application to make use of those connections as needed. Imagine a scenario without a connection pool where a web request comes in and then your application goes and opens a network socket to your database. It fetches information, serializes it, and sends it back to the client. But it also closes that connection because there is no common pool to draw from. The next time that a request comes to your application, it will need to reopen a connection to the same database. This is hugely inefficient. Instead, your application can warm up several connections by opening them and holding them in reserve in the common pool.

  2. Then, when the application needs a connection, it can draw one from the pool instead of opening a new one. Connect to your database through a connection pool and store that object in your application's ctx:

    # ./application/hiking/worker/postgres.py
    from databases import Database
    from sanic import Sanic

    app = Sanic.get_app()

    @app.before_server_start
    async def setup_postgres(app: Sanic, _):
        app.ctx.postgres = Database(
            app.config.POSTGRES_DSN,
            min_size=app.config.POSTGRES_MIN,
            max_size=app.config.POSTGRES_MAX,
        )

    @app.after_server_start
    async def connect_postgres(app: Sanic, _):
        await app.ctx.postgres.connect()

    @app.after_server_stop
    async def shutdown_postgres(app: Sanic, _):
        await app.ctx.postgres.disconnect()

As you can see, three main things are happening:

  1. The first is we are creating the Database object, which stores our connection pool and acts as the interface for querying. We store it in the app.ctx object so that it will be easily accessible from anywhere in the application. This was placed inside of the before_server_start listener since it alters the state of our application.
  2. The second is that the listener actually opens up the connections to the database and holds them at the ready until they are needed. We are warming up the connection pool prematurely so that we do not need to spend the overhead at query time.
  3. Lastly, of course, the important step we do is to make sure that our application properly shuts down its connections.
  3. The next thing we need to do is create our endpoints. In this example, we will use class-based views:

    from sanic import Blueprint, json, Request
    from sanic.views import HTTPMethodView

    from .executor import TrailExecutor

    bp = Blueprint("Trails", url_prefix="/trails")

    class TrailListView(HTTPMethodView, attach=bp):
        async def get(self, request: Request):
            executor = TrailExecutor(request.app.ctx.postgres)
            trails = await executor.get_all_trails()
            return json({"trails": trails})

Here, the GET endpoint on the root level of the /trails endpoint is meant to provide a list of all trails in the database (forgetting about pagination). TrailExecutor is one of those objects that I do not want to dive too deeply into right now. But, as you can probably guess from this code, it takes the instance of our database (which we initiated in the last step) and provides methods to fetch data from the database.

One of the reasons that I really like the databases package is that it makes it incredibly easy to handle connection pooling and session management. It basically does it all for you under the hood. But one thing that is a good habit to get into (regardless of what system you are using) is to wrap multiple consecutive writes to your database in a transaction.

Imagine that you needed to do something like this:

executor = FoobarExecutor(app.ctx.postgres)
await executor.update_foo(value=3.141593)
await executor.update_bar(value=1.618034)

Often, when you have multiple database writes in a single function, you want either all of them to succeed or all of them to fail. A mixture of successes and failures might, for example, leave your application in a bad state. When you identify situations like this, it is almost always beneficial to nest your operations inside of a single transaction. To implement such a transaction within our sample, it would look something like this:

executor = FoobarExecutor(app.ctx.postgres)

async with app.ctx.postgres.transaction():
    await executor.update_foo(value=3.141593)
    await executor.update_bar(value=1.618034)

Now, if one of the queries fails for whatever reason, the database state will be rolled back to where it was before the change. I highly encourage you to adopt a similar practice no matter what framework you use for connecting to your database.
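The all-or-nothing behavior is easy to demonstrate with an in-memory SQLite database, standing in for Postgres here purely for illustration:

```python
import sqlite3

# In-memory database to illustrate transactional rollback
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (foo REAL, bar REAL)")
conn.execute("INSERT INTO pairs VALUES (1.0, 2.0)")
conn.commit()

try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("UPDATE pairs SET foo = 3.141593")
        raise RuntimeError("second write failed")  # simulate a failure mid-transaction
except RuntimeError:
    pass

foo, bar = conn.execute("SELECT foo, bar FROM pairs").fetchone()
print(foo, bar)  # 1.0 2.0 -- the first UPDATE was rolled back
```

Because the simulated failure happened inside the transaction, the earlier UPDATE never became visible; the table is exactly as it was before.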

Of course, a discussion of databases is not necessarily limited to SQL databases. There are plenty of NoSQL options out there, and you, of course, should figure out what works for your needs. We will next take a look at connecting my personal favorite database option to Sanic: Redis.

Connecting Sanic to Redis

Redis is a blazingly fast and simple database to work with. Many people think of it simply as a key/value store, which is something it does extremely well. But it also offers a number of features that can be thought of as shared primitive data types. For example, Redis has hashes, lists, and sets, which correspond nicely to Python's dict, list, and set. It is for this reason that I often recommend it as a solution for anyone who needs to share data across a horizontal scale-out.

In our example, we will use Redis as a caching layer. For this, we are relying upon its hashmap capability to store a dict-like structure with details about a response. We have an endpoint that might take several seconds to generate a response. Let's simulate that now:

  1. First, create a route that will take a while to generate a response:

    import asyncio
    import random

    @app.get("/slow")
    async def wow_super_slow(request: Request):
        wait_time = 0
        for _ in range(10):
            t = random.random()
            await asyncio.sleep(t)
            wait_time += t
        return text(f"Wow, that took {wait_time:.2f}s!")

  2. Check to see that it works:

    $ curl localhost:7777/slow
    Wow, that took 5.87s!

The response took 5.87 seconds, which is far too slow. To make subsequent requests faster, we will create a decorator that serves precached responses when they exist:

  1. First, we will install aioredis:

    $ pip install aioredis

  2. Create a database connection pool similar to what we did in the previous section:

    from sanic import Sanic
    import aioredis

    app = Sanic.get_app()

    @app.before_server_start
    async def setup_redis(app: Sanic, _):
        app.ctx.redis_pool = aioredis.BlockingConnectionPool.from_url(
            app.config.REDIS_DSN, max_connections=app.config.REDIS_MAX
        )
        app.ctx.redis = aioredis.Redis(connection_pool=app.ctx.redis_pool)

    @app.after_server_stop
    async def shutdown_redis(app: Sanic, _):
        await app.ctx.redis_pool.disconnect()

  3. Next, we will create a decorator to use with our endpoints:

    from functools import wraps

    from aioredis import Redis
    from sanic import Request
    from sanic.response import raw

    # make_key, get_cached_response, and set_cached_response are small
    # helpers found in this chapter's repository (common/cache.py)

    def cache_response(build_key, exp: int = 60 * 60 * 72):
        def decorator(f):
            @wraps(f)
            async def decorated_function(request: Request, *handler_args, **handler_kwargs):
                cache: Redis = request.app.ctx.redis
                key = make_key(build_key, request)
                if cached_response := await get_cached_response(request, cache, key):
                    response = raw(**cached_response)
                else:
                    response = await f(request, *handler_args, **handler_kwargs)
                    await set_cached_response(response, cache, key, exp)
                return response
            return decorated_function
        return decorator

What is happening here is pretty simple. First, we generate a key that will be used to look up and store values. Then, we check whether anything exists for that key. If yes, we use it to build a response. If no, we execute the actual route handler (which we know takes some time) and cache its response for next time.
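Those helper functions live in the chapter repository, but a make_key could look something like this sketch. Note the simplifications: the key scheme is my invention, and I take a bare path string where the real helper works with the request object:

```python
from hashlib import sha256

# Hypothetical sketch: derive a short, stable cache key from whatever
# distinguishes one response from another (here, just the URL path)
def make_key(prefix: str, path: str) -> str:
    digest = sha256(path.encode("utf-8")).hexdigest()[:16]
    return f"{prefix}:{digest}"

print(make_key("response", "/v1/slow"))
```

The important property is that the same inputs always produce the same key, so a repeat request lands on the cached entry.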

  4. Let's see what we have accomplished in action. First, we will hit the endpoint again. To emphasize my point, I will include some stats from curl:

    $ curl localhost:7777/v1/slow
    Wow, that took 5.67s!
    status=200  size=21 time=5.686937 content-type="text/plain; charset=utf-8"

  5. Now, we will try it again:

    $ curl localhost:7777/v1/slow
    Wow, that took 5.67s!
    status=200  size=21 time=0.004090 content-type="text/plain; charset=utf-8"

Wow! It returned almost instantly! The first attempt took just under 6 seconds to respond. The second, because the information had been stored in Redis, returned an identical response in about 4/1,000 of a second. And don't forget that in those 4 milliseconds, Sanic still went out to fetch the data from Redis. Amazing!

Using Redis as a caching layer is incredibly powerful as it can be used to significantly boost your performance. The flip side—as anyone that has worked with caching before knows—is that you need to have an appropriate use case and a mechanism for invalidating your cache. In this example, it is accomplished in two ways. If you check the source code at GitHub (https://github.com/PacktPublishing/Python-Web-Development-with-Sanic/blob/main/Chapter09/hikingapp/application/hiking/common/cache.py#L43), you will see that we are expiring the value automatically after 72 hours, or if someone sends a ?refresh=1 query argument to the endpoint.

Summary

Since we are past the point of talking about basic concepts in application development, we have graduated to the level of exploring some best practices that I have learned over the years of developing web applications. This is clearly just the tip of the iceberg, but they are some very important foundational practices that I encourage you to adopt. The examples from this chapter could become a great foundation for starting your next web application process.

First, we saw how you can use smart and repeatable exception handling to create a consistent and thoughtful experience for your users. Second, we explored the importance of creating a testable application, and some techniques to make it easily approachable. Third, we discussed implementing logging in both development and production environments, and how you could use those logs to easily debug and trace requests through your application. Finally, we spent time learning how databases could be integrated into your application.

In the next chapter, we will continue to expand upon the basic platform that we have built. You will continue to see a lot of the same patterns (such as logging) in our examples as we look at some common use cases of Sanic.
