Chapter 5. Testing your software

This chapter covers

  • Understanding the anatomy of a test
  • Using different testing approaches for your application
  • Writing tests with the unittest framework
  • Writing tests with the pytest framework
  • Adopting test-driven development

I’ve talked in previous chapters about writing clear code using well-named functions for maintainability, but that’s only part of the picture. As you add feature after feature, can you be sure the application still does what you meant it to? Any application you hope will live on long into the future needs some assurances of its longevity. Tests can help you make sure new features are built correctly, and you can run these tests again each time you update your code to make sure it stays correct.

Testing can be a strict, formal process for applications that must not fail, like launching shuttles and keeping planes in flight. Such tests are rigorous and often mathematically provable. That’s pretty cool, but it goes way beyond what’s necessary or practical for most Python applications. In this chapter, you’ll learn about the methodology and tools Python developers use to test their code, and you’ll get a chance to write some tests yourself.

5.1. What is software testing?

Loosely speaking, software testing is the practice of verifying that software behaves the way you expect. This can range from making sure a function produces the expected output when given a specific input to making sure your application can handle the stress of 100 users at once. As developers, we constantly do some form of this subconsciously. If you’re developing a website, you probably run the server locally and check your changes in the browser as you code. This is a form of testing.

You might think that spending more time validating that your code works means less time shipping software. In the immediate term, this is true, especially as you get acquainted with the tools and processes related to testing. The idea in the long term, though, is that testing will save you time by limiting the recurrence of behavior and performance bugs and by providing a scaffolding you can use to confidently refactor code in the future. The more critical a piece of code is to your business, the more time you’ll want to spend testing it thoroughly.

5.1.1. Does it do what it says on the tin?

One reason to test a piece of software is to determine whether it really does what it claims. A well-named function describes its intent to the reader, but, as they say, the road to hell is paved with good intentions. I can’t count the number of times I wrote a function, fully believing it was faithfully carrying out its intended purpose, only to find out later that I’d made a mistake.

Sometimes these mistakes are easy to catch—a typo or exception in an area of code you’re familiar with might be easy to track down. The trickier bugs to find are those that don’t cause immediate issues but cascade as the application progresses. With good testing, problems can be found early, and you can guard your application from similar issues in the future. A number of categories of testing exist, each focused on identifying particular kinds of problems. I’ll cover a few here, but you can be sure this is not an exhaustive list.

5.1.2. The anatomy of a functional test

You saw earlier that testing can make sure software produces the right output for a given input. This type of testing is called functional testing because it makes sure that a piece of code functions correctly. This is in contrast to other types of testing, such as performance testing, which I’ll cover in section 5.6.

Although functional testing strategies vary in scale and approach, the basic anatomy of a functional test remains consistent. Because they check that software gives the right output based on a given input, all functional tests need to perform a few specific tasks, including the following:

  1. Prepare the inputs to the software.
  2. Identify the expected output of the software.
  3. Obtain the actual output of the software.
  4. Compare the actual and the expected outputs to see if they match.

The preparation of inputs and identification of expected outputs are where most of your work as a developer will be when creating tests, whereas obtaining and comparing the actual output is a matter of executing your code, as shown in figure 5.1.

Figure 5.1. The basic flow of a functional test

Structuring your tests this way has another beneficial effect: you can read your tests as a specification of how the code works. This pays off when you revisit code you wrote long ago (or last week, for me). A good test for a calculate_mean function might read like this:

Given the list of integers [1, 2, 3, 4], the expected output of calculate_mean is 2.5. Verify that the actual output of calculate_mean matches this expectation.
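Written as code, that specification maps directly onto the four steps. Here's a minimal sketch, using a plain equality comparison (you'll meet Python's assert keyword in section 5.3):

def calculate_mean(numbers):
    return sum(numbers) / len(numbers)

inputs = [1, 2, 3, 4]                     # 1. Prepare the inputs
expected_output = 2.5                     # 2. Identify the expected output
actual_output = calculate_mean(inputs)    # 3. Obtain the actual output
print(expected_output == actual_output)   # 4. Compare them; prints True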

This format scales to larger functional workflows. In an e-commerce system, the “input” might be clicking a product and then clicking Add to Cart. The expected “output” is the item being added to the cart. A functional test for that workflow would read like this:

Given I visit the page for product 53-DE-232 and click Add to Cart, I expect to see 53-DE-232 in my cart.

Ultimately, it’s nice when your tests not only verify that your code works, but also act as documentation on how to use it. In the next section, you’ll see how this recipe for writing a functional test applies to some different testing approaches.

5.2. Functional testing approaches

Functional testing takes on many forms in practice. From the constant little checks we do as developers to fully automated tests that get kicked off before every production deployment, there is a spectrum of practices and capabilities. You’ll recognize some of the following types of testing, but I recommend reading about each of them to understand the similarities and differences between them.

5.2.1. Manual testing

Manual testing is the practice of running your application, giving it some inputs, and checking whether it does what you expect. For example, if you’re writing a registration workflow for a website, you would enter a username and password and make sure a new user is created. If you have password requirements, you would want to check that using an invalid password does not create a new user. Similarly, you’d test for the case where a user with the username you choose already exists.

Registering on a website is generally a small (and one-time) part of the product experience for most users, but, as you can see, you already have to verify several cases. If any of these things go wrong, your users either can’t register or might have their account information overwritten. With this code being so important, relying on manual testing for too long will eventually cause you to miss something. Manually exploring the application for new bugs or new things to test is still a valuable activity, but it should be viewed as a supplement to other types of testing.

5.2.2. Automated testing

In contrast to manual testing, automated testing allows you to write a great number of tests that can then be executed as many times as you like, without the risk that you’ll miss a check when you’re trying to leave the office the Friday of a long weekend. If this hypothetical situation seems overly specific, that’s because it’s not hypothetical. I’ve lived it.

Automated testing tightens the feedback loop so that you can see quickly whether a change you’ve made has broken an expected behavior. The time you’ll save compared to manual testing will free you up to do more creative exploratory testing of the application. As you uncover things to fix, you should incorporate them into your automated tests. You can think of this as locking in a verification that will make sure the particular bug doesn’t happen again. Most of the testing you’ll see in the rest of this chapter can be, and often is, automated.

5.2.3. Acceptance testing

Closest in nature to the Add to Cart workflow test, acceptance testing verifies the high-level requirements of a system. Software that passes these tests is acceptable based on the specified requirements. As shown in figure 5.2, acceptance tests answer questions like, “Can the user successfully go through the purchase workflow and buy the product they want?” These are the mission-critical checks for the business—things that keep the lights on.

Figure 5.2. Acceptance tests verify workflows from a user’s perspective.

Acceptance tests are often carried out manually by a business stakeholder, but they can also be automated to a degree with end-to-end testing. End-to-end testing makes sure a set of actions can be carried out (from one end to the other) with the appropriate data flowing through where needed. If the workflow is expressed from the viewpoint of the user, it begins to look almost exactly like the Add to Cart workflow.

Testing is for everyone

Libraries like Cucumber (https://cucumber.io) enable you to describe end-to-end tests in natural language as high-level actions, like “click the Submit button.” These tests are often much easier to understand than a big mess of code. Writing steps in natural language documents the system in a way most anyone in the organization can understand.

This idea of behavior-driven development (BDD) allows you to collaborate with others on end-to-end testing, even if they don’t have experience with software development in a coding capacity. BDD is used in many organizations as a way to define the desired outcomes first, only implementing the code to make the tests pass afterward.

End-to-end tests commonly verify areas of high value for the business—if the cart doesn’t work, no one can buy products, and you lose revenue—but they are also the most susceptible to breaking because they span such a wide swathe of functionality. If any one step in the workflow doesn’t work, the whole end-to-end test fails. Creating a set of tests that vary in granularity will help indicate not only whether the whole workflow is healthy, but also which steps are failing specifically. This allows you to pinpoint problems faster.

End-to-end tests are some of the least granular, so what’s on the other end of the spectrum?

5.2.4. Unit testing

Unit testing is perhaps the most important thing you can take away from this chapter. Unit tests make sure all the little bits of your software are working, and they lay a strong foundation for larger testing efforts like end-to-end testing. I’ll show you how to get started with unit testing in Python in section 5.4.

Definition

A unit is a small, fundamental piece of software—like the “unit” in “unit circle.” What constitutes a unit is the source of much philosophical waxing, but a good working definition is that it’s a piece of code that can be isolated for testing. Functions are generally considered units—they can be executed in isolation by calling them with the appropriate inputs. Lines of code within those functions can’t be isolated, so they’re smaller than a unit. Classes contain many pieces that can be isolated further, so they’re generally bigger than a unit, but they are occasionally treated as units.

Unit testing seeks to verify that all the individual units of code in your application work correctly, that each small piece of the software does what it says it does. These are the most fundamental tests you can write and are therefore a great place to get started with testing.

Functions are the most common target of functional unit tests. “Function” is right there in the name, after all. This is because of functions’ input-output nature. If you’ve separated the concerns of your code into small functions, testing them will be a straightforward application of the functional testing recipe.

It turns out that one of the great benefits of structuring your code using separation of concerns, encapsulation, and loose coupling is that it makes code easier to test. Testing can feel tedious, so any opportunity to reduce friction is welcome. The easier the code is to test, the more likely it is that you’ll write those tests in the first place, so you can reap the reward of confidence in your software. Units are the small, separated pieces you naturally arrive at by sticking with the practices you’ve learned so far.

Most unit tests in Python compare expected and actual outputs using a simple equality comparison. You can do one of these yourself right now. Open the Python REPL and create this calculate_mean function:

>>> def calculate_mean(numbers):
...     return sum(numbers) / len(numbers)

Now you can test your expectations of this function with a few different inputs, comparing the output to your expected results:

>>> 2.5 == calculate_mean([1, 2, 3, 4])
True
>>> 5.5 == calculate_mean([5, 5, 5, 6, 6, 6])
True

Try a few other lists of numbers in the REPL now to verify that calculate_mean is giving the right results. Think of useful sets of inputs that might change the behavior of the function:

  • Does it work correctly with negative numbers?
  • Does it work when the list of numbers contains 0?
  • Does it work when the list is empty?

These kinds of curiosities are worth writing tests for. They occasionally uncover questions you haven’t accounted for in your code, which gives you an opportunity to address those questions before someone finds out the hard way that a particular use case wasn’t considered.

>>> 0.0 == calculate_mean([-1, 0, 1])
True
>>> 0.0 == calculate_mean([])             1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in calculate_mean
ZeroDivisionError: division by zero

  • 1 Raises an exception for a case you haven’t considered yet

You can fix calculate_mean by returning 0 if the list is empty:

>>> def calculate_mean(numbers):
...     if not numbers:
...         return 0
...     return sum(numbers) / len(numbers)
>>> 0.0 == calculate_mean([])
True

Great—calculate_mean has passed all the cases we’ve thrown at it. Remember that unit tests are the foundation that enables success in larger testing efforts, like end-to-end testing. To understand that relationship better, we’ll look at two other testing categories in the following sections.

5.2.5. Integration testing

Whereas unit tests are all about making sure the individual pieces of your code work as expected, integration testing focuses on making sure those units all work in tandem to produce the right behavior (see figure 5.3). You may have 10 fully functional units of software, but if they can’t be put together to do what you want, they aren’t too useful. Whereas end-to-end workflow tests are usually framed from the perspective of a user, integration tests focus more on the behavior of the code. They’re at different levels of abstraction.

Figure 5.3. Integration tests focus on how operations work together.

Integration testing carries several caveats, though. Because integration tests need to thread multiple pieces of code together, it’s common to build tests that are structured much like the code they’re testing. This introduces tight coupling between the tests and the code—changes in the code that produce the same outcome might still cause the tests to break, because the tests are too concerned with how the outcome is achieved.

Integration tests may take significantly longer to execute than unit tests. They generally do more than execute some functions and check the output; they might use a database to create and manipulate records, as an example. The interaction being tested is more complex, so the time required to carry it out can grow. For these reasons, integration tests are usually fewer in number than unit tests.

5.2.6. The testing pyramid

Now that you’ve seen manual, unit, and integration testing, let’s recap the interplay between them. The idea of a testing pyramid like that in figure 5.4 indicates that you should liberally apply functional tests like unit and integration tests, but be more conservative with long, brittle, and manual tests.[1] Each has merit, and your mileage will depend on the application and the resources at your disposal, but it’s a decent rule of thumb about where to invest time.

[1] Testing pyramids were first described by Mike Cohn in Succeeding with Agile (Addison-Wesley Professional, 2009).

Figure 5.4. The testing pyramid

You’ll get the most bang for your buck by making sure the little pieces of software are all working, then making sure they all work together. Again, automating this process will empower you to use the time you’ve freed up to think of new ways your software might break. You can then incorporate those ideas as new tests and slowly build confidence that will carry you forward.

5.2.7. Regression testing

Regression testing is less an approach to testing per se, and more a process to follow as you develop your applications. When you write a test, the assumption is that you’re saying, “I want to make sure the code keeps working this way.” If you change your code in a way that changes the behavior you tested, that would be a regression. A regression is a shift to an undesirable (or at least unexpected) state and is usually A Bad Thing.

Regression testing is the practice of running your existing suite of tests after each code change before shipping your code to production. A test suite is the collection of tests you’ve built up over time, either written to verify code as unit/integration tests or to fix things found in exploratory manual testing. Many development teams run these test suites in a continuous integration (CI) environment, where changes to an application are frequently combined and tested before being released. A full discussion of CI is beyond the scope of this book, but the idea is to set yourself up for success by running all your tests against all your changes. I highly recommend checking out Travis CI (https://docs.travis-ci.com/user/for-beginners/) or CircleCI (https://circleci.com/docs/2.0/about-circleci/) to learn more.

Version control hooks

One practice for automating unit tests in source control systems is using a precommit hook. Each time you commit your code, the hook triggers the tests to run. If any failures occur, the commit fails, and you’re reminded to fix them before committing your code. Most unit-testing tools should integrate with this approach pretty well. Running the tests again in a continuous integration environment makes sure that they pass just before the code is deployed.
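As an illustration, a pre-commit hook is just an executable file at .git/hooks/pre-commit. A minimal sketch in Python (assuming a unittest-based suite) might look like this:

#!/usr/bin/env python3
# .git/hooks/pre-commit: run the test suite before each commit.
import subprocess
import sys

result = subprocess.run([sys.executable, '-m', 'unittest'])
sys.exit(result.returncode)   # A nonzero exit code aborts the commit.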

As new features are added, new tests get added to the test suite. These get locked in as regression tests for future changes. Similarly, it’s common to add tests for bugs that you find, so that you can build confidence that a particular bug won’t reoccur. Like code, test suites won’t always be perfect. But leaning on a robust suite to tell you when things go awry can help you focus on other areas, like innovation and performance.

With that, let’s see how you can start writing tests in Python.

5.3. Statements of fact

The next step toward creating real tests is to assert that a particular comparison holds true. Assertions are statements of fact; if you make an assertion that doesn’t hold true, either some assumption you’ve made is incorrect or the assertion itself is incorrect. If you assert that “you can see the sun on the horizon every morning,” it holds true most of the time. But when there are clouds on the horizon, your assertion doesn’t hold true. If you update your assumptions to include that the sky is clear, your assertion becomes true again.

Assertions in software are similar. They assert that some expression must hold true, and they fail loudly if that assertion fails. In Python, assertions can be written using the assert keyword. When assertions fail, they raise an AssertionError.

You can test calculate_mean with assertions by adding assert in front of your comparisons. A passing assertion will have no output; a failing one will show you the traceback for the AssertionError:

>>> assert 10.0 == calculate_mean([0, 10, 20])
>>> assert 1.0 == calculate_mean([1000, 3500, 7_000_000])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

This behavior is what many Python testing tools are built on. Using the recipe for a functional test (set up input, identify expected output, obtain actual output, and compare), these tools help you do the comparison and provide valuable context when your assertions fail. Read on to see how two of the most widely used testing tools in Python handle making assertions about your code.

5.4. Unit testing with unittest

Unittest is Python’s built-in testing framework. Although it’s called unittest, it can also be used for integration testing. Unittest provides features for making assertions about your code, and also the tool for running the tests. In this section, you’ll see how tests are organized and how to run them, and you’ll finally get some practice writing real tests. Let’s get to it!

5.4.1. Test organization with unittest

Unittest provides a set of features for performing assertions. You previously saw how to write raw assert statements to test code, but unittest provides a TestCase class with custom assertion methods for more understandable testing output. Your tests will inherit from this class and use methods to make assertions.

I encourage you to use these test classes as a strategy for grouping your tests. The classes are flexible—you can use them to group any tests you like. If you have many tests for a class, putting them in their own TestCase is a good idea. If you have many tests for a single method within a class, you could even create a TestCase only for those. You learned to use cohesion, namespacing, and separation of concerns for code, and you can apply the same ideas to tests.

5.4.2. Running tests with unittest

Unittest provides a test runner that you can use by typing python -m unittest in your terminal. When you run the unittest test runner, it will look for tests by

  1. Looking in the current directory (and any subdirectories) for modules whose names start with test (the default discovery pattern is test*.py)
  2. Looking in those modules for classes that inherit from unittest.TestCase
  3. Looking in those classes for methods that start with test_

Some people like to put their tests as close to the relevant code as possible, making it easier to find tests for a particular module of interest. Others like to put all their tests in a tests/ directory that lives at the root of their project to keep them separate from the code. I’ve done it both ways and don’t have a strong preference myself. Do what works for you, your team, or the community you’re writing software with.
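As an example, the separate tests/ directory approach might look like the following layout:

ecommerce/
    product.py
    cart.py
    tests/
        __init__.py       (needed so unittest discovery can import the tests)
        test_product.py
        test_cart.py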

5.4.3. Writing your first test with unittest

Now that you’ve got an idea of how unittest does things, you need something to test. The following listing lays out a class you’ll use to get some testing practice.

Listing 5.1. A product class for an e-commerce system
class Product:
    def __init__(self, name, size, color):   1
        self.name = name
        self.size = size
        self.color = color

    def transform_name_for_sku(self):
        return self.name.upper()

    def transform_color_for_sku(self):
        return self.color.upper()

    def generate_sku(self):                  2
        """
        Generates a SKU for this product.

        Example:
            >>> small_black_shoes = Product('shoes', 'S', 'black')
            >>> small_black_shoes.generate_sku()
            'SHOES-S-BLACK'
        """
        name = self.transform_name_for_sku()
        color = self.transform_color_for_sku()
        return f'{name}-{self.size}-{color}'

  • 1 The product attributes are specified when a Product instance is created.
  • 2 A SKU uniquely identifies the product attributes.

This class represents a product for purchase in an e-commerce system. A product has a name and options for size and color, and each combination of these attributes produces a stock keeping unit (SKU). A SKU is a unique, internal ID used by companies for pricing and inventory that often uses an all-uppercase format. Place this class definition in a product.py module.

After you’ve created your product module, you’re ready to start writing your first test. Create a test_product.py module in the same directory as product.py. Start by importing unittest and creating an empty ProductTestCase class that inherits from the base TestCase class:

import unittest

class ProductTestCase(unittest.TestCase):
    pass

If you run python -m unittest at this point, with only product.py and your empty test case in test_product.py, it will say that it ran no tests:

$ python -m unittest

----------------------------------------------------------------------
Ran 0 tests in 0.000s

OK

It likely found the test_product module and the ProductTestCase class, but you haven’t written any tests there yet. You can check this by adding an empty test method to the class:

import unittest

class ProductTestCase(unittest.TestCase):
    def test_working(self):
        pass

Try running the test runner again; you should see that it ran one test this time:

$ python -m unittest
.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK

Now you’re ready for the real magic. Remember the anatomy of a functional test:

  1. Set up the inputs.
  2. Identify the expected output.
  3. Obtain the actual output.
  4. Compare the expected and actual outputs.

If you want to test the transform_name_for_sku method from the Product class, this recipe becomes

  1. Create an instance of Product with a name, size, and color.
  2. Observe that transform_name_for_sku returns name.upper(); the expected result is the name in uppercase.
  3. Call the Product instance’s transform_name_for_sku method and save it in a variable.
  4. Compare the expected result to the saved actual result.

You can write the first three steps using regular code for creating a Product instance and getting the value of transform_name_for_sku. Using an assert statement for the fourth step would work, but AssertionError doesn’t provide much information in its traceback by default. This is where the custom assertion methods in unittest come into play. The most common one to use for comparing two values is assertEqual, which accepts expected and actual values as arguments. It raises an AssertionError and provides additional information showing the difference between the two values if they aren’t equal. This added context can help you find issues more easily.

Here’s what the test might look like on a first pass:

import unittest

from product import Product


class ProductTestCase(unittest.TestCase):
    def test_transform_name_for_sku(self):
        small_black_shoes = Product('shoes', 'S', 'black')            1
        expected_value = 'SHOES'                                      2
        actual_value = small_black_shoes.transform_name_for_sku()     3
        self.assertEqual(expected_value, actual_value)                4

  • 1 Prepares the setup for transform_name_for_sku: the product with its attributes
  • 2 States the expected result for transform_name_for_sku with the given inputs
  • 3 Obtains the actual result of transform_name_for_sku for comparison
  • 4 Uses the special equality assertion method to compare two values

Running the test runner now should still show Ran 1 test, and if the test passes (it should), you won’t see much additional output.

It’s a good idea to see your tests fail to verify that they’ll actually catch a problem with your code if one arises. Change the expected value 'SHOES' to 'SHOEZ' and run the test again. Now, unittest will raise an AssertionError stating that 'SHOEZ' != 'SHOES':

$ python -m unittest
F
======================================================================
FAIL: test_transform_name_for_sku (test_product.ProductTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/dhillard/test/test_product.py", line 11, in
 test_transform_name_for_sku
    self.assertEqual(expected_value, actual_value)
AssertionError: 'SHOEZ' != 'SHOES'
- SHOEZ
?     ^
+ SHOES
?     ^

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)

Confident that the test is keeping an eye on your code, you can change it back to the appropriate value and move on to another test.

5.4.4. Writing your first integration test with unittest

Now that you’ve seen what units are and how they can be tested, it’s time to look at how the integration of multiple units can be tested. Unit tests are meant to examine the behavior of small pieces of software in isolation, so without integration tests it’s difficult to say if these small pieces work together to produce something useful as a whole (see figure 5.5).

Figure 5.5. Unit tests and integration tests

Now that you can manage products in your inventory with a SKU system, people should be able to start buying them. A new ShoppingCart class with the ability to add and remove products would be a good first step. The cart stores products as a dictionary that looks like this:

{
    'SHOES-S-BLACK': {  1
        'quantity': 2,  2
        ...
    },
    'SHOES-M-BLUE': {
        'quantity': 1,
        ...
    },
}

  • 1 The keys are the product SKUs.
  • 2 A nested dictionary of metadata about the cart item, like quantity

The ShoppingCart class contains methods to add and remove a product by managing the data in this dictionary.

from collections import defaultdict


class ShoppingCart:
    def __init__(self):
        self.products = defaultdict(lambda: defaultdict(int))            1

    def add_product(self, product, quantity=1):                          2
        self.products[product.generate_sku()]['quantity'] += quantity

    def remove_product(self, product, quantity=1):                       3
        sku = product.generate_sku()
        self.products[sku]['quantity'] -= quantity
        if self.products[sku]['quantity'] == 0:
            del self.products[sku]

  • 1 Using defaultdict simplifies the logic for checking if a product is already in the cart dictionary.
  • 2 Adds quantity of a product to the cart
  • 3 Removes quantity of a product from the cart

The ShoppingCart behavior now presents a couple of integration points that should be tested:

  • The cart relies on (integrates with) the Product instance’s generate_sku method.
  • Adding and removing products must work in tandem; a product that’s been added must also be able to be removed.

Testing these integrations will look a lot like unit testing; the difference is in how much of your software is executed during the test. Where a unit test generally only executes the code in one method and asserts that the output is as expected, an integration test may run many methods and make assertions about a few things along the way.

In the case of ShoppingCart, a useful test would be to initialize the cart, add a product, remove it, and make sure the cart is empty, as shown in the following listing.

Listing 5.2. An integration test for a ShoppingCart class
import unittest

from cart import ShoppingCart
from product import Product


class ShoppingCartTestCase(unittest.TestCase):      1
    def test_add_and_remove_product(self):
        cart = ShoppingCart()                       2
        product = Product('shoes', 'S', 'blue')     3

        cart.add_product(product)                   4
        cart.remove_product(product)                5

        self.assertDictEqual({}, cart.products)     6

  • 1 The test setup is comparable to the earlier unit test.
  • 2 Creates a cart to add products to
  • 3 Creates some small blue shoes
  • 4 Adds shoes to the cart
  • 5 Removes shoes from the cart
  • 6 The cart should be empty!

This test calls the cart’s __init__ method, the product’s generate_sku method, and the cart’s add_product and remove_product methods. There’s a lot going on. As you might expect, integration tests are often quite a bit longer as a result.

5.4.5. Test doubles

You’ll often have to write tests for code that interacts with another system, whether it’s a database or an API call. These calls might do destructive things to real data, so calling them for real when you run your tests might have bad consequences. They may also be slow, with the effect being magnified if your test suite executes that area of code multiple times. These other systems may not even be under your control. It often makes sense to imitate them instead of using the real thing.

There are several subtly different ways to imitate these systems with test doubles:

  • Faking—Using a system that behaves a lot like the real one, but avoids expensive or destructive actions
  • Stubbing—Using a predetermined value as a response instead of getting one from a live system
  • Mocking—Using a system with the same interface as the real one, but that also records interactions for later inspection and assertions

Faking and stubbing in Python involve writing up your own imitations as functions or classes and telling your code to use them during test execution. Mocking, on the other hand, is most commonly done using the unittest.mock module.
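As a small illustration of stubbing, suppose the code accepts its rate lookup as an argument (the rate_getter parameter and function names here are hypothetical, introduced for the example). A test can then pass a stub that returns a predetermined value:

def add_sales_tax(original_amount, country, region, rate_getter):
    return original_amount * rate_getter(country, region)


def stub_rate_getter(country, region):
    return 1.06   # Predetermined response; no live system involved


def test_add_sales_tax():
    assert add_sales_tax(5, 'USA', 'MI', stub_rate_getter) == 5 * 1.06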

Suppose your code calls an API endpoint to get some tax information for your product sales. You don’t want to really use this endpoint in your test because you’ve seen it take a few seconds to respond. On top of that, it returns dynamic data, so you can’t be sure what value you should make assertions about in the test. If the code looks like this:

from urllib.request import urlopen


def add_sales_tax(original_amount, country, region):
    sales_tax_rate = urlopen(
        f'https://tax-api.com/{country}/{region}'
    ).read().decode()
    return original_amount * float(sales_tax_rate)

a unit test with mocking could look like this:

import io
import unittest
from unittest import mock

from tax import add_sales_tax


class SalesTaxTestCase(unittest.TestCase):
    @mock.patch('tax.urlopen')   1
    def test_get_sales_tax_returns_proper_value_from_api(
            self,
            mock_urlopen  2
    ):
        test_tax_rate = 1.06
        mock_urlopen.return_value = io.BytesIO(  3
            str(test_tax_rate).encode('utf-8')
        )

        self.assertEqual(  4
            5 * test_tax_rate,
            add_sales_tax(5, 'USA', 'MI')
        )

  • 1 The mock.patch decorator mocks the object or method specified.
  • 2 The test function receives the mocked object or method.
  • 3 The mocked urlopen call will now return the mocked response with the expected test tax rate.
  • 4 Asserts that the add_sales_tax method calculates the new value from the tax rate returned by the API

Testing in this way allows you to declare, “The code I control behaves in this way given these assumptions,” where the assumptions are created using test doubles. If you have fair confidence that urllib works as it says it does, you can use test doubles to avoid coupling yourself to it. If you need to use a different HTTP client library in the future, or need to change which API you get your tax information from, the test will not have to change.

It’s possible to overuse test doubles. I’m most certainly guilty of this from time to time. Usually you’ll want to use test doubles to avoid the slow, expensive, or destructive behaviors mentioned before, but sometimes it’s tempting to mock your own code to perfectly isolate the unit you’re trying to test. This can lead to brittle tests that break often when you change your code, in part because they mirror the structure of the implementation too closely. Change the implementation, and you have to change your tests.

Try to write tests that verify what you need but are flexible regarding changes in the underlying implementation. This is loose coupling, once again. Loose coupling applies to test code as much as implementation code.

5.4.6. Try it out

How would you test the other methods in the Product and ShoppingCart classes? Keeping in mind the recipe for functional tests, try adding additional tests for the remaining methods. A thorough test suite will contain assertions for each method and for each different outcome you might expect from the method. You might even find a subtle bug! As a hint, try testing what happens when you remove more things from the cart than it contains.

Some of the values you need to test are dictionaries. Unittest has a special method, assertDictEqual, that provides useful output specific to dictionaries when the test fails.

For short tests like the one you wrote already, you can skip saving the expected and actual values as variables. Enter the expressions directly as arguments to assertEqual:

def test_transform_name_for_sku(self):
    small_black_shoes = Product('shoes', 'S', 'black')
    self.assertEqual(
        'SHOES',
        small_black_shoes.transform_name_for_sku(),
    )

When you’ve given it a try, come back and check the following listing to see how you did. Remember to use the unittest test runner after writing or changing a test to see if the test continues to pass.

Listing 5.3. A test suite for Product and ShoppingCart
class ProductTestCase(unittest.TestCase):
    def test_transform_name_for_sku(self):
        small_black_shoes = Product('shoes', 'S', 'black')
        self.assertEqual(
            'SHOES',
            small_black_shoes.transform_name_for_sku(),
        )

    def test_transform_color_for_sku(self):
        small_black_shoes = Product('shoes', 'S', 'black')
        self.assertEqual(
            'BLACK',
            small_black_shoes.transform_color_for_sku(),
        )

    def test_generate_sku(self):
        small_black_shoes = Product('shoes', 'S', 'black')
        self.assertEqual(
            'SHOES-S-BLACK',
            small_black_shoes.generate_sku(),
        )


class ShoppingCartTestCase(unittest.TestCase):
    def test_cart_initially_empty(self):
        cart = ShoppingCart()
        self.assertDictEqual({}, cart.products)

    def test_add_product(self):
        cart = ShoppingCart()
        product = Product('shoes', 'S', 'blue')

        cart.add_product(product)

        self.assertDictEqual(
            {'SHOES-S-BLUE': {'quantity': 1}},
            cart.products,
        )

    def test_add_two_of_a_product(self):
        cart = ShoppingCart()
        product = Product('shoes', 'S', 'blue')

        cart.add_product(product, quantity=2)
        self.assertDictEqual(
            {'SHOES-S-BLUE': {'quantity': 2}},
            cart.products,
        )

    def test_add_two_different_products(self):
        cart = ShoppingCart()
        product_one = Product('shoes', 'S', 'blue')
        product_two = Product('shirt', 'M', 'gray')

        cart.add_product(product_one)
        cart.add_product(product_two)

        self.assertDictEqual(
            {
                'SHOES-S-BLUE': {'quantity': 1},
                'SHIRT-M-GRAY': {'quantity': 1}
            },
            cart.products
        )

    def test_add_and_remove_product(self):
        cart = ShoppingCart()
        product = Product('shoes', 'S', 'blue')

        cart.add_product(product)
        cart.remove_product(product)

        self.assertDictEqual({}, cart.products)

    def test_remove_too_many_products(self):
        cart = ShoppingCart()
        product = Product('shoes', 'S', 'blue')

        cart.add_product(product)
        cart.remove_product(product, quantity=2)

        self.assertDictEqual({}, cart.products)

You can fix the bug in the shopping cart by updating remove_product to delete a product from the cart if its quantity is less than or equal to 0:

        if self.products[sku]['quantity'] <= 0:
            del self.products[sku]

5.4.7. Writing interesting tests

Good tests will use inputs that affect the behavior of the method being tested. SKUs are typically all uppercase, and they usually don’t contain spaces either—only letters, numbers, and dashes. But what if the product name contains a space? You’ll want to remove the spaces before the name gets put in the SKU. A tank top SKU should start with 'TANKTOP', for example.

This is a new requirement, so you can write a new test that describes how the code should behave.

def test_transform_name_for_sku(self):
    medium_pink_tank_top = Product('tank top', 'M', 'pink')
    self.assertEqual(
        'TANKTOP',
        medium_pink_tank_top.transform_name_for_sku(),
    )

This test fails because the current code returns 'TANK TOP'. That’s okay because you haven’t built support for products with spaces in the name yet. Seeing this test fail for the expected reason means that when you write the code to correctly handle spaces, the test should pass.
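One way to make the new test pass (a sketch; your implementation may differ) is to strip spaces before uppercasing the name:

    def transform_name_for_sku(self):
        return self.name.replace(' ', '').upper()   # 'tank top' -> 'TANKTOP'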

Thinking up interesting tests like this is valuable because it surfaces open questions earlier in the development process. Then you can survey other stakeholders and ask, “What are all the possible product name formats we might need to support?” If their answer gives you new information, you can incorporate it into the code and the tests to deliver better software.

Now that you understand the benefits of unittest, it’s time to learn about pytest.

5.5. Testing with pytest

Although unittest is a full-featured and mature framework built into Python, it has a few drawbacks. For some, it feels “un-Pythonic” because it uses camelCase instead of snake_case for method names (a relic of its JUnit history). Unittest also requires a fair amount of boilerplate that makes the underlying tests a bit more difficult to comprehend.

Pythonic code

Code is often said to be Pythonic if it uses the features and common style guidelines for the Python language. Pythonic code uses snake_case for variable and method names, list comprehensions instead of simple for loops, and so on.

For those who like succinct, straight-to-the-point tests, pytest is an answer (https://docs.pytest.org/en/latest/getting-started.html). Once you’ve installed pytest, you can get back to the raw assert statements you saw earlier. Pytest performs a bit of hidden magic under the hood to make this work, but it produces a smooth experience.
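You can install pytest with pip:

$ pip install pytest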

Pytest produces more readable output by default, telling you about the system, the number of tests it finds, the result of individual tests, and a summary of the overall test results:

$ pytest
========== test session starts ==========
platform darwin -- Python 3.7.3, pytest-5.0.1, py-1.8.0, pluggy-0.12.0   1
rootdir: /path/to/ecommerce/project
collected 15 items                                        2

test_cart.py ............    [ 80%]                       3
test_product.py ..           [ 93%]
test_tax.py .                [100%]

======= 15 passed in 0.12 seconds =======                 4

  • 1 Information about the system
  • 2 The number of tests pytest discovered
  • 3 The status of each test from each module, with an overall progress indicator
  • 4 A summary of the full test suite results

5.5.1. Test organization with pytest

Pytest does automatic discovery of your tests like unittest does. It will even discover any unittest tests you have lying around. One key difference is that proper pytest test classes are named Test* and don’t need to inherit from a base class (like unittest.TestCase) to work.

The command for running tests with pytest is simpler:

pytest

Because pytest doesn’t require you to inherit from a base class or use any special methods, you don’t strictly need to organize your tests into classes. I still recommend it, though, because it remains a good organizational tool. Pytest will include the test class name in failure output and the like, which can help you understand where the tests live and what they’re about. On the whole, pytest tests can be organized similarly to those for unittest.
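For instance, a module-level function whose name starts with test_ is a perfectly valid pytest test, no class required:

# test_product.py
from product import Product


def test_generate_sku():
    small_black_shoes = Product('shoes', 'S', 'black')
    assert small_black_shoes.generate_sku() == 'SHOES-S-BLACK'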

5.5.2. Converting unittest tests to pytest

Because pytest will discover your existing unittest tests, you can incrementally convert your tests to pytest as you wish (and if you wish, I suppose). For the test suite you’ve written so far, the conversion looks like this:

  • Remove the unittest import from test_product.py.
  • Rename the ProductTestCase class to TestProduct and remove the inheritance from unittest.TestCase.
  • Replace any self.assertEqual(expected, actual) with assert actual == expected.

The test case from earlier looks more like the following under pytest.

Listing 5.4. A test case in pytest
class TestProduct:                                                    1
    def test_transform_name_for_sku(self):
        small_black_shoes = Product('shoes', 'S', 'black')
        assert small_black_shoes.transform_name_for_sku() == 'SHOES'  2

    def test_transform_color_for_sku(self):
        small_black_shoes = Product('shoes', 'S', 'black')
        assert small_black_shoes.transform_color_for_sku() == 'BLACK'

    def test_generate_sku(self):
        small_black_shoes = Product('shoes', 'S', 'black')
        assert small_black_shoes.generate_sku() == 'SHOES-S-BLACK'

  • 1 No need to inherit from any base class
  • 2 self.assertEqual goes away; uses raw assert statements instead

As you can see, pytest leads to shorter and arguably more readable test code. It also provides its own framework of features that make setting up the environment and dependencies for your tests easier. For a great in-depth look at all pytest has to offer, I highly recommend Brian Okken’s book, Python Testing with pytest: Simple, Rapid, Effective, and Scalable (Pragmatic Bookshelf, 2017).
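As a taste of those features, pytest fixtures provide reusable setup to any test that accepts them as an argument. A minimal sketch:

import pytest

from product import Product


@pytest.fixture
def small_black_shoes():
    return Product('shoes', 'S', 'black')   # Fresh instance per test


def test_transform_color_for_sku(small_black_shoes):
    assert small_black_shoes.transform_color_for_sku() == 'BLACK'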

You now have some unit and integration testing under your belt; read on to learn briefly about non-functional testing.

5.6. Beyond functional testing

You spent the majority of this chapter learning about functional tests. Making code work and making it right both come before making it fast, so functional testing precedes testing the speed of your code. Once you’ve made sure the code is working, making sure it’s performant is a good next step.

5.6.1. Performance testing

Performance testing tells you how the changes you make affect things like memory, CPU, and disk usage. In chapter 4, you learned about some of the tools available for performance testing the units of your code. You used the timeit module, which is the tool I reach for when weighing options for specific lines of code and functions. These measurements usually aren’t automated; they’re meant for ad hoc comparisons, and they’re quick to write when you’re trying to see which of two implementations is faster.
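As a refresher, a quick ad hoc comparison with timeit might look like this (the two statements being timed are illustrative):

import timeit

# Time two ways of building a list of squares; lower is faster.
loop_time = timeit.timeit(
    'result = []\nfor n in range(100): result.append(n * n)',
    number=10_000,
)
comprehension_time = timeit.timeit(
    '[n * n for n in range(100)]',
    number=10_000,
)
print(f'loop: {loop_time:.3f}s  comprehension: {comprehension_time:.3f}s')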

As you develop larger applications with a number of critical operations that need to remain efficient, it may behoove you to integrate some automated performance testing into your process. Automated performance testing looks quite like regression testing in practice; if you deploy a change and notice that the application begins consuming 20% more memory, it’s a good sign that you should investigate the change. It’s also great for celebrating the moments when you fix a slow piece of code and can watch your app speed up.

Unlike unit testing, which produces binary pass/fail results, performance testing is more qualitative. If you see your application trending slower over time (or a sudden jump after a deployment), that’s something to look into. The nature of this kind of testing makes it a bit more difficult to automate and monitor, but solutions are out there.

5.6.2. Load testing

Load testing is a type of performance testing, but it gives you information about how far you can push your application until it falls over. Maybe it consumes too much CPU, memory, or network bandwidth, or it gets too slow for users to use it reliably. Whatever the case, load testing provides metrics you can use to fine-tune the resources you give your application. In more substantial cases, it may motivate you to change the design of part of the system so it’s more efficient.

Load testing entails more infrastructure and strategy than something like unit testing. To get a clear picture of performance under load, you need to mimic your production environment closely in both architecture and user behavior. Due to the complexity of application-level load testing, in my mind it sits somewhere above integration testing in the testing pyramid (figure 5.6).

Figure 5.6. Load testing in the testing pyramid

Load testing helps you performance-test your applications in scenarios that more closely mimic real-world user behavior.

5.7. Test-driven development: A primer

A whole school of thought exists around driving development using unit and integration testing in software. The general name for these practices is test-driven development (TDD). TDD can help you commit to testing up front, so you reap the benefits of testing that we’ve discussed so far.

5.7.1. It’s a mindset

For me, the real benefit of TDD is the mindset it puts me in. The stereotype of a quality assurance engineer is that they can always find something in your code to break. This is generally said with some disdain, but I think it’s remarkable. Enumerating all the ways a system can blow up is both useful and impressive.

Netflix takes this to an extreme with the idea of chaos engineering. They actively think about the ways systems can fail, but they also introduce some amount of unpredictable failure.[2] This leads to innovative ways of responding to failure.

[2] To learn more about Netflix’s advances in the area of chaos engineering, check out their collection of blog posts on the subject: https://medium.com/netflix-techblog/tagged/chaos-engineering.

As you write tests, try to be a chaos engineer. Deliberately try to think of the extremes that your code can endure, and throw them at it. There’s a limit, of course—it doesn’t make sense for all code to respond predictably to all inputs. But in Python, the exception system allows your code to respond in a predictable way to rare or unexpected situations.

5.7.2. It’s a philosophy

TDD has a subculture around it, and the only opinions stronger than how to do it correctly are how not to do it correctly. It’s an art form that produces as many styles and critics as any other movement. I’ve found it useful to learn how different teams handle the testing aspects of their process; once you do this, you can identify the pieces you like and incorporate them into your own work.

Some TDD literature advocates making sure every line of your code is covered by tests. Although it’s good to have strong coverage of the different cases your code can handle, increasing coverage beyond a certain point has diminishing returns. Sometimes covering those last few lines means coupling the tests more tightly to the implementation than is healthy.

If you find that testing some aspect of a function’s behavior is awkward or difficult, try to determine if it’s because the code’s concerns aren’t well separated or if it’s inherently awkward to test. If awkwardness must be incorporated, it’s better for it to be in the tests than the real code. Don’t refactor code only to make testing easier or coverage stronger—do it to make testing easier and to make the code more coherent.

Summary

  • Functional tests make sure code produces the expected output from a given input.
  • Testing saves you time in the long run by catching bugs and making refactoring code easier.
  • Manual testing isn’t scalable and should be used to supplement automated testing.
  • Unittest and pytest are two popular unit and integration testing frameworks for Python.
  • Test-driven development puts the tests first, guiding you to a working implementation based on the requirements.