Testing with py.test

The Python unittest module is very verbose and requires a lot of boilerplate code to set up and initialize tests. It is based on the very popular JUnit testing framework for Java. It even uses the same method names (you may have noticed they don't conform to the PEP-8 naming standard, which recommends using underscores, rather than CamelCase, to separate the words in a method name) and the same test layout. While this is effective for testing in Java, it's not necessarily the best design for Python testing.

Because Python programmers like their code to be elegant and simple, other test frameworks have been developed outside the standard library. Two of the more popular ones are py.test and nose. At the time of writing, the latter does not yet support Python 3, so we'll focus on py.test here.

Since py.test is not part of the standard library, you'll need to download and install it yourself; you can get it from the py.test homepage at http://pytest.org/. The website has comprehensive installation instructions for a variety of interpreters and platforms.

py.test has a substantially different layout from the unittest module. It doesn't require test cases to be classes. Instead, it takes advantage of the fact that functions are objects, and allows any properly named function to behave like a test. Rather than providing a bunch of custom methods for asserting equality, it simply uses the assert statement to verify results. This makes tests more readable and maintainable.

When we run py.test, it starts in the current folder and searches that folder and its subpackages for any modules whose names start with the characters test_. If any functions in such a module also start with test, they will be executed as individual tests.

Further, if there are any classes in the module whose name starts with Test, any methods on that class that start with test_ will also be executed in the test environment.

So let's take the simplest possible unittest example we wrote earlier and port it to py.test:

	def test_int_float():
		assert 1 == 1.0

For the exact same test, we've written two lines of more readable code, in comparison to the six lines we used in our first unittest example.

However, we are not precluded from writing class-based tests. Classes can be useful for grouping related tests together or for tests that need to access related attributes or methods on the class. This example shows an extended class with a passing and a failing test; we'll see that the error output is more comprehensive than that provided by the unittest module:

	class TestNumbers:
		def test_int_float(self):
			assert 1 == 1.0
			
		def test_int_str(self):
			assert 1 == "1"

Notice that the class doesn't have to extend any special objects to be picked up as a test. If we run py.test on this file, the output looks like this:

	============== test session starts ==============
	python: platform linux2 -- Python 3.1.2 -- pytest-1.2.1
	test object 1: class_pytest.py
	
	class_pytest.py .F
	=================== FAILURES ===================
	___________ TestNumbers.test_int_str ____________
	
	self = <class_pytest.TestNumbers object at 0x85b4fac>
	
		def test_int_str(self):
	>       assert 1 == "1"
	E       assert 1 == '1'
	
	class_pytest.py:7: AssertionError
	====== 1 failed, 1 passed in 0.10 seconds =======

The output starts with some useful information about the platform and interpreter. This can be useful when sharing bug reports across disparate systems. The third line tells us the name of the file being tested (if there are multiple test modules picked up, they will all be displayed), followed by the familiar .F we saw in the unittest module; the . indicates a passing test, while the F indicates a failure.

After all tests have run, the error output for each failure is displayed. It presents a summary of local variables (there is only one in this example: the self parameter passed into the function), the source code where the error occurred, and a summary of the error message. In addition, if an exception other than an AssertionError is raised, py.test will present us with a complete traceback, including source code references.

By default, py.test suppresses output from print statements if the test is successful. This is extremely useful for test debugging; if we have a test that is failing, we can add print statements to the test to check the values of specific variables and attributes as the test progresses. If the test fails, the values will be output to help with diagnostics. However, if the test is successful, the output of the print statements will not be displayed, allowing them to easily be ignored. Most importantly, we don't have to "clean up" the output by removing print statements; we can leave them in the tests, and if the tests ever fail again due to changes in our code, the debugging output will be immediately available.
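As a quick illustration, here is a hedged sketch of what such a throwaway diagnostic might look like (the test and its values are invented purely to show the pattern):

	def test_mean_diagnostics():
		values = [1, 2, 2, 3, 3, 4]
		total = sum(values)
		# Temporary diagnostics: hidden while the test passes, shown in the
		# failure report (or whenever py.test is run with the -s flag).
		print("total:", total, "count:", len(values))
		assert total / len(values) == 2.5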

One way to do setup and cleanup

py.test supports setup and teardown methods similar to those used in unittest, but it provides even more flexibility. We'll discuss these briefly, since they are familiar, but they are not used as extensively as in the unittest module, as py.test provides us with a powerful funcargs facility, which we'll discuss in the next section.

If we are writing class-based tests, we can use two methods called setup_method and teardown_method in basically the same way that setUp and tearDown are called in unittest. They are called before and after each method in the class to do any setup and cleanup duties. There is one difference from the unittest methods though. Both methods accept an argument: the function object representing the method being called.

In addition, py.test provides other setup and teardown functions to give us more control over when setup and cleanup code is executed. The setup_class and teardown_class methods are expected to be class methods; they accept a single argument (there is no self argument) representing the class in question.

Finally, we have the setup_module and teardown_module methods, which are run immediately before and after all tests (in functions or classes) in that module. These can be useful for "one time" setup, such as creating a socket or database connection that will be used by all tests in the module. Be careful with this one, as it can accidentally introduce dependencies between tests if the object being set up stores state.

That short description probably doesn't do a great job of explaining exactly when these setup and teardown methods are called, so let's look at an example that tells us exactly when it happens:


	def setup_module(module):
		print("setting up MODULE {0}".format(
			module.__name__))
	
	def teardown_module(module):
		print("tearing down MODULE {0}".format(
			module.__name__))

	def test_a_function():
		print("RUNNING TEST FUNCTION")
	
	class BaseTest:
		def setup_class(cls):
			print("setting up CLASS {0}".format(
				cls.__name__))
		
		def teardown_class(cls):
			print("tearing down CLASS {0}
".format(
				cls.__name__))
	
		def setup_method(self, method):
			print("setting up METHOD {0}".format(
				method.__name__))

		def teardown_method(self, method):
			print("tearing down METHOD {0}".format(
				method.__name__))

	class TestClass1(BaseTest):
		def test_method_1(self):
			print("RUNNING METHOD 1-1")

		def test_method_2(self):
			print("RUNNING METHOD 1-2")

	class TestClass2(BaseTest):
		def test_method_1(self):
			print("RUNNING METHOD 2-1")

		def test_method_2(self):
			print("RUNNING METHOD 2-2")

The sole purpose of the BaseTest class is to extract four methods that would be otherwise identical to the test classes and use inheritance to reduce the amount of duplicate code. So, from the point of view of py.test, the two subclasses have not only two test methods each, but also two setup and two teardown methods (one at the class level, one at the method level).

If we run these tests using py.test, the output shows us when the various functions are called in relation to the tests themselves. We also have to disable output capturing so that the print output is actually displayed; this is done by passing the -s (or --capture=no) flag to py.test:

	
	py.test setup_teardown.py -s
	setup_teardown.py
	setting up MODULE setup_teardown
	RUNNING TEST FUNCTION
	.setting up CLASS TestClass1
	setting up METHOD test_method_1
	RUNNING METHOD 1-1
	.tearing down METHOD test_method_1
	setting up METHOD test_method_2
	RUNNING METHOD 1-2
	.tearing down METHOD test_method_2
	tearing down CLASS TestClass1
	setting up CLASS TestClass2
	setting up METHOD test_method_1
	RUNNING METHOD 2-1
	.tearing down METHOD test_method_1
	setting up METHOD test_method_2
	RUNNING METHOD 2-2
	.tearing down METHOD test_method_2
	tearing down CLASS TestClass2
	
	tearing down MODULE setup_teardown

The setup and teardown methods for the module are executed at the beginning and end of the session. Then the lone module-level test function we added is run. Next, the setup method for the first class is executed, followed by the two tests for that class. The tests, however, are each individually wrapped in separate setup_method and teardown_method calls. After the methods have been executed, the class teardown method is called. The same sequence happens for the second class, before the teardown_module method is finally called, exactly once.

A completely different way to set up variables

One of the most common uses for the various setup and teardown functions is to ensure certain class or module variables are available with a known value before each test method is run.

py.test offers a completely different way to do this using what are known as funcargs, short for function arguments. Funcargs are basically named variables that are set up ahead of time in a test configuration file. This allows us to separate configuration from the execution of tests, and allows the funcargs to be used across multiple classes and modules.

To use them, we simply add parameters to our test functions. The names of the parameters are used to look up specific arguments in specially named factory functions. For example, if we wanted to test the StatsList class we used earlier while demonstrating unittest, we would again want to repeatedly test a list of valid integers. Instead of using setup methods, we can write our tests like so:

	from stats import StatsList
	
	def pytest_funcarg__valid_stats(request):
		return StatsList([1,2,2,3,3,4])
	
	
	def test_mean(valid_stats):
		assert valid_stats.mean() == 2.5

	def test_median(valid_stats):
		assert valid_stats.median() == 2.5
		valid_stats.append(4)
		assert valid_stats.median() == 3
	
	def test_mode(valid_stats):
		assert valid_stats.mode() == [2,3]
		valid_stats.remove(2)
		assert valid_stats.mode() == [3]

Each of the three test functions accepts a parameter named valid_stats; this parameter is created afresh by calling the pytest_funcarg__valid_stats function defined at the top of the file. It can also be defined in a file called conftest.py if the funcarg is needed by multiple modules. The conftest.py file is parsed by py.test to load any "global" test configuration; it is a sort of catchall for customizing the py.test experience. It's actually normal to put funcargs in that file instead of your test module, in order to completely separate the configuration from the test code.
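For example, moving the factory into conftest.py might look like the following minimal sketch (assuming stats.py sits alongside the tests); the test module then contains nothing but the test functions themselves:

	# conftest.py -- a minimal sketch; assumes stats.py sits next to the tests
	from stats import StatsList
	
	def pytest_funcarg__valid_stats(request):
		# Any test in this directory that names a valid_stats parameter
		# receives a fresh list built by this factory.
		return StatsList([1,2,2,3,3,4])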

As with other py.test features, the name of the factory that returns a funcarg is important; it is simply a function named pytest_funcarg__<valid_identifier>, where <valid_identifier> is a valid variable name that can be used as a parameter in a test function. This function accepts a mysterious request parameter, and returns the object that should be passed as an argument into the individual test functions. The funcarg is created afresh for each call to an individual test function; this allows us, for example, to change the list in one test and know that it will be reset to its original values in the next test.

Funcargs can do a lot more than return simple variables. That request object passed into the funcarg factory gives us some extremely useful methods and attributes to modify the funcarg's behavior. The module, cls, and function attributes allow us to see exactly which test is requesting the funcarg. The config attribute allows us to check command-line arguments and other configuration data. We don't have room to go into detail on this topic, but custom command-line arguments can be used to customize the test experience by running certain tests only if an argument is passed (useful for slow tests that need to be run less often) or supplying connection parameters to a database, file, or hardware device.
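As a rough sketch of that last idea, a funcarg factory can consult request.config and refuse to set up an expensive resource unless a flag was passed on the command line. The --runslow option and the make_database_connection helper here are our own inventions for illustration, registered via the pytest_addoption hook in conftest.py:

	# conftest.py -- a hedged sketch; --runslow and make_database_connection
	# are hypothetical names used only for this example
	import py.test
	
	def pytest_addoption(parser):
		parser.addoption("--runslow", action="store_true",
			help="set up the slow database funcarg")
	
	def pytest_funcarg__database(request):
		# request.config exposes the parsed command-line options
		if not request.config.option.runslow:
			py.test.skip("needs --runslow")
		return make_database_connection()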

More interestingly, the request object provides methods that allow us to do additional cleanup on the funcarg or to reuse it across tests. The former allows us to use funcargs instead of writing custom teardown functions to clean up open files or connections, while the latter can help reduce the time it takes to run a test suite if the setup of a common funcarg is time consuming. This is often used for database connections, which are slow to create and destroy and do not need to be reinitialized after each test (although the database still typically needs to be reset to a known state between tests).

The request.addfinalizer method accepts a callback function that does any cleanup after each test function that uses the funcarg has been called. This can provide the equivalent of a teardown method, allowing us to clean up files, close connections, empty lists, or reset queues. For example, the following code tests the os.mkdir functionality by creating a temporary directory funcarg:

	import tempfile
	import shutil
	import os.path
	
	def pytest_funcarg__temp_dir(request):
		dir = tempfile.mkdtemp()
		print(dir)
	
		def cleanup():
			shutil.rmtree(dir)
		request.addfinalizer(cleanup)
		return dir

	def test_osfiles(temp_dir):
		os.mkdir(os.path.join(temp_dir, 'a'))
		os.mkdir(os.path.join(temp_dir, 'b'))
		dir_contents = os.listdir(temp_dir)
		assert len(dir_contents) == 2
		assert 'a' in dir_contents
		assert 'b' in dir_contents

The funcarg creates a new empty temporary directory for files to be created in. Then it adds a finalizer call to remove that directory (using shutil.rmtree, which recursively removes a directory and anything inside it) after the test has completed. The file system is then left in the same state in which it started.

Then we have the request.cached_setup method, which allows us to create function argument variables that last longer than one test. This is useful when setting up an expensive resource that can be reused by multiple tests, as long as the reuse doesn't break the atomic or unit nature of the tests (one test should not rely on, or be impacted by, a previous one). For example, if we want to test the following echo server, we may want to run only one instance of the server in a separate process, and then have multiple tests connect to that instance.

	import socket
	
	s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
	s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
	s.bind(('localhost',1028))
	s.listen(1)
	
	while True:
		client, address = s.accept()
		data = client.recv(1024)
		client.send(data)
		client.close()

All this code does is listen on a specific port and wait for input from a client socket. When it receives input, it just sends the same value back. To test this, we can start the server in a separate process and cache the result for use in multiple tests. Here's how the test code might look:

	import subprocess
	import socket
	import time

	def pytest_funcarg__echoserver(request):
		def setup():
			p = subprocess.Popen(
					['python3', 'echo_server.py'])
			time.sleep(1)
			return p
			
		def cleanup(p):
			p.terminate()

		return request.cached_setup(
				setup=setup,
				teardown=cleanup,
				scope="session")
				
	def pytest_funcarg__clientsocket(request):
		s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
		s.connect(('localhost', 1028))
		request.addfinalizer(lambda: s.close())
		return s

	def test_echo(echoserver, clientsocket):
		clientsocket.send(b"abc")
		assert clientsocket.recv(3) == b'abc'
		
	def test_echo2(echoserver, clientsocket):
		clientsocket.send(b"def")
		assert clientsocket.recv(3) == b'def'

We've created two funcargs here. The first runs the echo server in a separate process, and returns the process object. The second instantiates a new socket object for each test, and closes it when the test has completed, using addfinalizer. The first funcarg is the one we're currently interested in. It looks much like a traditional unit test setup and teardown. We create a setup function that accepts no parameters and returns the correct argument, in this case, a process object that is actually ignored by the tests, since they only care that the server is running. Then we create a cleanup function (the name of the function is arbitrary since it's just an object we pass into another function), which accepts a single argument: the argument returned by setup. This cleanup code simply terminates the process.

Instead of returning a funcarg directly, the parent function returns the results of a call to request.cached_setup. It accepts two arguments for the setup and teardown functions (which we just created), and a scope argument. This last argument should be one of the three strings "function", "module", or "session"; it determines just how long the argument will be cached. We set it to "session" in this example, so it is cached for the duration of the entire py.test run. The process will not be terminated or restarted until all tests have run. The "module" scope, of course, caches it only for tests in that module, and the "function" scope treats the object more like a normal funcarg, in that it is reset after each test function is run.

Test skipping with py.test

As with the unittest module, it is frequently necessary to skip tests in py.test, for a variety of reasons: the code being tested hasn't been written yet, the test only runs on certain interpreters or operating systems, or the test is time consuming and should only be run under certain circumstances.

We can skip tests at any point in our code using the py.test.skip function. It accepts a single argument: a string describing why it has been skipped. This function can be called anywhere; if we call it inside a test function, the test will be skipped. If we call it in a module, all the tests in that module will be skipped. If we call it inside a funcarg function, all tests that call that funcarg will be skipped.

Of course, in all these locations, it is often desirable to skip tests only if certain conditions are or are not met. Since we can execute the skip function at any place in Python code, we can execute it inside an if statement. So we may write a test that looks like this:

	import sys
	import py.test
	
	def test_simple_skip():
		if sys.platform != "fakeos":
				py.test.skip("Test works only on fakeOS")
		
		fakeos.do_something_fake()
		assert fakeos.did_not_happen

That's some pretty silly code, really. There is no platform named fakeos, so this test will skip on all operating systems. It shows how we can skip conditionally, and since the if statement can check any valid conditional, we have a lot of power over when tests are skipped. Often, we check sys.version_info for the Python interpreter version, sys.platform for the operating system, or some_library.__version__ to see whether we have a recent enough version of a given API.

Since skipping an individual test method or function based on a certain conditional is one of the most common uses of test skipping, py.test provides a convenience decorator that allows us to do this in one line. The decorator accepts a single string, which can contain any executable Python code that evaluates to a Boolean value. For example, the following test will only run on Python 3 or higher:

	import py.test
	
	@py.test.mark.skipif("sys.version_info <= (3,0)")
	def test_python3():
		assert b"hello".decode() == "hello"

The py.test.mark.xfail decorator behaves similarly, except that it marks a test as expected to fail, similar to unittest.expectedFailure(). If the test is successful, it will be recorded as a failure; if it fails, it will be reported as expected behavior. In the case of xfail, the conditional argument is optional; if it is not supplied, the test will be marked as expected to fail under all conditions.
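For example, a minimal sketch of an unconditional expected failure might look like this (the floating-point assertion is our own choice, picked only because it reliably fails):

	import py.test
	
	# No conditional supplied, so this test is expected to fail everywhere;
	# it documents a known limitation rather than hiding it.
	@py.test.mark.xfail
	def test_exact_float_sum():
		# 0.1 + 0.2 is not exactly 0.3 in binary floating point, so the
		# assertion fails and py.test reports it as an expected failure.
		assert 0.1 + 0.2 == 0.3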

py.test extras

py.test is an incredibly powerful library, and it can do much, much more than the basics we've discussed here. We haven't even started on its distributed testing framework (which allows tests to be run across a network of different platforms and interpreter versions), its numerous built-in or third-party plugins, how easy it is to write our own plugins, or the extensive customization and configuration architecture the framework supplies. You'll have to read the documentation at http://pytest.org/ for all the juicy details.

However, before we leave for the day, we should discuss some of the more useful command-line arguments built into py.test. As with most command-line applications, we can get a list of the available command-line arguments by running the command py.test --help. However, unlike many programs, the available command-line options depend on what py.test plugins are installed and on whether we've written any arguments of our own into the conftest.py for the project.

First, we'll look at a couple of arguments that help us with debugging tests. If we have a large test suite with many tests failing (because we've done invasive code changes, such as porting the project from Python 2 to Python 3), the py.test output can quickly get away from us. Passing the -x or --exitfirst command-line argument to py.test forces the test runner to exit after the first failure. We can then fix whatever problems are causing that test to fail before running py.test again and checking out the next failure.

The --pdb argument is similar, except that instead of exiting after a test fails, it drops us into the Python debugger (pdb). If you know how to use the debugger, this feature can allow you to quickly introspect variables or step through broken code.

py.test also supports an interesting --looponfail (or -f) argument, although it's only available if the py.test xdist plugin is installed. This plugin is available from http://pypi.python.org/pypi/pytest-xdist. If it's installed and we pass the --looponfail option to py.test, the failing tests will automatically be rerun whenever we save a change to a file. This means that we can wait for a test to fail, then edit the test or fix the broken code; when we save the file, the test will automatically run again to tell us whether our fix was successful. It's basically like using the --exitfirst argument repeatedly as we fix one test at a time, but it automates the boring restarting bits!

The most important of the py.test arguments is the -k argument, which accepts a keyword to search for tests. It is used to run specific tests whose full name (including package, module, class, and test name) contains the given keyword. For example, if we have the following structure:

	package: something/
		module: test_something.py
			class: TestSomething
				method: test_first
				method: test_second

We can run py.test -k test_first or even just py.test -k first to run just the one test_first method. Or, if there are other methods with that name, we can run py.test -k TestSomething.test_first or even py.test -k something.test_something.TestSomething.test_first. py.test first collects the complete test name into a dot-separated string, and then checks whether that string contains the requested keyword.
