Chapter 12. Testing

Like visits to the dentist, thorough testing of any program is something that you should be doing if you want to avoid the pain of having to trace a problem that you thought you'd taken care of. This lesson is one that normally takes a programmer many years to learn, and to be honest, you're still going to be working on it for many years. However, the one thing of the utmost importance is that testing must be organized; to be most effective, you must start writing your programs knowing that they will be tested as you go along, and plan around having the time to write and confirm your test cases.

Fortunately, Python offers an excellent facility for organizing your testing called PyUnit. It is a Python port of the Java JUnit package, so if you've worked with JUnit you're already on firm ground when testing in Python — but if not, don't worry.

In this chapter you learn:

The concept and use of assertions
The basic concepts of unit testing and test suites
A few simple example tests to show you how to organize a test suite
Thorough testing of the search utility from Chapter 11

The beauty of PyUnit is that you can set up testing early in the software development life cycle, and you can run it as often as needed while you're working. By doing this, you can catch errors early on, before they're painful to rework — let alone before anybody else sees them. You can also set up test cases before you write code, so that as you write, you can be sure that your results match what you expect! Define your test cases before you even start coding, and you'll never find yourself fixing a bug only to discover that your changes have spiraled out of control and cost you days of work.

Note that PyUnit is not the only framework available for testing your Python programs. There are literally dozens of others out there. At the time of this writing, the vast majority of those have not been updated to work with Python 3.1, but they are definitely worth a look once they get updated.

Assertions

An assertion in Python is in practice similar to an assertion in day-to-day language. When you speak and you make an assertion, you have said something that isn't necessarily proven but that you believe to be true. Of course, if you are trying to make a point, and the assertion you made is incorrect, your entire argument falls apart.

In Python, an assertion is a similar concept. Assertions are statements that you place in your code while developing it to test the validity of that code. If the statement doesn't turn out to be true, an AssertionError is raised, and the program stops if the error isn't caught. (In general, AssertionErrors shouldn't be caught, because they should be taken as a warning that you didn't think something through correctly!)

Assertions enable you to think of your code in a series of testable cases. That way, you can make sure that while you develop, you can make tests along the lines of "this value is not None" or "this object is a String" or "this number is greater than zero." All of these statements are useful while developing to catch errors in terms of how you think about the program.

Try It Out: Using Assert

Creating a set of simple cases, you can see how the assert language feature works:

# Demonstrate the use of assert()
large = 1000
string = "This is a string"
float = 1.0
broken_int = "This should have been an int"

assert large > 500
assert type(string) == type("")
assert type(float) != type(1)
assert type(broken_int) == type(4)

Try running the preceding with python -i.

How It Works

The output from this simple test case looks like this:

Traceback (most recent call last):
  File "<pyshell#8>", line 1, in <module>
    assert type(broken_int)==type(4)
AssertionError

You can see from this stack trace that this simply raises the error. assert is implemented very simply. If a special internal variable called __debug__ is True, assertions are checked; and if any assertion doesn't succeed, an AssertionError is raised. Because assert is effectively an if statement that raises an exception when there's a problem, you are allowed to specify a custom message, just as you would with raise. You should experiment by replacing the last assertion with this code and running it:

try:
    assert type(broken_int)==type(4), "broken_int is broken"
except AssertionError:
    print("Handle the error here.")
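
To make the "if statement plus raise" description concrete, here is a minimal sketch of roughly what an assert statement does. The checked_assert function is a hypothetical helper written only for this illustration; it is not part of Python.

```python
def checked_assert(condition, message=None):
    """Hypothetical helper that mimics `assert condition, message`."""
    if __debug__:                      # False when Python runs with -O
        if not condition:
            if message is None:
                raise AssertionError
            raise AssertionError(message)

broken_int = "This should have been an int"

caught = None
try:
    checked_assert(type(broken_int) == type(4), "broken_int is broken")
except AssertionError as err:
    caught = str(err)
    print("Handle the error here:", caught)
```

One difference is worth keeping in mind: because checked_assert is an ordinary function, its arguments are still evaluated when assertions are disabled, whereas a real assert statement is skipped entirely.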

The variable __debug__, which activates assert, is special; it's immutable after Python has started up, so in order to turn it off you need to specify the -O (a dash, followed by the capital letter O) parameter to Python. -O tells Python to optimize code, which among other things for Python means that it removes assert tests, because it knows that they'll cause the program to slow down (not a lot, but optimization like this is concerned with getting every little bit of performance). -O is intended to be used when a program is deployed, so it removes assertions that are considered to be development-time features.
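
You can watch the effect of -O without changing how you start your own session by launching a second interpreter from a script. The following is a small sketch, assuming a Python recent enough to have subprocess.run with capture_output (3.7 or later); the SNIPPET string and the run_snippet helper are names made up for this example.

```python
import subprocess
import sys

# One line of code: a failing assertion followed by a print call.
SNIPPET = "assert 1 + 1 == 3, 'arithmetic is broken'; print('reached the end')"

def run_snippet(*flags):
    """Run SNIPPET in a fresh interpreter; return (exit code, stdout)."""
    proc = subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout.strip()

normal = run_snippet()          # assertions active: the assert fails first
optimized = run_snippet("-O")   # assertions removed: the print is reached

print("without -O:", normal)
print("with -O:   ", optimized)
```

Without -O the child interpreter exits with an error before printing anything; with -O the assertion is skipped and the print call runs.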

As you can see, assertions are useful. If you even think that you may have made a mistake and want to catch it later in your development cycle, you can put in an assertion to catch yourself, and move on and get other work done until that code is tested. When your code is tested, it can tell you what's going wrong if an assertion fails instead of leaving you to wonder what happened. Moreover, when you deploy and use the -O flag, your assertion won't slow down the program.

Assert lacks a couple of things by itself. First, assert doesn't provide you with a structure in which to run your tests. You have to create a structure, and that means that until you learn what you want from tests, you're liable to make tests that do more to get in your way than confirm that your code is correct.

Second, assertions just stop the program and they provide only an exception. It would be more useful to have a system that would give you summaries, so you can name your tests, add tests, remove tests, and compile many tests into a package that lets you summarize whether or not your program tests out. These ideas and more make up the concepts of unit tests and test suites.

Test Cases and Test Suites

Unit testing revolves around the test case, which is the smallest building block of testable code for any circumstances that you're testing. When you're using PyUnit, a test case is a simple object with at least one test method that runs code; and when it's done, it then compares the results of the test against various assertions that you've made about the results.

Note

PyUnit is the name of the package as named by its authors, but the module you import is called the more generic-sounding name unittest.

Each test case is subclassed from the TestCase class, which is a good, memorable name for it. The simplest test cases you can write just override the runTest method of TestCase and enable you to define a basic test, but you can also define several different test methods within a single test case class, which can enable you to define things that are common to a number of tests, such as setup and cleanup procedures.

A series of test cases run together for a particular project is called a test suite. You can find some simple tools for organizing test suites, but they all share the concept of running a bunch of test cases together and recording what passed, what failed, and how, so you can know where you stand.

Because the simplest possible test suite consists of exactly one test case, and you've already had the simplest possible test case described to you, in the following Try It Out you write a quick testing example so you can see how all this fits together. In addition, just so you really don't have anything to distract you, you test arithmetic, which has no external requirements on the system, the file system, or, really, anything.

Try It Out: Testing Addition

Use your favorite editor to create a file named test1.py in a directory named ch12. Using your programming editor, edit your file to have the following code:

import unittest

class ArithTest (unittest.TestCase):
    def runTest (self):
        """ Test addition and succeed. """
        self.failUnless (1+1==2, 'one plus one fails!')
        self.failIf (1+1 != 2, 'one plus one fails again!')
        self.failUnlessEqual (1+1, 2, 'more trouble with one plus one!')

def suite():
    suite = unittest.TestSuite()
    suite.addTest (ArithTest())
    return suite


if __name__ == '__main__':
    runner = unittest.TextTestRunner()
    test_suite = suite()
    runner.run (test_suite)

Now run the code using python:

.
----------------------------------------------------------------------
Ran 1 test in 0.026s

How It Works

In step 1, after you've imported unittest (the module that contains the PyUnit framework), you define the class ArithTest, a subclass of unittest's TestCase class. ArithTest defines only the runTest method, which performs the actual testing. Note that the runTest method has a docstring. It is at least as important to document your tests as it is to document your code. Lastly, a series of three assertions takes place in runTest.

TestCase methods beginning with fail, such as failUnless, failIf, and failUnlessEqual, come in additional varieties to simplify setting up the conditions for your tests. When you're programming, you'll likely find yourself resistant to writing tests (they can be very distracting; sometimes they are boring; and they are rarely something other people notice, which makes it harder to motivate yourself to write them). PyUnit tries to make things as easy as possible for you.
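
If you are following along on a newer interpreter, be aware that the fail* spellings were deprecated in Python 3.1 and no longer exist in recent releases (3.12 and later). The sketch below shows the same kind of checks written with the assert* method names that unittest also provides. The class name ArithTestModern and the fourth check are additions made for this illustration only.

```python
import io
import unittest

class ArithTestModern(unittest.TestCase):
    def runTest(self):
        """ Test addition using the assert* method names. """
        # assertTrue replaces failUnless
        self.assertTrue(1 + 1 == 2, 'one plus one fails!')
        # assertFalse replaces failIf
        self.assertFalse(1 + 1 != 2, 'one plus one fails again!')
        # assertEqual replaces failUnlessEqual
        self.assertEqual(1 + 1, 2, 'more trouble with one plus one!')
        # assertNotEqual replaces failIfEqual
        self.assertNotEqual(1 + 1, 3, 'one plus one is three?')

# Send the report to a string buffer so only our own print appears.
runner = unittest.TextTestRunner(stream=io.StringIO())
result = runner.run(ArithTestModern())
print("all passed:", result.wasSuccessful())
```

The mapping is mechanical, so the examples in this chapter can be adapted one call at a time.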

After the unit test is defined in ArithTest, you may like to define the suite itself in a callable function, as recommended by the PyUnit developer, Steve Purcell, in the module's documentation. This enables you to simply define what you're doing (testing) and where (in the function you name). Therefore, after the definition of ArithTest, you have created the suite function, which simply instantiates a vanilla, unmodified test suite. It adds your single unit test to it and returns it. Keep in mind that the suite function only invokes the TestCase class in order to make an object that can be returned. The actual test is performed by the returned TestCase object.

As you learned in Chapter 6, only when this is being run as the main program will Python invoke the TextTestRunner class to create the runner object. The runner object has a method called run that expects to have an object of the unittest.TestSuite class. The suite function creates one such object, so test_suite is assigned a reference to the TestSuite object. When that's finished, the runner.run method is called, which uses the suite in test_suite to test the unit tests defined in test_suite.

The actual output in this case is dull, but in that good way you'll learn to appreciate because it means everything has succeeded. The single period tells you that it has successfully run one unit test. If, instead of the period, you see an F, it means that a test has failed. In either case, PyUnit finishes off a run with a report. Note that arithmetic is run very, very fast.

Now, see what failure looks like.

Try It Out: Testing Faulty Addition

Use your favorite text editor to add a second set of tests to test1.py. These will be based on the first example. Add the following to your file:

class ArithTestFail (unittest.TestCase):
    def runTest (self):
        """ Test addition and fail. """
        self.failUnless (1+1==2, 'one plus one fails!')
        self.failIf (1+1 != 2, 'one plus one fails again!')
        self.failUnlessEqual (1+1, 2, 'more trouble with one plus one!')
        self.failIfEqual (1+1, 2, 'expected failure here')
        self.failIfEqual (1+1, 2, 'second failure')

def suite_2():
    suite = unittest.TestSuite()
    suite.addTest (ArithTest())
    suite.addTest (ArithTestFail())
    return suite

You also need to change the if statement that sets off the tests, and you need to make sure that it appears at the end of your file so that it can see both classes:

if __name__ == '__main__':
    runner = unittest.TextTestRunner()
    test_suite = suite_2()
    runner.run (test_suite)

Now run the newly modified file (after you've saved it). You'll get a very different result with the second set of tests:

.F
======================================================================
FAIL: Test addition and fail.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python30\ch12\test1.py", line 22, in runTest
    self.failIfEqual(1+1,2, 'expected failure here')
AssertionError: expected failure here

----------------------------------------------------------------------
Ran 2 tests in 0.062s

FAILED (failures=1)
>>>

How It Works

Here, you've kept your successful test from the first example and added a second test that you know will fail. The result is that you now have a period from the first test, followed by an "F" for "Failed" from the second test, all in the first line of output from the test run.

After the tests are run, the results report is printed out so you can examine exactly what happened. The successful test still produces no output at all in the report, which makes sense: Imagine you have a hundred tests but only two fail — you would have to slog through a lot more output to find the failures than you do this way. It may seem like looking on the negative side of things, but you'll get used to it.

Because there was a failed test, the stack trace from the failed test is displayed. In addition, a couple of different messages result from the runTest method. The first thing you should look at is the FAIL message. It actually uses the docstring from your runTest method and prints it at the top, so you can reference the test that failed. Therefore, the first lesson to take away from this is that you should document your tests in the docstring! Second, you'll notice that the message you specified in the runTest for the specific test that failed is displayed along with the exception that PyUnit generated.

The report wraps up by listing the number of test cases actually run and a count of the failed test cases.

Test Fixtures

Well, this is all well and good, but real-world tests usually involve some work to set up your tests before they're run (creating files, creating an appropriate directory structure, generally making sure everything is in shape, and other things that may need to be done to ensure that the right things are being tested). In addition, cleanup also often needs to be done at the end of your tests.

In PyUnit, the environment in which a test case runs is called the test fixture, and the base TestCase class defines two methods: setUp, which is called before a test is run, and tearDown, which is called after the test case has completed. These are present to deal with anything involved in creating or cleaning up the test fixture.

Note

You should know that if setUp fails, tearDown isn't called. However, tearDown is called even if the test case itself fails.

Remember that when you set up tests, the initial state of each test shouldn't rely on a prior test having succeeded or failed. Each test case should create a pristine test fixture for itself. If you don't ensure this, you're going to get inconsistent test results that will only make your life more difficult.

To save time when you run similar tests repeatedly on an identically configured test fixture, subclass the TestCase class to define the setup and cleanup methods. This will give you a single class that you can use as a starting point. Once you've done that, subclass your class to define each test case. You can alternatively define several test case methods within your test case class, and then instantiate test case objects for each method. Both of these are demonstrated in the next example.

Try It Out: Working with Test Fixtures

Use your favorite text editor to add a new file test2.py. Make it look like the following example. Note that this example builds on the previous examples.

import unittest
class ArithTestSuper (unittest.TestCase):
    def setUp (self):
        print("Setting up ArithTest cases")
    def tearDown (self):
        print("Cleaning up ArithTest cases")
class ArithTest (ArithTestSuper):
    def runTest (self):
        """ Test addition and succeed. """
        print("Running ArithTest")
        self.failUnless (1+1==2, 'one plus one fails!')
        self.failIf (1+1 != 2, 'one plus one fails again!')
        self.failUnlessEqual (1+1, 2, 'more trouble with one plus one!')

class ArithTestFail (ArithTestSuper):
    def runTest (self):
        """ Test addition and fail. """
        print("Running ArithTestFail")
        self.failUnless (1+1==2, 'one plus one fails!')
        self.failIf (1+1 != 2, 'one plus one fails again!')
        self.failUnlessEqual (1+1, 2, 'more trouble with one plus one!')
        self.failIfEqual (1+1, 2, 'expected failure here')
        self.failIfEqual (1+1, 2, 'second failure')

class ArithTest2 (unittest.TestCase):
    def setUp (self):
        print("Setting up ArithTest2 cases")
    def tearDown (self):
        print("Cleaning up ArithTest2 cases")
    def runArithTest (self):
        """ Test addition and succeed, in one class. """
        print("Running ArithTest in ArithTest2")
        self.failUnless (1+1==2, 'one plus one fails!')
        self.failIf (1+1 != 2, 'one plus one fails again!')
        self.failUnlessEqual (1+1, 2, 'more trouble with one plus one!')

    def runArithTestFail (self):
        """ Test addition and fail, in one class. """
        print("Running ArithTestFail in ArithTest2")
        self.failUnless (1+1==2, 'one plus one fails!')
        self.failIf (1+1 != 2, 'one plus one fails again!')
        self.failUnlessEqual (1+1, 2, 'more trouble with one plus one!')
        self.failIfEqual (1+1, 2, 'expected failure here')
        self.failIfEqual (1+1, 2, 'second failure')

def suite():
    suite = unittest.TestSuite()
    # First style:
    suite.addTest (ArithTest())
    suite.addTest (ArithTestFail())
    # Second style:
    suite.addTest (ArithTest2("runArithTest"))
    suite.addTest (ArithTest2("runArithTestFail"))

    return suite
if __name__ == '__main__':
    runner = unittest.TextTestRunner()
    test_suite = suite()
    runner.run (test_suite)

Run the code:

Setting up ArithTest cases
Running ArithTest
Cleaning up ArithTest cases
.Setting up ArithTest cases
Running ArithTestFail
FCleaning up ArithTest cases
Setting up ArithTest2 cases
Running ArithTest in ArithTest2
Cleaning up ArithTest2 cases
.Setting up ArithTest2 cases
Running ArithTestFail in ArithTest2
FCleaning up ArithTest2 cases

======================================================================
FAIL: Test addition and fail.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:/Python31/test2.py", line 25, in runTest
    self.failIfEqual (1+1, 2, 'expected failure here')
AssertionError: expected failure here

======================================================================
FAIL: Test addition and fail, in one class.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:/Python31/test2.py", line 48, in runArithTestFail
    self.failIfEqual (1+1, 2, 'expected failure here')
AssertionError: expected failure here

----------------------------------------------------------------------
Ran 4 tests in 0.396s

FAILED (failures=2)
>>>

How It Works

Take a look at this code before moving along. The first thing to note about this is that you're doing the same tests as before. One test is made to succeed and the other one is made to fail, but you're doing two sets, each of which implements multiple unit test cases with a test fixture, but in two different styles.

Which style you use is completely up to you; it really depends on what you consider readable and maintainable.

The first set of classes in the code (ArithTestSuper, ArithTest, and ArithTestFail) are essentially the same tests as shown in the second set of examples in test1.py, but this time a class has been created called ArithTestSuper. ArithTestSuper implements a setUp and a tearDown method. They don't do much, but they do demonstrate where you'd put in your own conditions. Each of the unit test classes is subclassed from your new ArithTestSuper class, so now they will perform the same setup of the test fixture. If you need to make a change to the test fixture, you can now modify it in ArithTestSuper's methods, and have it take effect in all of its subclasses.

The actual test cases, ArithTest and ArithTestFail, are the same as in the previous example, except that you've added print calls to them as well.

The final test case class, ArithTest2, does exactly the same thing as the prior three classes that you've already defined. The only difference is that it combines the test fixture methods with the test case methods, and it doesn't override runTest. Instead, ArithTest2 defines two test case methods: runArithTest and runArithTestFail. These are then invoked explicitly when you create test case instances during the test run, as you can see from the changed definition of suite.

Once this is actually run, you can see one change immediately: Because your setup, test, and cleanup functions all write to stdout, you can see the order in which everything is called. Note that the cleanup functions are indeed called even after a failed test. Finally, note that the tracebacks for the failed tests have been gathered up and displayed together at the end of the report.

Putting It All Together with Extreme Programming

A good way to see how all of this fits together is to use a test suite during the development of an extended coding project. This strategy underlies the XP (Extreme Programming) methodology, which is a popular trend in programming: First, you plan the code; then you write the test cases as a framework; and only then do you write the actual code. Whenever you finish a coding task, you rerun the test suite to see how closely you approach the design goals as embodied in the test suite. (Of course, you are also debugging the test suite at the same time, and that's fine!) This technique is a great way to find your programming errors early in the process, so that bugs in low-level code can be fixed and the code made stable before you even start on higher-level work, and it's extremely easy to set up in Python using PyUnit, as you see in the next example.

This example includes a realistic use of test fixtures as well, creating a test directory with a few files in it and then cleaning up the test directory after the test case is finished. It also demonstrates the convention of naming all test case methods with test followed by the name, such as testMyFunction, to enable the unittest.main procedure to recognize and run them automatically.

Implementing a Search Utility in Python

The first step in this programming methodology, as with any, is to define your objectives — in this case, a general-purpose, reusable search function that you can use in your own work. Obviously, it would be a waste of time to anticipate all possible text-processing functionality in a single search utility program, but certain search tasks tend to recur a lot. Therefore, if you wanted to implement a general-purpose search utility, how would you go about it? The UNIX find command is a good place to look for useful functionality — it enables you not only to iterate through the directory tree and perform actions on each file found, but also to specify certain directories to skip, to specify rather complex logic combinations on the command line, and a number of other things, such as searching by file modification date and size.

On the other hand, the find command doesn't include any searching on the content of files (the standard way to do this under UNIX is to call grep from within find) and it has a lot of features involving the invocation of post-processing programs that we don't really need for a general-purpose Python search utility.

What you might need when searching for files in Python could include the following:

Return values you can use easily in Python: A tuple including the full path, the file name, the extension, and the size of the file is a good start.
Specification of a regular expression for the file name to search for and a regular expression for the content (if no content search is specified, the files shouldn't be opened, to save overhead).
Optional specifications of additional search terms: The size of the file, its age, last modification, and so on are all useful.

A truly general search utility might include a function to be called with the parameters of the file, so that more advanced logic can be specified. The UNIX find command enables very general logic combinations on the command line, but frankly, let's face it — complex logic on the command line is hard to understand. This is the kind of thing that really works better in a real programming language like Python, so you could include an optional logic function for narrowing searches as well.

In general, it's a good idea to approach this kind of task by focusing first on the core functionality, adding more capability after the initial code is already in good shape. That's how the following example is structured — first you start with a basic search framework that encapsulates the functionality you covered in the examples for the os and re modules, and then you add more functionality once that first part is complete. This kind of incremental approach to software development can help keep you from getting bogged down in details before you have anything at all to work with, and the functionality of something like this general-purpose utility is complicated enough that it would be easy to lose the thread.

Because this is an illustration of the XP methodology as well, you'll follow that methodology and first write the code to call the find utility, build that code into a test suite, and only then will you write the find utility. Here, of course, you're cheating a little. Ordinarily, you would be changing the test suite as you go, but in this case, the test suite is already guaranteed to work with the final version of the tested code. Nonetheless, you can use this example for yourself.

Try It Out: Writing a Test Suite

Use your favorite text editor to create the file test_find.py. Enter the following code:

import unittest
import find
import os, os.path

def filename(ret):
   return ret[1]

class FindTest (unittest.TestCase):
   def setUp (self):
      os.mkdir ("_test")
      os.mkdir (os.path.join("_test", "subdir"))
      f = open (os.path.join("_test", "file1.txt"), "w")
      f.write ("""first line
second line
third line

fourth line""")
      f.close()

      f = open (os.path.join("_test", "file2.py"), "w")
      f.write ("""This is a test file.
It has many words in it.
This is the final line.""")
      f.close()

   def tearDown (self):
      os.unlink (os.path.join ("_test", "file1.txt"))
      os.unlink (os.path.join ("_test", "file2.py"))
      os.rmdir (os.path.join ("_test", "subdir"))
      os.rmdir ("_test")

   def test_01_SearchAll (self):
      """ 1: Test searching for all files. """
      res = find.find (r".*", start="_test")
      # map() returns an iterator in Python 3, so it is wrapped in list()
      # before being compared against the expected list of names.
      self.failUnless (list(map(filename,res)) == ['file1.txt', 'file2.py'],
                       'wrong results')

   def test_02_SearchFileName (self):
      """ 2: Test searching for specific file by regexp. """
      res = find.find (r"file", start="_test")
      self.failUnless (list(map(filename,res)) == ['file1.txt', 'file2.py'],
                       'wrong results')
      res = find.find (r"py$", start="_test")
      self.failUnless (list(map(filename,res)) == ['file2.py'],
                       'Python file search incorrect')

   def test_03_SearchByContent (self):
      """ 3: Test searching by content. """
      res = find.find (start="_test", content="first")
      self.failUnless (list(map(filename,res)) == ['file1.txt'],
                       "didn't find file1.txt")
      res = find.find (where="py$", start="_test", content="line")
      self.failUnless (list(map(filename,res)) == ['file2.py'],
                       "didn't find file2.py")
      res = find.find (where="py$", start="_test", content="second")
      self.failUnless (len(res) == 0,
                       "found something that didn't exist")

   def test_04_SearchByExtension (self):
      """ 4: Test searching by file extension. """
      res = find.find (start="_test", ext='py')
      self.failUnless (list(map(filename,res)) == ['file2.py'],
                       "didn't find file2.py")
      res = find.find (start="_test", ext='txt')
      self.failUnless (list(map(filename,res)) == ['file1.txt'],
                       "didn't find file1.txt")

   def test_05_SearchByLogic (self):
      """ 5: Test searching using a logical combination callback. """
      res = find.find (start="_test", logic=lambda x: (x['size'] < 50))
      self.failUnless (list(map(filename,res)) == ['file1.txt'],
                       "failed to find by size")

if __name__ == '__main__':
   unittest.main()

Now create another code file named find.py — note that this is only the skeleton of the actual find utility and will fail miserably. That's okay; in testing and in extreme programming, failure is good because it tells you what you still need to do:
```
import os, os.path
import re
from stat import *

def find (where='.*', content=None, start='.', ext=None, logic=None):
    return ([])
```

Run the test_find.py test suite from the command line. An excerpt is shown here:

C:\projects\articles\python_book\ch12_testing>python test_find.py
FFFFF
======================================================================
FAIL: 1: Test searching for all files.
----------------------------------------------------------------------

[a lot more information]

Ran 5 tests in 0.421s

FAILED (failures=5)

How It Works

The first three lines of the testing suite import the PyUnit module, the find module to be tested (which hasn't actually been written yet), and the os and os.path modules for file and directory manipulation when setting up and tearing down the test fixtures. Following this, there's a simple helper function to extract the file name from the search results, to make it simpler to check the results for correctness.

After that, the test suite itself starts. All test cases in this example are instances of the base class FindTest. The FindTest class starts out with setUp and tearDown methods to define the test fixtures used in the test cases, followed by five test cases.

The test fixture in all test cases consists of a testing directory; a subdirectory under that main directory to ensure that subdirectories aren't treated as files when scanning; and two test files with .txt and .py extensions. The contents of the test files are pretty arbitrary, but they contain different words so that the test suite can include tests to distinguish between them using a content search.
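
One weakness of a fixed directory name such as _test is that a leftover directory from an interrupted run makes the next setUp fail. As an alternative sketch, not the approach used in this chapter's example, the fixture can be built in a temporary directory that is removed automatically. It assumes Python 3.2 or later for tempfile.TemporaryDirectory, and the class name TempDirFixtureTest and the shortened file contents are invented for the illustration.

```python
import io
import os
import tempfile
import unittest

class TempDirFixtureTest(unittest.TestCase):
    def setUp(self):
        # A fresh directory for every test; the cleanup hook removes it
        # again whether the test passes or fails.
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.root = self._tmp.name
        os.mkdir(os.path.join(self.root, "subdir"))
        with open(os.path.join(self.root, "file1.txt"), "w") as f:
            f.write("first line\nsecond line\n")
        with open(os.path.join(self.root, "file2.py"), "w") as f:
            f.write("This is a test file.\n")

    def test_fixture_layout(self):
        """ The fixture holds two files and one subdirectory. """
        entries = sorted(os.listdir(self.root))
        self.assertEqual(entries, ["file1.txt", "file2.py", "subdir"])
        self.assertTrue(os.path.isdir(os.path.join(self.root, "subdir")))

suite = unittest.TestLoader().loadTestsFromTestCase(TempDirFixtureTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print("fixture test passed:", result.wasSuccessful())
```

With this layout no tearDown method is needed, because addCleanup registers the removal at the moment the directory is created.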

The test cases themselves are named with both a sequential number and a descriptive name, and each starts with the characters "test." This allows the unittest.main function to autodetect them when running the test suite. The sequential numbers ensure that the tests will be run in the proper order defined, because a simple character sort is used to order them when testing. Each docstring then cites the test number, followed by a simple description of the type of test. All of this enables the results of failed tests to be understood quickly and easily, so that you can trace exactly where the error occurred.
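The effect of the naming scheme on ordering can be checked directly. The sketch below uses made-up method names (they are not the ones in test_find.py) and asks a unittest.TestLoader which test methods it detects and in what order; methods whose names don't start with "test" are left out.

```python
import unittest


class OrderSketch(unittest.TestCase):
    # Defined deliberately out of order; the loader sorts by name.
    def test_02_second(self):
        """2: Runs second."""

    def test_10_last(self):
        """10: Runs last."""

    def test_01_first(self):
        """1: Runs first."""

    def helper(self):
        """Not a test: the name does not start with 'test'."""


names = unittest.TestLoader().getTestCaseNames(OrderSketch)
print(names)  # ['test_01_first', 'test_02_second', 'test_10_last']
```

Because the sort is character by character, zero-padding the numbers (01, 02, 10) keeps test 10 from sorting ahead of test 2.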

Finally, after the test cases are defined, two lines of code detect whether the script is being run directly rather than imported as a module and, if so, call unittest.main to create a default test runner. The unittest.main call then finds all of the test cases, sorts them by name (and therefore by sequential number), and runs them in order.

The second file is the skeleton of the find utility itself. Beyond determining what it has to do and how it's called, you haven't done anything at all yet to write the code itself, so that's your next task.

Try It Out: A General-Purpose Search Framework

Using your favorite text editor, open find.py and change it to look like this:

import os, os.path
import re
from stat import *

def find (where='.*', content=None, start='.', ext=None, logic=None):
   context = {}
   context['where'] = where
   context['content'] = content
   context['return'] = []

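   # os.walk is a generator: it yields one (directory, subdirectories, files)
   # tuple per directory rather than accepting a callback.
   for root, dirs, files in os.walk (start):
      find_file (context, root, files)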

   return context['return']

def find_file (context, dir, files):
   for file in files:
      # Find out things about this file.
      path = os.path.join (dir, file)
      path = os.path.normcase (path)
      try:
         ext = os.path.splitext (file)[1][1:]
      except:
         ext = ''
      stat = os.stat(path)
      size = stat[ST_SIZE]

      # Don't treat directories like files
      if S_ISDIR(stat[ST_MODE]): continue

      # Do filtration based on the original parameters of find()
      if not re.search (context['where'], file): continue

      # Do content filtration last, to avoid it as much as possible
      if context['content']:
         f = open (path, 'r')
         match = 0
         for l in f.readlines():
            if re.search(context['content'], l):
               match = 1
               break
         f.close()
         if not match: continue

      # Build the return value for any files that passed the filtration tests.
      file_return = (path, file, ext, size)
      context['return'].append (file_return)

Now, for example, to find Python files containing "find," you can start Python and do the following:

>>> import find
>>> find.find(r"py$", content='find')
[('.\find.py', 'find.py', 'py', 1297), ('.\test_find.py',
'test_find.py', 'py', 1696)]

How It Works

This example is really doing the same thing as the first example in the last chapter on text processing, except that instead of a task-specific print_pdf function, there is a more general find_file function to scan the files in each directory. Because this code is more complex than the other example scripts, you can see that having a testing framework available in advance will help you immensely in debugging the initial versions. This first version satisfies the first three test cases of the test suite.

Because the find_file function is doing most of the filtration work, it obviously needs access to the search parameters. It also needs a place to keep the list of hits it builds during the search. A dictionary is a good choice for its argument, because a dictionary is mutable and can contain any number of named values. Therefore, the first thing the main find function does is build that dictionary and put the search parameters into it. It then uses os.walk to iterate through the directory structure, with find_file called once for each directory, much as in the PDF search example in the previous chapter. Once the walk is done, find returns the list of files found (and information about them) that was built during the search.

During the search, find_file is called once for each directory that os.walk visits, and it receives the dictionary built at the start of the search, the name of the current directory, and a list of the entries in that directory. The first thing find_file does, then, is scan that list and determine some basic information for each entry by running os.stat on it. If the "file" is actually a subdirectory, the function skips it, using the information gleaned from the os.stat call: all of the search parameters apply to file names, not to points in the directory tree, and the content search would result in an error if it tried to open a directory.
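To see what the directory walk hands over at each step, here is a small sketch. The helper name list_tree is hypothetical; it simply records the directory name and the file names that os.walk reports for each directory it visits.

```python
import os


def list_tree(start):
    """Return one (directory, file names) pair per directory under start."""
    visited = []
    for root, dirs, files in os.walk(start):
        # root is the current directory, dirs its subdirectory names,
        # and files the names of its non-directory entries.
        visited.append((root, sorted(files)))
    return visited
```

Calling list_tree('.') from the directory holding find.py would return one pair for that directory followed by one pair for each subdirectory beneath it.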

When that's finished, the function applies the search parameters stored in the dictionary argument to eliminate whatever files it can. If a content parameter is specified, it opens and reads each file, but otherwise no manipulation of the file itself is done.

If a file has passed all the search parameter tests (there are only two in this initial version), an entry is built for it and appended to the hit list; this entry consists of the full path name of the file relative to the starting point of the search, the file name itself, its extension, and its size. Naturally, you could return any set of values for files you find useful, but these are a good basic set that you could use to build a directory-like listing of hits, or use to perform some sort of task on the files.

A More Powerful Python Search

Remember that this is an illustration of an incremental programming approach, so the first example was a good place to stop and give an explanation. There are plenty of other search parameters it would be nice to include in this general search utility, and of course there are still two test cases to go in the test suite you wrote at the outset. Because Python gives you a keyword parameter mechanism, it's very simple to add new named parameters to your function definition, toss them into the search context dictionary, and use them in find_file as needed, without making individual calls to the find function unwieldy.

The next example shows you how easy it is to add a search parameter for the file's extension, and throws in a logic combination callback just for good measure. You can add more search parameters at your leisure; the following code just shows you how to get started on your own extensions (one of the exercises for the chapter asks you to add search parameters for the date on which the file was last modified, for instance).

Though the file extension parameter, as a single simple value, is easy to conceive and implement — it's really just a matter of adding the parameter to the search context and adding a filter test in find_file — planning a logic combination callback parameter requires a little thought. The usual strategy for specification of a callback is to define a set of parameters — say, the file name, size, and modification date — and then pass those values in on each call to the callback. If you add a new search parameter, you're faced with a choice — you can arbitrarily specify that the new parameter can't be included in logical combinations, you can change the callback specification and invalidate all existing callbacks for use with the new code, or you can define multiple categories of logic callbacks, each with a different set of parameters. None of these alternatives is terribly satisfying, and yet they're decisions that have to be made all the time.

In Python, however, the dictionary structure provides you with a convenient way to circumvent this problem. If you define a dictionary parameter that passes named values for use in logic combinations, unused parameters are simply ignored. Thus, older callbacks can still be used with newer code that defines more search parameters, without any changes to the code you've already written. In the updated search code found in the next Try It Out, the callback is defined to be a function that takes a dictionary and returns a flag: a true filter function. You can see how it's used in the example section and in test case 5 of the search test suite.
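The point about older callbacks surviving new parameters can be illustrated with a short sketch. The callback name only_python and the sample dictionaries are invented for this example; the only assumption carried over from the text is that a callback receives a dictionary of named values and returns a flag.

```python
def only_python(info):
    # An "older" callback: it only knows about the 'ext' entry.
    return info['ext'] == 'py'


# The values an early version of the search might pass ...
early_info = {'file': 'find.py', 'ext': 'py'}

# ... and the values a later version might pass after gaining new entries.
later_info = {'file': 'find.py', 'ext': 'py', 'size': 1297, 'path': './find.py'}

# The same callback works with both; the extra entries are simply ignored.
print(only_python(early_info), only_python(later_info))  # True True
```

A callback with a fixed positional signature would have needed changing as soon as size and path were added, which is exactly the problem the dictionary avoids.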

Adding a logical combination callback also makes it simple to work with numerical parameters such as the file size or the modification date. It's unlikely that a caller will search on the exact size of a file; instead, one usually searches for files larger or smaller than a given value, or in a given size range — in other words, most searches on numerical values are already logical combinations. Therefore, the logical combination callback should also get the size and dates for the file, so that a filter function can already be written to search on them. Fortunately, this is simple — the results of os.stat are already available to copy into the dictionary.

Try It Out: Extending the Search Framework

Again using your favorite text editor, open the file find.py from the last example and modify it so that it matches the following code:

import os, os.path
import re
from stat import *

def find (where='.*', content=None, start='.', ext=None, logic=None):
   context = {}
   context['where'] = where
   context['content'] = content
   context['return'] = []
   context['ext'] = ext
   context['logic'] = logic

   for root, dirs, files in os.walk(start):
       find_file(context, root, files)

   return context['return']

def find_file (context, dir, files):
   for file in files:
      # Find out things about this file.
      path = os.path.join (dir, file)
      path = os.path.normcase (path)
      stat = os.stat(path)
      size = stat[ST_SIZE]
      try:
         ext = os.path.splitext (file)[1][1:]
      except:
         ext = ''

      # Don't treat directories like files
      if S_ISDIR(stat[ST_MODE]): continue

      # Do filtration based on passed logic
      if context['logic'] and not context['logic'](locals()): continue

      # Do filtration based on extension
      if context['ext'] and ext != context['ext']: continue

      # Do filtration based on the original parameters of find()
      if not re.search (context['where'], file): continue

      # Do content filtration last, to avoid it as much as possible
      if context['content']:
         f = open (path, 'r')
         match = 0
         for l in f.readlines():
            if re.search(context['content'], l):
               match = 1
               break
         f.close()
         if not match: continue

      # Build the return value for any files that passed the filtration tests.
      file_return = (path, file, ext, size)
      context['return'].append (file_return)

The earlier search for Python files containing "find" works exactly as before with the extended code:

>>> import find
>>> find.find(r"py$", content='find')
[('.\find.py', 'find.py', 'py', 1297), ('.\test_find.py',
'test_find.py', 'py', 1696)]
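A search for files larger than 1,000 bytes and older than yesterday can be written as a logic callback. The following is a minimal sketch, not part of the original listing: the name large_and_old is hypothetical, "older than yesterday" is taken to mean last modified more than 24 hours ago, and the function relies on find_file passing locals() to the callback, so that the 'size' and 'stat' entries are available in the dictionary.

```python
import time
from stat import ST_MTIME

ONE_DAY = 24 * 60 * 60  # seconds


def large_and_old(info, now=None):
    """True for files over 1,000 bytes last modified more than a day ago."""
    if now is None:
        now = time.time()
    # info['stat'] is the os.stat result stored in find_file's local variables.
    modified = info['stat'][ST_MTIME]
    return info['size'] > 1000 and modified < now - ONE_DAY
```

With find.py from this section on the import path, the callback would be passed as find.find(r"py$", logic=large_and_old). The optional now argument is there only to make the function easy to test with fixed times.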

You can also run the test_find.py test suite from the command line:

C:\projects\python_book\ch11_regexp>python test_find.py
.....
----------------------------------------------------------------------
Ran 5 tests in 0.370s

During development, this run was not quite so smooth!

Formal Testing in the Software Life Cycle

The result of the test suite shown in the preceding example is clean and stable code in a somewhat involved programming example, and well-defined test cases that are documented as working correctly. This is a quick and easy process in the case of a software "product" that is some 30 lines long, although it can be astounding how many programming errors can be made in only 30 lines!

In a real-life software life cycle, of course, you will have thousands of lines of code. In projects of realistic magnitude like this, nobody can hope to define all possible test cases before releasing the code. It's true that formal testing during the development phase will dramatically improve both your code and your confidence in it, but there will still be errors in it when it goes out the door.

During the maintenance phase of the software life cycle, bug reports are filed after the target code is placed in production. If you're taking an integrated testing approach to your development process, you can see that it's logical to think of bug reports as highlighting errors in your test cases as well as errors in the code itself. Therefore, the first thing you should do with a bug report is to use it to modify an existing test case, or to define a new test case from scratch, and only then should you start to modify the target code itself.

By doing this, you accomplish several things. First, you're giving the reported bugs a formal definition. This enables you to agree with other people regarding what bugs are actually being fixed, and it enables further discussion to take place as to whether the bugs have really been understood correctly. Second, by defining test fixtures and test cases, you are ensuring that the bugs can be duplicated at will. As I'm sure you know if you've ever needed to reproduce elusive bugs, this alone can save you a lot of lost sleep. Finally, the third result of this approach might be the most significant: If you never make a change to code that isn't covered by a test case, you will always know that later changes aren't going to break fixes already in place. The result is happier users and a more relaxed you. And you'll owe it all to unit testing.

Summary

Testing is a discipline best addressed at the very outset of the development life cycle. In general, you will know that you've got a firm grip on the problem you're solving when you understand it enough to write tests for it.

The most basic kind of test is an assertion. An assertion is a check that you place inside your program to confirm that a condition that should hold does in fact hold. Assertions are for use while you're developing a program, to ensure that the conditions you expect are met.

Assertions will be turned off if Python is run with the -O option. The -O indicates that you want Python to run in a higher performance mode, which would usually also be the normal way to run a program in production. This means that using assert is not something that you should rely on to catch errors in a running system.
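As a reminder of what this looks like in practice, here is a small sketch; the average function is a made-up example, not code from earlier in the chapter.

```python
def average(values):
    # Development-time sanity check; skipped entirely under python -O.
    assert len(values) > 0, "average() needs at least one value"
    return sum(values) / len(values)


try:
    average([])
except AssertionError as err:
    print("caught:", err)  # caught: average() needs at least one value
```

Run with the -O option, the assert line is skipped and the empty list gets as far as the division, where it raises a ZeroDivisionError instead. That is why checks a production system depends on belong in ordinary if statements and exceptions rather than in assert.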

PyUnit is the default way of doing comprehensive testing in Python, and it makes it very easy to manage the testing process. PyUnit is implemented in the unittest module.

When you use PyUnit to create your own tests, the TestCase class gives you a number of methods that test for specific conditions, such as "is value A greater than value B," and that fail when the condition reflected by their names does not hold. In the original PyUnit these names began with "fail" (failUnless, failIf); current versions of unittest use the equivalent names beginning with "assert" (assertTrue, assertEqual, assertGreater), and the old "fail" aliases were deprecated and later removed. Together they cover most of the conditions for which you will ever need to test.

The TestCase class should be subclassed — it's the run method that is called on to run the tests, and this method needs to be customized to your tests. In addition, the test fixture, or the environment in which the tests should be run, can be set up before each test if the TestCase's setUp and tearDown methods are overridden, and code is specified for them.

You've seen two approaches to setting up a test framework for yourself. One subclasses a customized class, and another uses separate functions to implement the same features but without the need to subclass. You should use both and find out which ones work for your way of doing things. These tests do not have to live in the same file as your modules or programs; they should be kept separate so they don't bloat your code.

As you go through the remainder of this book, try to think about writing tests for the functions and classes that you see, and perhaps write tests as you go along. It's good exercise; better than having exercises here.

The key things to take away from this chapter are:

Assertions are statements made within your code that allow you to test the validity of the code. If the test fails, an AssertionError is raised. You can use assert to create your tests.
PyUnit is the name of the package as named by its authors, but the module you import is called the more generic-sounding name unittest.
A test suite is a series of test cases run together for a particular project.
In PyUnit, the environment in which a test case runs is called the test fixture, and the base TestCase class defines two methods: setUp, which is called before a test is run; and tearDown, which is called after the test case has completed. These are present to deal with anything involved in creating or cleaning up the test fixture.

In the following chapter, we will discuss GUI (graphical user interface) programming, and learn to make simple, interactive programs.
