Chapter 9. Modular Programming as a Foundation for Good Programming Technique

We have come a long way in this book. From learning how modules and packages work in Python, and how to use them to better organize your code, we have discovered many of the common practices used to apply modular patterns to solve a range of programming problems. We have seen how modular programming allows us to deal with changing requirements in a real-world system in the best possible way, and learned what makes a module or package a suitable candidate for reuse in new projects. We have seen many of the more advanced techniques for working with modules and packages in Python, as well as ways of avoiding the pitfalls that you may encounter along the way.

Finally, we looked at ways of testing your code, how to use a source code management system to keep track of the changes you make to your code over time, and how to submit your module or package to the Python Package Index (PyPI) so that others can find and use it.

Using what we have learned thus far, you will be able to competently apply modular techniques to your Python programming efforts, creating robust and well-written code that can be reused in a variety of programs. You can also share your code with others, both inside your organization and within the wider Python developer community.

In this final chapter, we will use a practical example to show how modules and packages do far more than just organize your code: they help to deal with the process of programming more effectively. We will see how modules are vital to the design and development of any large system, and demonstrate how the use of modular techniques to create robust, useful and well-written modules is an essential part of being a good programmer.

The process of programming

All too often as programmers, we focus on the technical details of a program. That is, we focus on the product rather than the process of programming. The difficulties of solving a particular programming problem are so great that we forget that the problem itself will change over time. No matter how much we try to avoid it, change is inevitable: changing markets, changing requirements, and changing technologies. As programmers, we need to be able to effectively cope with this change just as much as we need to be able to implement, test, and debug our code.

Back in Chapter 4, Using Modules for Real-World Programming, we looked at an example program that faced the challenge of changing requirements. We saw how a modular design allowed us to minimize the amount of code that had to be rewritten when the scope of the program increased well beyond what was first envisaged.

Now that we have learned more about modular programming and the related technologies that can help to make it more effective, let's work through this exercise again. This time, we'll choose a simple package for counting the number of occurrences of some event or object. For example, imagine that you need to keep a count of the number of animals of each type you see while walking across a farm. As you see each animal, you record its presence by passing its type to the counter, and at the end, the counter will tell you how many animals of each type you have seen. A typical session might look like this:

>>> counter.reset()
>>> counter.add("sheep")
>>> counter.add("cow")
>>> counter.add("sheep")
>>> counter.add("rabbit")
>>> counter.add("cow")
>>> print(counter.totals())
[('cow', 2), ('rabbit', 1), ('sheep', 2)]

This is a simple package, but it gives us a good target for applying some of the more useful techniques we have learned in the previous chapters. In particular, we will make use of docstrings to document what each function in our package does, and we will write a series of unit tests to ensure that our package is working the way we expect it to.

Let's start by creating a directory to hold our new project, which we will call Counter. Create a directory named counter somewhere convenient, and then add a new file named README.rst to this directory. Since we expect to eventually upload this package to the Python Package Index, we will use reStructuredText format for our README file. Enter the following into this file:

About the ``counter`` package
-----------------------------

``counter`` is a package designed to make it easy to keep track of the number of times some event or object occurs.  Using this package, you **reset** the counter, **add** the various values to the counter, and then retrieve the calculated **totals** to see how often each value occurred.

Let's take a closer look at how this package might be used. Imagine that you wanted to keep a count of the number of cars of each color which were observed in a given timeframe. You would start by making the following call::

    counter.reset()

Then when you identify a car of a given color, you would make the following call::

    counter.add(color)

Finally, once the time period is over, you would obtain the various colors and how often they occurred in the following way::

    for color, num_occurrences in counter.totals():
        print(color, num_occurrences)

The counter can then be reset to start counting another set of values.

Let's now implement this package. Inside our counter directory, create another directory named counter to hold our package's source code, and create a package initialization file (__init__.py) inside this innermost counter directory. We'll follow the pattern we used earlier and define our package's public functions in a module named interface.py, which we will then import into the __init__.py file to make the various functions available at the package level. To do this, edit the __init__.py file and enter the following into this file:

from .interface import *
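One consequence of this wildcard import is worth seeing in action: names starting with an underscore are not copied into the package namespace, so private globals in the interface module stay private. The following is a minimal sketch, assuming a throwaway package with the hypothetical name demo_pkg that is built in a temporary directory purely for illustration:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package (hypothetical name "demo_pkg") in a temp directory.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "interface.py").write_text(
    "_counts = {}\n"
    "def reset():\n"
    "    _counts.clear()\n"
)
(pkg / "__init__.py").write_text("from .interface import *\n")

# Make the temp directory importable, then import the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")

print(hasattr(demo_pkg, "reset"))    # public function is re-exported
print(hasattr(demo_pkg, "_counts"))  # underscore-prefixed name is not
```

The first line printed should be True and the second False, which is why a private global in interface.py will not leak into the package's public interface.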

Our next task is to implement the interface module. Create the interface.py file inside the counter package directory, and enter the following into this file:

def reset():
    pass

def add(value):
    pass

def totals():
    pass

These are just placeholders for our counter package's public functions; we'll implement these one at a time, starting with the reset() function.

Following the recommended practice of documenting each function using a docstring, let's start by describing what this function does. Edit the existing definition for your reset() function so that it looks like the following:

def reset():
    """ Reset our counter.

        This should be called before we start counting.
    """
    pass

Remember that a docstring is a triple-quoted string (a string that can span multiple lines) which is "attached" to a function. A docstring typically starts with a one-line description of what the function does. If more information is required, this will be followed by a single blank line, followed by one or more lines describing the function in more detail. As you can see, our docstring consists of a one-line description and one additional line providing more information about our function.
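To see what "attached" means in practice, here is a small sketch that reads the docstring back from the function object. It uses only the standard library's inspect module; the summary and doc variable names are just for illustration:

```python
import inspect

def reset():
    """ Reset our counter.

        This should be called before we start counting.
    """
    pass

# The raw docstring is stored on the function object itself.
print(reset.__doc__ is not None)

# inspect.getdoc() removes the leading indentation for display.
doc = inspect.getdoc(reset)
summary = doc.splitlines()[0]
print(summary)
```

This is the same text that help(reset) would display in the interactive interpreter.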

We now need to implement this function. Since our counter package needs to keep track of the number of times each unique value has occurred, it makes sense to store this information in a dictionary mapping unique values to the number of occurrences. We can store this dictionary as a private global variable which is initialized by our reset() function. Knowing this, we can go ahead and implement the remainder of our reset() function:

def reset():
    """ Reset our counter.

        This should be called before we start counting.
    """
    global _counts
    _counts = {} # Maps value to number of occurrences.

With the private _counts global defined, we can now implement the add() function. This function records the occurrence of a given value, storing the results into the _counts dictionary. Replace your placeholder implementation of the add() function with the following code:

def add(value):
    """ Add the given value to our counter.
    """
    global _counts

    try:
        _counts[value] += 1
    except KeyError:
        _counts[value] = 1

There shouldn't be any surprises here. Our final function, totals(), returns the values which were added to the _counts dictionary, along with how often each value occurred. Here is the necessary code, which should replace your existing placeholder for the totals() function:

def totals():
    """ Return the number of times each value has occurred.

        We return a list of (value, num_occurrences) tuples, one
        for each unique value included in the count.
    """
    global _counts

    results = []
    for value in sorted(_counts.keys()):
        results.append((value, _counts[value]))
    return results

This completes our first implementation of the counter package. We'll try it out using the ad hoc testing techniques we learned about in the previous chapter: open a terminal or command-line window and use the cd command to set the current directory to the outermost counter directory. Then, type python to start the Python interactive interpreter, and try entering the following commands:

import counter
counter.reset()
counter.add(1)
counter.add(2)
counter.add(1)
print(counter.totals())

All going well, you should see the following output:

[(1, 2), (2, 1)]

This tells you that the value 1 occurred twice and the value 2 occurred once, which is exactly what your calls to the add() function indicated.
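For comparison, the same tallying can be sketched with the standard library's collections.Counter class. This is not how our counter package is implemented; it is only an alternative way to check the expected totals, and the tally and totals names are illustrative:

```python
from collections import Counter

tally = Counter()          # starts empty, much like calling counter.reset()
for value in [1, 2, 1]:
    tally[value] += 1      # like counter.add(value); missing keys start at 0

# Same shape as counter.totals(): (value, num_occurrences), sorted by value.
totals = sorted(tally.items())
print(totals)
```

Running this should print [(1, 2), (2, 1)], matching the output of our own package.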

Now that our package appears to be working, let's create some unit tests so that we can test our package more systematically. Create a new file named tests.py in the outermost counter directory and enter the following code into this file:

import unittest
import counter

class CounterTestCase(unittest.TestCase):
    """ Unit tests for the ``counter`` package.
    """
    def test_counter_totals(self):
        counter.reset()
        counter.add(1)
        counter.add(2)
        counter.add(3)
        counter.add(1)
        self.assertEqual(counter.totals(),
                         [(1, 2), (2, 1), (3, 1)])

    def test_counter_reset(self):
        counter.reset()
        counter.add(1)
        counter.reset()
        counter.add(2)
        self.assertEqual(counter.totals(), [(2, 1)])


if __name__ == "__main__":
    unittest.main()

As you can see, we have written two unit tests: one to check that the values we added are reflected in the counter's totals, and a second test to ensure that the reset() function is correctly resetting the counter, discarding any values that were added before reset() was called.

To run these tests, exit the Python interactive interpreter by pressing Control + D (on Windows, press Control + Z followed by Enter), and then type the following into the command line:

python tests.py

All going well, you should see the following output, indicating that both of your unit tests ran without any errors:

..
---------------------------------------------------------------------
Ran 2 tests in 0.000s

OK

The inevitable changes

At this stage, we have a properly working counter package with good documentation and unit tests. Imagine, however, that the requirements for your package now change, causing major problems for your design: instead of simply counting how often each unique value occurs, you now need to support ranges of values. For example, the user of your package might define ranges of values from 0 to 5, 5 to 10, and 10 to 15; values within each range are grouped together for the purposes of counting. The following illustration shows how this is done:

[Illustration: The inevitable changes]

To allow your package to support ranges, you will need to change the interface to the reset() function to accept an optional list of range values. For example, to count values between 0 and 5, 5 and 10, and 10 and 15, the reset() function can be called with the following parameter:

counter.reset([0, 5, 10, 15])

If no parameter is passed to counter.reset(), then the entire package should continue to work as it does at present, recording unique values rather than ranges.

Let's implement this new feature. First off, edit the reset() function so that it looks like the following:

def reset(ranges=None):
    """ Reset our counter.

        If 'ranges' is supplied, the given list of values will be
        used as the start and end of each range of values.  In
        this case, the totals will be calculated based on a range
        of values rather than individual values.

        This should be called before we start counting.
    """
    global _ranges
    global _counts

    _ranges = ranges
    _counts = {} # If _ranges is None, maps value to number of
                 # occurrences.  Otherwise, maps (min_value,
                 # max_value) to number of occurrences.

The only difference here, other than changing the documentation, is that we now accept an optional ranges parameter and store this into the private _ranges global.

Let's now update the add() function to support ranges. Change your source code so that this function looks like the following:

def add(value):
    """ Add the given value to our counter.
    """
    global _ranges
    global _counts

    if _ranges is None:
        key = value
    else:
        for i in range(len(_ranges)-1):
            if value >= _ranges[i] and value < _ranges[i+1]:
                key = (_ranges[i], _ranges[i+1])
                break

    try:
        _counts[key] += 1
    except KeyError:
        _counts[key] = 1

There's no change to the interface for this function; the only difference is behind the scenes, where we now check to see whether we are calculating totals for the ranges of values, and if so, we set the key into the _counts dictionary to be a (min_value, max_value) tuple identifying the range. This code is a little messy, but it works, nicely hiding this complexity from the code using this function.

The final function we need to update is the totals() function. The behavior of this function will change if we are using ranges. Edit your copy of the interface module so that the totals() function looks like the following:

def totals():
    """ Return the number of times each value has occurred.

        If we are currently counting ranges of values, we return a
        list of  (min_value, max_value, num_occurrences) tuples,
        one for each range.  Otherwise, we return a list of
        (value, num_occurrences) tuples, one for each unique value
        included in the count.
    """
    global _ranges
    global _counts

    if _ranges is not None:
        results = []
        for i in range(len(_ranges)-1):
            min_value = _ranges[i]
            max_value = _ranges[i+1]
            num_occurrences = _counts.get((min_value, max_value),
                                          0)
            results.append((min_value, max_value,
                            num_occurrences))
        return results
    else:
        results = []
        for value in sorted(_counts.keys()):
            results.append((value, _counts[value]))
        return results

This code is a bit complicated, but we have updated our function's docstring to describe the new behavior. Let's now test our code; fire up the Python interpreter and try entering the following instructions:

import counter
counter.reset([0, 5, 10, 15])
counter.add(5.7)
counter.add(4.6)
counter.add(14.2)
counter.add(0.3)
counter.add(7.1)
counter.add(2.6)
print(counter.totals())

All going well, you should see the following output:

[(0, 5, 3), (5, 10, 2), (10, 15, 1)]

This corresponds to the three ranges you have defined, and shows that there are three values falling into the first range, two falling into the second range, and just one value falling into the third range.

Change management

At this stage, it seems that your updated package is a success. Just like the example we saw in Chapter 6, Creating Reusable Modules, we were able to use modular programming techniques to limit the number of changes that were needed to support a major new feature within our package. We have performed some tests, and the updated package seems to be working as it should.

However, we won't stop there. Since we added a major new feature to our package, we should add some unit tests to ensure that this feature is working as it should. Edit your tests.py script and add the following new test case to this module:

class RangeCounterTestCase(unittest.TestCase):
    """ Unit tests for the range-based features of the
        ``counter`` package.
    """
    def test_range_totals(self):
        counter.reset([0, 5, 10, 15])
        counter.add(3)
        counter.add(9)
        counter.add(4.5)
        counter.add(12)
        counter.add(19.1)
        counter.add(14.2)
        counter.add(8)
        self.assertEqual(counter.totals(),
                         [(0, 5, 2), (5, 10, 2), (10, 15, 2)])

This is very similar to the code we used for our ad hoc testing. After saving the updated tests.py script, run it. This should reveal something very interesting: your new package suddenly crashes:

ERROR: test_range_totals (__main__.RangeCounterTestCase)
-----------------------------------------------------------------
Traceback (most recent call last):
  File "tests.py", line 35, in test_range_totals
    counter.add(19.1)
  File "/Users/erik/Project Support/Work/Packt/PythonModularProg/First Draft/Chapter 9/code/counter-ranges/counter/interface.py", line 36, in add
    _counts[key] += 1
UnboundLocalError: local variable 'key' referenced before assignment

Our test_range_totals() unit test is failing because our package crashes with an UnboundLocalError when we try to add the value 19.1 to our ranged counter. A moment's reflection will show what is wrong here: we have defined three ranges, 0-5, 5-10, and 10-15, but we are now trying to add the value 19.1 to our counter. Since 19.1 is outside of the ranges we have set up, the loop in our add() function never finds a matching range and never assigns a value to the key variable, so the function crashes as soon as it tries to use key.
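The failure can be reproduced in isolation. The sketch below uses a hypothetical pick_range() function that mirrors the loop inside add(); it is not part of the counter package. Note that the exact wording of the error message differs between Python versions:

```python
def pick_range(ranges, value):
    # 'key' is only assigned when one of the ranges matches.
    for i in range(len(ranges) - 1):
        if ranges[i] <= value < ranges[i + 1]:
            key = (ranges[i], ranges[i + 1])
            break
    return key  # fails if the loop never assigned 'key'

# A value inside the ranges works as expected.
print(pick_range([0, 5, 10, 15], 14.2))

# A value outside every range leaves 'key' unassigned.
caught = False
try:
    pick_range([0, 5, 10, 15], 19.1)
except UnboundLocalError as err:
    caught = True
    print("UnboundLocalError:", err)
```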

It's easy enough to fix this problem; update your add() function so that it sets key to None before the loop and raises an error if no matching range is found:

def add(value):
    """ Add the given value to our counter.
    """
    global _ranges
    global _counts

    if _ranges is None:
        key = value
    else:
        key = None
        for i in range(len(_ranges)-1):
            if value >= _ranges[i] and value < _ranges[i+1]:
                key = (_ranges[i], _ranges[i+1])
                break
        if key is None:
            raise RuntimeError("Value out of range: {}".format(value))

    try:
        _counts[key] += 1
    except KeyError:
        _counts[key] = 1

This causes our package to raise a RuntimeError if the user attempts to add a value that falls outside of the ranges that we have set up.
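If the range lookup inside add() ever becomes a bottleneck or feels too messy, it could be factored out into a helper. The following is one possible sketch, assuming the range boundaries are sorted in ascending order; find_range() is a hypothetical name and uses the standard library's bisect module rather than the loop shown above:

```python
from bisect import bisect_right

def find_range(ranges, value):
    """Return the (min_value, max_value) range holding value, or None.

    'ranges' is a sorted list of boundaries such as [0, 5, 10, 15];
    each range includes its lower bound and excludes its upper bound.
    """
    # Index of the last boundary that is <= value.
    i = bisect_right(ranges, value) - 1
    if 0 <= i < len(ranges) - 1:
        return (ranges[i], ranges[i + 1])
    return None

print(find_range([0, 5, 10, 15], 5.7))   # falls in the second range
print(find_range([0, 5, 10, 15], 5))     # a boundary belongs to the range it starts
print(find_range([0, 5, 10, 15], 19.1))  # outside every range
```

The caller would still be responsible for raising the RuntimeError when None is returned, which keeps the error handling in add() where the rest of the package expects it.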

Unfortunately, our unit test is still crashing, only now it fails with a RuntimeError. To fix this, remove the counter.add(19.1) line from the test_range_totals() unit test. We still want to test for this error condition, but we'll do so in a separate unit test. Add the following to the end of your RangeCounterTestCase class:

    def test_out_of_range(self):
        counter.reset([0, 5, 10, 15])
        with self.assertRaises(RuntimeError):
            counter.add(19.1)

This unit test checks specifically for the error condition we found earlier, and ensures that the package is correctly raising a RuntimeError if the supplied value is outside of the requested ranges.
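If you also want to check the error message, assertRaises() can hand back the exception it caught. The sketch below is self-contained and therefore uses a hypothetical stand-in function, add_to_ranges(), instead of the real counter package; the test case and test method names are likewise illustrative:

```python
import unittest

def add_to_ranges(ranges, value):
    """Stand-in for counter.add() when ranges are in use."""
    for i in range(len(ranges) - 1):
        if ranges[i] <= value < ranges[i + 1]:
            return (ranges[i], ranges[i + 1])
    raise RuntimeError("Value out of range: {}".format(value))

class OutOfRangeTestCase(unittest.TestCase):
    def test_message_names_value(self):
        # The context manager records the exception in cm.exception.
        with self.assertRaises(RuntimeError) as cm:
            add_to_ranges([0, 5, 10, 15], 19.1)
        self.assertIn("19.1", str(cm.exception))

# Run the test case programmatically instead of via unittest.main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(OutOfRangeTestCase)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```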

Notice that we now have four separate unit tests defined for our package. We are still testing the package to make sure it runs without ranges, as well as testing all our range-based code. Because we have implemented (and are starting to flesh out) a range of unit tests for our package, we can be confident that the changes we made to support ranges have not broken any existing code that doesn't use the new range-based features.

As you can see, the modular programming techniques we have used help us minimize the changes required to our code, and the unit tests we have written help to ensure that the updated code continues to work as we expect it to. In this way, the use of modular programming techniques allows us to deal with changing requirements and the ongoing process of programming in the most effective way possible.
