Chapter 10. Building a Module

As you saw in Chapter 7, modules provide a convenient way to share Python code between applications. A module is a very simple construct, and in Python, a module is merely a file of Python statements. The module might define functions and classes, and it can contain simple executable code that's not inside a function or class. And, best yet, a module might contain documentation about how to use the code in the module.

Python comes with a library of hundreds of modules that you can call in your scripts. You can also create your own modules to share code among your scripts. This chapter shows you how to create a module, step by step. This includes the following:

  • Exploring the internals of modules

  • Creating a module that contains only functions

  • Defining classes in a module

  • Extending classes with subclasses

  • Defining exceptions to report error conditions

  • Documenting your modules

  • Testing your modules

  • Running modules as programs

  • Installing modules

The first step is to examine what modules really are and how they work.

Exploring Modules

A module is just a Python source file. The module can contain variables, classes, functions, and any other element available in your Python scripts.

You can get a better understanding of modules by using the dir function. Pass the name of some Python element, such as a module, and dir will tell you all of the attributes of that element. For example, to see the attributes of __builtins__, which contain built-in functions, classes, and variables, use the following:

dir(__builtins__)

For example:

>>> dir(__builtins__)
['ArithmeticError', 'AssertionError', 'AttributeError', 'BaseException',
'BufferError', 'BytesWarning', 'DeprecationWarning', 'EOFError',
'Ellipsis', 'EnvironmentError', 'Exception', 'False', 'FloatingPointError',
'FutureWarning', 'GeneratorExit', 'IOError', 'ImportError', 'ImportWarning',
'IndentationError', 'IndexError', 'KeyError',
'KeyboardInterrupt','LookupError', 'MemoryError',
'NameError', 'None', 'NotImplemented', 'NotImplementedError',
'OSError', 'OverflowError','PendingDeprecationWarning',
'ReferenceError', 'RuntimeError', 'RuntimeWarning', 'StopIteration',
'SyntaxError', 'SyntaxWarning','SystemError', 'SystemExit',
'TabError', 'True', 'TypeError', 'UnboundLocalError',
'UnicodeDecodeError', 'UnicodeEncodeError', 'UnicodeError',
'UnicodeTranslateError', 'UnicodeWarning', 'UserWarning',
'ValueError', 'Warning', 'WindowsError', 'ZeroDivisionError',
'__build_class__', '__debug__', '__doc__', '__import__',
'__name__', '__package__', 'abs', 'all', 'any', 'ascii', 'bin',
'bool', 'bytearray', 'bytes', 'chr', 'classmethod', 'compile',
'complex', 'copyright', 'credits', 'delattr', 'dict', 'dir', 'divmod',
'enumerate', 'eval', 'exec', 'exit', 'filter', 'float', 'format',
'frozenset', 'getattr', 'globals', 'hasattr', 'hash', 'help', 'hex',
'id', 'input', 'int', 'isinstance', 'issubclass', 'iter',
'len', 'license', 'list', 'locals', 'map', 'max', 'memoryview', 'min',
'next', 'object', 'oct', 'open', 'ord', 'pow', 'print', 'property', 'quit',
'range', 'repr', 'reversed', 'round', 'set', 'setattr', 'slice', 'sorted',
'staticmethod', 'str', 'sum', 'super', 'tuple', 'type',
'vars', 'zip']

For a language with as many features as Python, it has surprisingly few built-in elements. You can run the dir function on modules you import as well. For example:

>>> import sys
>>> dir(sys)
['__displayhook__', '__doc__', '__excepthook__', '__name__', '__package__',
'__stderr__', '__stdin__', '__stdout__', '_clear_type_cache',
'_current_frames', '_getframe', 'api_version', 'argv',
'builtin_module_names', 'byteorder', 'call_tracing', 'callstats',
'copyright', 'displayhook','dllhandle', 'dont_write_bytecode',
'exc_info', 'excepthook', 'exec_prefix', 'executable','exit', 'flags',
'float_info', 'getcheckinterval','getdefaultencoding',
'getfilesystemencoding', 'getprofile', 'getrecursionlimit', 'getrefcount',
'getsizeof', 'gettrace', 'getwindowsversion', 'hexversion',
'intern', 'maxsize', 'maxunicode', 'meta_path', 'modules', 'path',
'path_hooks', 'path_importer_cache', 'platform', 'prefix',
'setcheckinterval','setfilesystemencoding', 'setprofile',
'setrecursionlimit','settrace', 'stderr', 'stdin', 'stdout',
'subversion',
'version', 'version_info', 'warnoptions', 'winver']

Use dir to help examine modules, including the modules you create.
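As a small sketch of putting dir to work, the following filters the attribute list of a module down to the names that do not begin with an underscore. The math module is used here purely for illustration; any imported module would do.

```python
import math

# Collect the attribute names of the math module that do not
# start with an underscore.
public_names = [name for name in dir(math) if not name.startswith('_')]

print('sqrt' in public_names)   # sqrt is one of math's public functions
print(len(public_names), 'public names found')
```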

Importing Modules

Before using a module, you need to import it. The standard syntax for importing follows:

import module

You can use this syntax with modules that come with Python or with modules you create. You can also use the following alternative syntax:

from module import item

The alternative syntax enables you to specifically import just a class or function if that is all you need.
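The two import forms can be sketched side by side as follows. The math module and the names root and whole are chosen only for illustration.

```python
# Import the whole module, then reach items through the module name.
import math
root = math.sqrt(16.0)

# Import a single item, then use it without the module prefix.
from math import floor
whole = floor(2.7)

print(root, whole)
```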

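If a module has changed, you can reload the new definition of the module using the imp.reload function. (The imp module is deprecated and was removed in Python 3.12; in Python 3.4 and later, importlib.reload does the same job.) The syntax is as follows: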

import module
import imp
imp.reload(module)

Replace module with the module you want to reload.

With imp.reload, always use parentheses, because it is a function. With import, do not use parentheses, because it is a statement.

Finding Modules

To import a module, the Python interpreter needs to find the module. With a module, the Python interpreter first looks for a file named module.py, where module is the name of the module you pass to the import statement. On finding a module, the Python interpreter will compile the module into a .pyc file. When you next import the module, the Python interpreter can load the pre-compiled module, speeding your Python scripts.

When you place an import statement in your scripts, the Python interpreter has to be able to find the module. The key point is that the Python interpreter only looks in a certain number of directories for your module. If you enter a name the Python interpreter cannot find, it will display an error, as shown in the following example:

>>> import foo
Traceback (most recent call last):
  File "<pyshell#12>", line 1, in <module>
    import foo
ImportError: No module named foo

The Python interpreter looks in the directories that are part of the module search path. These directories are listed in the sys.path variable from the sys module.

To list where the Python interpreter looks for modules, print out the value of the sys.path variable in the Python interpreter. For example:

>>> import sys
>>> print(sys.path)
['C:\Python31\Lib\idlelib', 'C:\Windows\system32\python31.zip',
'C:\Python31\DLLs', 'C:\Python31\lib',
'C:\Python31\lib\plat-win', 'C:\Python31',
'C:\Python31\lib\site-packages']
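The following sketch ties the two ideas together: it lists each entry on the search path and then shows that a failed import can be caught as an ImportError. The module name no_such_module_for_this_demo is hypothetical and is assumed not to exist on your system.

```python
import sys

# Show each directory (or archive) on the module search path.
for entry in sys.path:
    print(entry)

# Importing a name that is not on the search path raises ImportError.
# The module name below is hypothetical and assumed not to exist.
try:
    import no_such_module_for_this_demo
    found = True
except ImportError as err:
    found = False
    print('Import failed:', err)
```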

Digging through Modules

Because Python is an open-source package, you can get the source code to the Python interpreter as well as all modules. In fact, even with a binary distribution of Python, you'll find the source code for modules written in Python.

Start by looking in all the directories listed in the sys.path variable for files with names ending in .py. These are Python modules. Some modules contain functions, and others contain classes and functions. For example, the following module from the Python 3.0 distribution (parser.py in the email package) defines the Parser class:

"""A parser of RFC 2822 and MIME email messages."""
__all__ = ['Parser', 'HeaderParser']
import warnings
from io import StringIO
from email.feedparser import FeedParser
from email.message import Message
class Parser:
    def __init__(self, *args, **kws):
        """Parser of RFC 2822 and MIME email messages.
        Creates an in-memory object tree representing the email message, which
        can then be manipulated and turned over to a Generator to return the
        textual representation of the message.
        The string must be formatted as a block of RFC 2822 headers and header
        continuation lines, optionally preceded by a `Unix-from' header.  The
        header block is terminated either by the end of the string or by a
        blank line.
        _class is the class to instantiate for new message objects when they
        must be created.  This class must have a constructor that can take
        zero arguments.  Default is Message.Message.
        """
        if len(args) >= 1:
            if '_class' in kws:
                raise TypeError("Multiple values for keyword arg '_class'")
            kws['_class'] = args[0]
        if len(args) == 2:
            if 'strict' in kws:
                raise TypeError("Multiple values for keyword arg 'strict'")
                kws['strict'] = args[1]
        if len(args) > 2:
            raise TypeError('Too many arguments')
        if '_class' in kws:
            self._class = kws['_class']
            del kws['_class']
        else:
            self._class = Message
        if 'strict' in kws:
            warnings.warn("'strict' argument is deprecated (and ignored)",
                          DeprecationWarning, 2)
            del kws['strict']
        if kws:
            raise TypeError('Unexpected keyword arguments')
    def parse(self, fp, headersonly=False):
        """Create a message structure from the data in a file.
        Reads all the data from the file and returns the root of the message
        structure.  Optional headersonly is a flag specifying whether to stop
        parsing after reading the headers or not.  The default is False,
        meaning it parses the entire contents of the file.
        """
        feedparser = FeedParser(self._class)
        if headersonly:
            feedparser._set_headersonly()
        while True:
            data = fp.read(8192)
            if not data:
                break
            feedparser.feed(data)
        return feedparser.close()

    def parsestr(self, text, headersonly=False):
        """Create a message structure from a string.

        Returns the root of the message structure.  Optional headersonly is a
        flag specifying whether to stop parsing after reading the headers or
        not.  The default is False, meaning it parses the entire contents of
        the file.
        """
        return self.parse(StringIO(text), headersonly=headersonly)
class HeaderParser(Parser):
    def parse(self, fp, headersonly=True):
        return Parser.parse(self, fp, True)

    def parsestr(self, text, headersonly=True):
        return Parser.parsestr(self, text, True)

The majority of this small module is made up of documentation that instructs users how to use the module. Documentation is important.

When you look through the standard Python modules, you can get a feel for how modules are put together. It also helps when you want to create your own modules.

Creating Modules and Packages

Creating modules is easier than you might think. A module is merely a Python source file. In fact, any time you've created a Python file, you have already been creating modules without even knowing it.

The following example will help you get started creating modules.

Working with Classes

Most modules define a set of related functions or classes. A class, as introduced in Chapter 6, holds data as well as the methods that operate on that data. Python is a little looser than most programming languages, such as Java, C++, or C#, in that Python lets you break rules enforced in other languages. For example, Python, by default, lets you access data inside a class. This does violate some of the concepts of object-oriented programming but with good reason: Python aims first and foremost to be practical.

Defining Object-Oriented Programming

Computer geeks argue endlessly over what is truly object-oriented programming (OOP). Most experts, however, agree on the following three concepts:

  • Encapsulation

  • Inheritance

  • Polymorphism

Encapsulation is the idea that a class can hide the internal details and data necessary to perform a certain task. A class holds the necessary data, and you are not supposed to see that data under normal circumstances. Furthermore, a class provides a number of methods to operate on that data. These methods can hide the internal details, such as network protocols, disk access, and so on. Encapsulation is a technique to simplify your programs. At each step in creating your program, you can write code that concentrates on a single task. Encapsulation hides the complexity.

Inheritance means that a class can inherit, or gain access to, data and methods defined in a parent class. This just follows common sense in classifying a problem domain. For example, a rectangle and a circle are both shapes. In this case, the base class would be Shape. The Rectangle class would then inherit from Shape, as would the Circle class. Inheritance enables you to treat objects of both the Rectangle and Circle classes as children and members of the Shape class, meaning you can write more generic code in the base class, and become more specific in the children. (The terms children and child class, and membership in a class, are similar and can be used interchangeably here.) For the most part, the base class should be general and the subclasses specialized. Inheritance is often called specialization.

Polymorphism means that subclasses can override methods for more specialized behavior. For example, a rectangle and a circle are both shapes. You may define a set of common operations, such as move and draw, that should apply to all shapes. However, the draw method for a Circle will obviously be different than the draw method for a Rectangle. Polymorphism enables you to name both methods draw and then call these methods as if the Circle and the Rectangle were both Shapes, which they are, at least in this example.
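The shape discussion can be sketched in a few lines. The class names Shape, Rectangle, and Circle follow the text; the x and y attributes and the strings returned by draw are assumptions made for illustration.

```python
class Shape:
    '''Base class for the shapes in this sketch.'''

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def move(self, dx, dy):
        '''Shared by all shapes: shift the position.'''
        self.x += dx
        self.y += dy

    def draw(self):
        '''Subclasses override this with their own behavior.'''
        raise NotImplementedError


class Rectangle(Shape):
    def draw(self):
        return 'rectangle at ({0}, {1})'.format(self.x, self.y)


class Circle(Shape):
    def draw(self):
        return 'circle at ({0}, {1})'.format(self.x, self.y)


# The loop treats every object as a Shape; each class supplies its own draw.
shapes = [Rectangle(), Circle(2, 3)]
for shape in shapes:
    shape.move(1, 1)
    print(shape.draw())
```

The move method is inherited unchanged, while draw is overridden in each subclass, so the calling code never needs to know which kind of shape it holds.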

Creating Classes

As described in Chapter 6, creating classes is easy. (In fact, most things in Python are pleasantly easy.) The following example shows a simple class that represents a meal.
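A minimal sketch of such a class might look like the following. The method names match those used by the subclasses and the test function in this chapter; the default values and the wording printed by printIt are assumptions made for illustration.

```python
class Meal:
    '''Holds the food and drink used in a meal.'''

    def __init__(self, food='omelet', drink='coffee'):
        # The default values here are assumptions for illustration.
        self.name = 'generic meal'
        self.food = food
        self.drink = drink

    def printIt(self, prefix=''):
        '''Print the data nicely.'''
        print(prefix, 'A fine', self.name, 'with', self.food, 'and', self.drink)

    # Setter methods, so subclasses can restrict the allowed values.
    def setFood(self, food='omelet'):
        self.food = food

    def setDrink(self, drink='coffee'):
        self.drink = drink

    def setName(self, name=''):
        self.name = name


m = Meal('green eggs and ham', 'tea')
m.printIt()
```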

Extending Existing Classes

After you have defined a class, you can extend it by defining subclasses. For example, you can create a Breakfast class that represents the first meal of the day:

class Breakfast(Meal):
    '''Holds the food and drink for breakfast.'''

    def __init__(self):
        '''Initialize with an omelet and coffee.'''
        Meal.__init__(self, 'omelet', 'coffee')
        self.setName('breakfast')

The Breakfast class extends the Meal class as shown by the class definition:

class Breakfast(Meal):

Another subclass would naturally be Lunch:

class Lunch(Meal):
    '''Holds the food and drink for lunch.'''

    def __init__(self):
        '''Initialize with a sandwich and a gin and tonic.'''
        Meal.__init__(self, 'sandwich', 'gin and tonic')
        self.setName('midday meal')

    # Override setFood().
    def setFood(self, food='sandwich'):
        if food != 'sandwich' and food != 'omelet':
            raise AngryChefException
        Meal.setFood(self, food)

With the Lunch class, you can see some use for the setter methods. In the Lunch class, the setFood method allows only two values for the food: a sandwich and an omelet. Nothing else is allowed or you will make the chef angry.

The Dinner class also overrides a method — in this case, the printIt method:

class Dinner(Meal):
    '''Holds the food and drink for dinner.'''

    def __init__(self):
        '''Initialize with steak and merlot.'''
        Meal.__init__(self, 'steak', 'merlot')
        self.setName('dinner')

    def printIt(self, prefix=''):
        '''Print even more nicely.'''
        print(prefix,'A gourmet',self.name,'with',self.food,'and',self.drink)

Normally, you would place all these classes into a module. See the section "Creating a Whole Module" for an example of a complete module.

Finishing Your Modules

After defining the classes and functions that you want for your module, the next step is to finish the module to make it better fit into the conventions expected by Python users and the Python interpreter.

Finishing your module can include a lot of things, but at the very least you need to do the following:

  • Define the errors and exceptions that apply to your module.

  • Define which items in the module you want to export. This defines the public API for the module.

  • Document your module.

  • Test your module.

  • Provide a fallback function in case your module is executed as a program.

The following sections describe how to finish up your modules.

Defining Module-Specific Errors

Python defines a few standard exception classes, such as IOError and NotImplementedError. If those classes apply, by all means use them. Otherwise, you may need to define exceptions for specific issues that may arise when using your module. For example, a networking module may need to define a set of exceptions relating to network errors.

For the food-related theme used in the example module, you can define an AngryChefException. To make this more generic, and perhaps allow reuse in other modules, the AngryChefException is defined as a subclass of the more general SensitiveArtistException, representing issues raised by touchy artsy types.

In most cases, your exception classes will not need to define any methods or initialize any data. The base Exception class provides enough. For most exceptions, the mere presence of the exception indicates the problem.

This is not always true. For example, an XML-parsing exception should probably contain the line number where the error occurred, as well as a description of the problem.
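As a sketch of an exception that carries extra data, the following hypothetical ParseProblem class records a line number and a description. The class name and its attributes are assumptions for illustration and are not part of the meal module.

```python
class ParseProblem(Exception):
    '''Hypothetical exception that records where parsing failed.'''

    def __init__(self, line, description):
        # Pass a readable message to the base Exception class.
        Exception.__init__(self, 'line {0}: {1}'.format(line, description))
        self.line = line
        self.description = description


try:
    raise ParseProblem(42, 'unclosed tag')
except ParseProblem as err:
    message = str(err)
    print(message)
    print('Failed on line', err.line)
```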

You can define the exceptions for the meal module as follows:

class SensitiveArtistException(Exception):
    pass

class AngryChefException(SensitiveArtistException):
    pass

This is just an example, of course. In your modules, define exception classes as needed. In addition to exceptions, you should carefully decide what to export from your module.

Choosing What to Export

When you use the from form of importing a module, you can specify which items in the module to import. For example, the following statement imports the AngryChefException from the module meal:

from meal import AngryChefException

To import all public items from a module, you can use the following format:

from module_name import *

For example:

from meal import *

The asterisk, or star (*), tells the Python interpreter to import all public items from the module. What exactly is public? You, as the module designer, can choose to define whichever items you want to be exported as public.

The Python interpreter uses two methods to determine what should be considered public:

  • If you have defined the variable __all__ in your module, the interpreter uses __all__ to determine what should be public.

  • If you have not defined the variable __all__, the interpreter imports everything except items with names that begin with an underscore, _, so printIt would be considered public, but _printIt would not.

See Chapter 7 for more information about modules and the import statement.

As a best practice, always define __all__ in your modules. This provides you with explicit control over what other Python scripts can import. To do this, simply create a sequence of text strings with the names of each item you want to export from your module. For example, in the meal module, you can define __all__ in the following manner:

__all__ = ['Meal', 'AngryChefException', 'makeBreakfast',
    'makeLunch', 'makeDinner', 'Breakfast', 'Lunch', 'Dinner']

Each name in this sequence names a class or function to export from the module.
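To see both rules in action without creating files on disk, the following sketch builds two throwaway modules in memory and reports what a star import brings in from each. The star_import helper and the module names demo_with_all and demo_without_all are hypothetical, invented for this illustration.

```python
import sys
import types


def star_import(module_name, source):
    '''Build a throwaway module from source and return the names
    that "from module_name import *" would bring in.'''
    module = types.ModuleType(module_name)
    exec(source, module.__dict__)
    sys.modules[module_name] = module   # make it importable by name
    namespace = {}
    exec('from {0} import *'.format(module_name), namespace)
    # exec adds __builtins__ to the namespace; leave it out of the report.
    return sorted(n for n in namespace if n != '__builtins__')


body = '''
def printIt(): pass
def _printIt(): pass
def helper(): pass
'''

with_all = star_import('demo_with_all', "__all__ = ['printIt']\n" + body)
without_all = star_import('demo_without_all', body)

print(with_all)      # only the names listed in __all__
print(without_all)   # every name that lacks a leading underscore
```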

Choosing what to export is important. When you create a module, you are creating an API to perform some presumably useful function. The API you export from a module then defines what users of your module can do. You want to export enough for users of the module to get their work done, but you don't have to export everything. You may want to exclude items for a number of reasons, including the following:

  • Items you are likely to change should remain private until you have settled on the API for those items. This gives you the freedom to make changes inside the module without impacting users of the module.

  • Modules can oftentimes hide, on purpose, complicated code. For example, an e-mail module can hide the gory details of SMTP, POP3, and IMAP network e-mail protocols. Your e-mail module could present an API that enables users to send messages, see which messages are available, download messages, and so on.

Hiding the gory details of how your code is implemented is called encapsulation. Impress your friends with lines like "making the change you are asking for would violate the rules of encapsulation ..."

Always define, explicitly, what you want to export from a module. You should also always document your modules.

Documenting Your Modules

It is vitally important that you document your modules. If not, no one, not even you, will know what your modules do. Think ahead six months. Will you remember everything that went into your modules? Probably not. The solution is simple: document your modules.

Python defines a few easy conventions for documenting your modules. Follow these conventions and your modules will enable users to view the documentation in the standard way. At its most basic, for each item you want to document, write a text string that describes the item. Enclose this text string in three quotes, and place it immediately inside the item.

For example, to document a method or function, use the following code as a guide:

def makeLunch():
    '''Creates a Lunch object.'''
    return Lunch()

The line in triple quotes shows the documentation. The documentation appears right after the function is defined with the def statement.

Document a class similarly:

class Meal:
    '''Holds the food and drink used in a meal.
    In true object-oriented tradition, this class
    includes setter methods for the food and drink.

    Call printIt to pretty-print the values.
    '''

Place the documentation on the line after the class statement.

Exceptions are classes, too. Document them as well:

class SensitiveArtistException(Exception):
    '''Exception raised by an overly-sensitive artist.

    Base class for artistic types.'''
    pass

Note that even though this class adds no new functionality, you should describe the purpose of each exception or class.

In addition, document the module itself. Start your module with the special three-quoted text string, as shown here:

"""
Module for making meals in Python.

Import this module and then call
makeBreakfast(), makeDinner() or makeLunch().

"""

Place this documentation on the first line of the text file that contains the module. For modules, start with one line that summarizes the purpose of the module. Separate this line from the remaining lines of the documentation, using a blank line as shown previously. The Python help function will extract the one-line summary and treat it specially. (See the following Try It Out example for more details about how to call the help function.)

Usually, one or two lines per class, method, or function should suffice. In general, your documentation should tell the user the following:

  • How to call the function or method, including what parameters are necessary and what type of data will be returned. Describe default values for parameters.

  • What a given class was designed for, or its purpose. Include how to use objects of the class.

  • Any conditions that must exist prior to calling a function or method.

  • Any side effects or other parts of the system that will change as a result of the call. For example, a method to erase all of the files on a disk should be documented as to what it does.

  • Exceptions that may be raised and under what conditions these exceptions will be raised.

Note that some people go way overboard in writing documentation. Too much documentation doesn't help, but don't use this as an excuse to do nothing. Too much documentation is far better than none at all.

A good rule of thumb comes from enlightened self-interest. Ask yourself what you would like to see in someone else's module and document to that standard.

You can view the documentation you write using the help function, as shown in the following example.
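A minimal sketch of reading documentation from code follows. The makeSnack function is hypothetical and exists only to carry a documentation string; inspect.getdoc is the standard library function that returns that string with its indentation cleaned up.

```python
import inspect


def makeSnack():
    '''Creates a snack description.

    Hypothetical function used only to show how docstrings are read.
    '''
    return 'apple'


# The raw documentation string is stored on the object itself.
raw_doc = makeSnack.__doc__

# inspect.getdoc returns the same text with indentation cleaned up.
clean_doc = inspect.getdoc(makeSnack)
print(clean_doc)

# In the interactive interpreter, help(makeSnack) displays this text too.
```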

Testing Your Module

Testing is hard. Testing is yucky. That's why testing is often skipped. Even so, testing your module can verify that it works. More important, creating tests enables you to make changes to your module and then verify that the functionality still works.

Any self-respecting module should include a test function that exercises the functionality in the module. Your tests should create instances of the classes defined in the module, and call methods on those instances.

For example, the following function provides a test of the meal module. (Note that this will not work unless the module also contains the Dinner class, which is defined in the section "Extending Existing Classes.")

def test():
    '''Test function.'''

    print('Module meal test.')

    # Generic no arguments.
    print('Testing Meal class.')
    m = Meal()
    m.printIt("\t")

    m = Meal('green eggs and ham', 'tea')
    m.printIt("\t")

    # Test breakfast
    print('Testing Breakfast class.')
    b = Breakfast()
    b.printIt("\t")

    b.setName('breaking of the fast')
    b.printIt("\t")

    # Test dinner
    print('Testing Dinner class.')
    d = Dinner()
    d.printIt("\t")

    # Test lunch
    print('Testing Lunch class.')
    l = Lunch()
    l.printIt("\t")

    print('Calling Lunch.setFood().')
    try:
        l.setFood('hotdog')
    except AngryChefException:
        print("\t",'The chef is angry. Pick an omelet.')

Make your test functions part of your modules, so the tests are always available. You learn more about testing in Python in Chapter 12.

Testing is never finished. You can always add more tests. Just do what you can.

Running a Module as a Program

Normally, modules aren't intended to be run on their own. Instead, other Python scripts import items from a module and then use those items. However, because a module can be any file of Python code, you can indeed run a module.

Because modules aren't meant to be run on their own, Python defines a convention for modules. When a module is run on its own, it should execute the module tests. This provides a simple means to test your modules: Just run the module as a Python script.

To help with this convention, Python provides a handy idiom to detect whether your module is run as a program. Using the test function shown previously, you can use the following code to execute your module tests:

if __name__ == '__main__':
    test()

If you look at the source code for the standard Python modules, you'll find this idiom used repeatedly.
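The idiom works because every module has a __name__ attribute: it holds the module's own name when the module is imported, and the string '__main__' when the file is run directly. The following sketch shows both sides; the test function here is a stand-in that returns a string rather than the full test shown earlier.

```python
import math

# An imported module reports its own name.
print(math.__name__)


def test():
    '''Stand-in for a real module test function.'''
    return 'tests ran'


# This block runs only when the file is executed as a program,
# not when another script imports it.
if __name__ == '__main__':
    print(test())
```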

The next example runs the meal module, created in the section "Creating a Whole Module."

Creating a Whole Module

The sections in this chapter so far show the elements you need to include in the modules you create. The following example shows a complete module using the techniques described so far.

The meal module doesn't do much. It supposedly models a domain that includes food and drink over three daily meals.

Obviously, this module doesn't support Hobbits, who require more than three meals a day.

The code in this module is purposely short. The intent is not to perform a useful task but instead to show how to put together a module.

Installing Your Modules

The Python interpreter looks for modules in the directories listed in the sys.path variable. The sys.path variable includes the current directory, so you can always use modules available locally. If you want to use a module you've written in multiple scripts, or on multiple systems, however, you need to install it into one of the directories listed in the sys.path variable.

In most cases, you'll want to place your Python modules in the site-packages directory. Look in the sys.path listing and find a directory name ending in site-packages. This is a directory for packages installed at a site that are not part of the Python standard library of packages.

In addition to modules, you can create packages of modules, a set of related modules that install into the same directory structure. See the Python documentation at http://docs.python.org for more on this subject.

You can install your modules using one of three mechanisms:

  • You can do everything by hand and manually create an installation script or program.

  • You can create an installer specific to your operating system, such as MSI files on Windows, an RPM file on Linux, or a DMG file on Mac OS X.

  • You can use the handy Python distutils package, short for distribution utilities, to create a Python-based installer.

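To use the Python distutils, you need to create a setup script, named setup.py. (The distutils package was deprecated in Python 3.10 and removed in Python 3.12; on newer versions, the setuptools package provides a compatible setup function.) A minimal setup script can include the following: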

from distutils.core import setup

setup(name='NameOfModule',
      version='1.0',
      py_modules=['NameOfModule'],
      )

You need to include the name of the module twice. Replace NameOfModule with the name of your module, such as meal in the examples in this chapter.

Name the script setup.py.

After you have created the setup.py script, you can create a distribution of your module using the following command:

python setup.py sdist

The argument sdist is short for source distribution. You can try this out with the following example.

Summary

This chapter pulls together concepts from the earlier chapters to delve into how to create modules by example. If you follow the techniques described in this chapter, your modules will fit in with other modules and follow the important Python conventions.

A module is simply a Python source file that you choose to treat as a module. Simple as that sounds, you need to follow a few conventions when creating a module:

  • Document the module and all classes, methods, and functions in the module.

  • Test the module and include at least one test function.

  • Define which items in the module to export — which classes, functions, and so on.

  • Create any exception classes you need for the issues that can arise when using the module.

  • Handle the situation in which the module itself is executed as a Python script.

Inside your modules, you'll likely define classes, which Python makes exceedingly easy.

While developing your module, you can use the help and reload functions to display documentation on your module (or any other module for that matter) and reload the changed module, respectively.

After you have created a module, you can create a distributable bundle of the module using the distutils. To do this, you need to create a setup.py script.

Chapter 11 describes regular expressions, an important concept used for finding relevant information in a sea of data.

The key things to take away from this chapter are:

  • Modules are Python source files. Like functions, modules are pieces of code that are reusable and save programmers coding time. They also make your programs less error prone, as modules are typically used over and over and have been thoroughly tested.

  • You can use the dir() function to view attributes of modules, such as functions, classes, and variables.

  • To use a module in a program, you must import it using import. You can also import a class or function from a module by using the code from module import item.

  • Python looks for module files in specific places. To see where Python searches, import sys and call print(sys.path) to view the directories.

  • Object-oriented programming consists of encapsulation, inheritance, and polymorphism.

  • Use triple quotes (''') to document objects in your modules. The first set of triple quotes begins the comment; the second set ends the comment.

  • To print the documentation in a module, you can use the help() function (i.e., help(modulename)).

  • You should always make test functions in your module in case you need them at a later date.

Exercises

  1. How can you get access to the functionality provided by a module?

  2. How can you control which items from your modules are considered public? (Public items are available to other Python scripts.)

  3. How can you view documentation on a module?

  4. How can you find out what modules are installed on a system?

  5. What kind of Python commands can you place in a module?
