2
MODULES, LIBRARIES, AND FRAMEWORKS

Modules are an essential part of what makes Python extensible. Without them, Python would just be a language built around a monolithic interpreter; it wouldn’t flourish within a giant ecosystem that allows developers to build applications quickly and simply by combining extensions. In this chapter, I’ll introduce you to some of the features that make Python modules great, from the built-in modules you need to know to externally managed frameworks.

The Import System

To use modules and libraries in your programs, you have to import them using the import keyword. As an example, Listing 2-1 imports the all-important Zen of Python guidelines.

>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Listing 2-1: The Zen of Python

The import system is quite complex, and I’m assuming you already know the basics, so here I’ll show you some of the internals of this system, including how the sys module works, how to change or add import paths, and how to use custom importers.

First, you need to know that the import keyword is actually a wrapper around a function named __import__. Here is a familiar way of importing a module:

>>> import itertools
>>> itertools
<module 'itertools' from '/usr/.../>

This is precisely equivalent to this method:

>>> itertools = __import__("itertools")
>>> itertools
<module 'itertools' from '/usr/.../>

You can also imitate the as keyword of import, as these two equivalent ways of importing show:

>>> import itertools as it
>>> it
<module 'itertools' from '/usr/.../>

And here’s the second example:

>>> it = __import__("itertools")
>>> it
<module 'itertools' from '/usr/.../>

While import is a keyword in Python, internally it’s a simple function that’s accessible through the __import__ name. The __import__ function is extremely useful to know, as in some (corner) cases, you might want to import a module whose name is unknown beforehand, like so:

>>> random = __import__("RANDOM".lower())
>>> random
<module 'random' from '/usr/.../>

Don’t forget that modules, once imported, are essentially objects whose attributes (classes, functions, variables, and so on) are objects.
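
As a quick illustration, you can combine a dynamic import with getattr to fetch a function whose name is only known at runtime. This is only a sketch: importlib.import_module is the documented, higher-level counterpart of __import__, and the module and attribute names here are arbitrary.

>>> import importlib
>>> module = importlib.import_module("math")
>>> getattr(module, "sqrt")(2)
1.4142135623730951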

The sys Module

The sys module provides access to variables and functions related to Python itself and the operating system it is running on. This module also contains a lot of information about Python’s import system.

First of all, you can retrieve the list of currently imported modules using the sys.modules variable, a dictionary that maps the name of each loaded module to the corresponding module object. For example, once the os module is imported, we can retrieve it by entering:

>>> import sys
>>> import os
>>> sys.modules['os']
<module 'os' from '/usr/lib/python2.7/os.pyc'>

The sys.modules variable is a standard Python dictionary that contains all loaded modules. That means that calling sys.modules.keys(), for example, will return the complete list of the names of loaded modules.

You can also retrieve the list of built-in modules by using the sys.builtin_module_names variable. The set of modules compiled into your interpreter can vary depending on the compilation options passed to the Python build system.
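
For example, on a typical CPython build, you can check whether a module is baked into the interpreter or was loaded from a file:

>>> import sys, os
>>> 'sys' in sys.builtin_module_names
True
>>> 'os' in sys.builtin_module_names
False
>>> 'os' in sys.modules
True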

Import Paths

When importing modules, Python relies on a list of paths to know where to look for the module. This list is stored in the sys.path variable. To check which paths your interpreter will search for modules, just enter sys.path.

You can change this list, adding or removing paths as necessary, or even modify the PYTHONPATH environment variable to add paths without writing Python code at all. Adding paths to the sys.path variable can be useful if you want to install Python modules to nonstandard locations, such as a test environment. In normal operations, however, it should not be necessary to change the path variable. The following approaches are almost equivalent—almost because the path will not be placed at the same level in the list; this difference may not matter, depending on your use case:

>>> import sys
>>> sys.path.append('/foo/bar')

This would be (almost) the same as:

$ PYTHONPATH=/foo/bar python
>>> import sys
>>> '/foo/bar' in sys.path
True

It’s important to note that the list will be iterated over to find the requested module, so the order of the paths in sys.path is important. It’s useful to put the path most likely to contain the modules you are importing early in the list to speed up search time. Doing so also ensures that if two modules with the same name are available, the first match will be picked.

This last property is especially important because one common mistake is to shadow Python built-in modules with your own. Your current directory is searched before the Python Standard Library directory. That means that if you decide to name one of your scripts random.py and then try using import random, the file from your current directory will be imported rather than the Python module.
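
If you do need a specific directory to win the search, insert it at the front of the list rather than appending it. Here is a minimal sketch, using a hypothetical path:

>>> import sys
>>> sys.path.insert(0, '/opt/mylibs')
>>> sys.path[0]
'/opt/mylibs'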

Custom Importers

You can also extend the import mechanism using custom importers. This is the technique that the Lisp-Python dialect Hy uses to teach Python how to import files other than standard .py or .pyc files. (Hy is a Lisp implementation on top of Python, discussed later in the section “A Quick Introduction to Hy” on page 145.)

The import hook mechanism, as this technique is called, is defined by PEP 302. It allows you to extend the standard import mechanism, which in turn allows you to modify how Python imports modules and build your own system of import. For example, you could write an extension that imports modules from a database over the network or that does some sanity checking before importing any module.

Python offers two different but related ways to broaden the import system: the meta path finders for use with sys.meta_path and the path entry finders for use with sys.path_hooks.

Meta Path Finders

A meta path finder is an object that allows Python to load modules from custom sources as well as from standard .py files. A meta path finder object must expose a find_module(fullname, path=None) method that returns a loader object. The loader object must also have a load_module(fullname) method responsible for loading the module from a source file.

To illustrate, Listing 2-2 shows how Hy uses a custom meta path finder to enable Python to import source files ending with .hy instead of .py.

import os
import sys


class MetaImporter(object):
    def find_on_path(self, fullname):
        fls = ["%s/__init__.hy", "%s.hy"]
        dirpath = "/".join(fullname.split("."))

        for pth in sys.path:
            pth = os.path.abspath(pth)
            for fp in fls:
                composed_path = fp % ("%s/%s" % (pth, dirpath))
                if os.path.exists(composed_path):
                    return composed_path

    def find_module(self, fullname, path=None):
        path = self.find_on_path(fullname)
        if path:
            return MetaLoader(path)

sys.meta_path.append(MetaImporter())

Listing 2-2: A Hy module importer

Once Python has determined that the path is valid and that it points to a module, a MetaLoader object is returned, as shown in Listing 2-3.

class MetaLoader(object):
    def __init__(self, path):
        self.path = path

    def is_package(self, fullname):
        dirpath = "/".join(fullname.split("."))
        for pth in sys.path:
            pth = os.path.abspath(pth)
            composed_path = "%s/%s/__init__.hy" % (pth, dirpath)
            if os.path.exists(composed_path):
                return True
        return False

    def load_module(self, fullname):
        if fullname in sys.modules:
            return sys.modules[fullname]

        if not self.path:
            return

        sys.modules[fullname] = None
        mod = import_file_to_module(fullname, self.path)

        ispkg = self.is_package(fullname)

        mod.__file__ = self.path
        mod.__loader__ = self
        mod.__name__ = fullname

        if ispkg:
            mod.__path__ = []
            mod.__package__ = fullname
        else:
            mod.__package__ = fullname.rpartition('.')[0]

        sys.modules[fullname] = mod
        return mod

Listing 2-3: A Hy module loader object

Inside load_module, the import_file_to_module function reads a .hy source file, compiles it to Python code, and returns a Python module object.

This loader is pretty straightforward: once the .hy file is found, it’s passed to this loader, which compiles the file if necessary, registers it, sets some attributes, and then returns it to the Python interpreter.

The uprefix module is another good example of this feature in action. Python 3.0 through 3.2 didn’t support the u prefix for denoting Unicode strings that was featured in Python 2; the uprefix module ensures compatibility between Python versions 2 and 3 by removing the u prefix from strings before compilation.

Useful Standard Libraries

Python comes with a huge standard library packed with tools and features for almost any purpose you can think of. Newcomers to Python who are used to having to write their own functions for basic tasks are often shocked to find that the language itself ships with so much functionality built in and ready for use.

Whenever you’re tempted to write your own function to handle a simple task, first stop and look through the standard library. In fact, skim through the whole thing at least once before you begin working with Python so that next time you need a function, you have an idea of whether it already exists in the standard library.

We’ll talk about some of these modules, such as functools and itertools, in later chapters, but here are a few of the standard modules that you’ll definitely find useful:

  • atexit allows you to register functions for your program to call when it exits.

  • argparse provides functions for parsing command line arguments.

  • bisect provides bisection algorithms for searching and inserting into sorted lists (see Chapter 10).

  • calendar provides a number of date-related functions.

  • codecs provides functions for encoding and decoding data.

  • collections provides a variety of useful data structures.

  • copy provides functions for copying data.

  • csv provides functions for reading and writing CSV files.

  • datetime provides classes for handling dates and times.

  • fnmatch provides functions for matching Unix-style filename patterns.

  • concurrent provides asynchronous computation (native in Python 3, available for Python 2 via PyPI).

  • glob provides functions for matching Unix-style path patterns.

  • io provides functions for handling I/O streams. In Python 3, it also contains StringIO (inside the module of the same name in Python 2), which allows you to treat strings as files.

  • json provides functions for reading and writing data in JSON format.

  • logging provides access to Python’s own built-in logging functionality.

  • multiprocessing allows you to run multiple subprocesses from your application, while providing an API that makes them look like threads.

  • operator provides functions implementing the basic Python operators, which you can use instead of having to write your own lambda expressions (see Chapter 10).

  • os provides access to basic OS functions.

  • random provides functions for generating pseudorandom numbers.

  • re provides regular expression functionality.

  • sched provides an event scheduler without using multithreading.

  • select provides access to the select() and poll() functions for creating event loops.

  • shutil provides access to high-level file functions.

  • signal provides functions for handling POSIX signals.

  • tempfile provides functions for creating temporary files and directories.

  • threading provides access to high-level threading functionality.

  • urllib (and urllib2 and urlparse in Python 2.x) provides functions for handling and parsing URLs.

  • uuid allows you to generate Universally Unique Identifiers (UUIDs).

Use this list as a quick reference for what these useful library modules do. If you can memorize even part of this list, all the better. The less time you have to spend looking up library modules, the more time you can spend writing the code you actually need.
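
As a small taste of the list above, here is a sketch that exercises three of these modules together; the data is made up purely for illustration:

import bisect
import uuid
from collections import Counter

# collections.Counter tallies hashable items, here word frequencies.
words = "the quick brown fox jumps over the lazy dog the end".split()
print(Counter(words).most_common(1))    # [('the', 3)]

# bisect keeps a list sorted as you insert into it.
scores = [10, 20, 30, 40]
bisect.insort(scores, 25)
print(scores)                           # [10, 20, 25, 30, 40]

# uuid.uuid4() generates a random Universally Unique Identifier.
print(uuid.uuid4())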

Most of the standard library is written in Python, so there’s nothing stopping you from looking at the source code of the modules and functions. When in doubt, crack open the code and see what it does for yourself. Even if the documentation has everything you need to know, there’s always a chance you could learn something useful.

External Libraries

Python’s “batteries included” philosophy is that, once you have Python installed, you should have everything you need to build whatever you want. This is to prevent the programming equivalent of unwrapping an awesome gift only to find out that whoever gave it to you forgot to buy batteries for it.

Unfortunately, there’s no way the people behind Python can predict everything you might want to make. And even if they could, most people wouldn’t want to deal with a multigigabyte download, especially if they just wanted to write a quick script for renaming files. So even with its extensive functionality, the Python Standard Library doesn’t cover everything. Luckily, members of the Python community have created external libraries.

The Python Standard Library is safe, well-charted territory: its modules are heavily documented, and enough people use it on a regular basis that you can feel assured it won’t break messily when you give it a try—and in the unlikely event that it does break, you can be confident someone will fix it in short order. External libraries, on the other hand, are the parts of the map labeled “here there be dragons”: documentation may be sparse, functionality may be buggy, and updates may be sporadic or even nonexistent. Any serious project will likely need functionality that only external libraries can provide, but you need to be mindful of the risks involved in using them.

Here’s a tale of external library dangers from the trenches. OpenStack uses SQLAlchemy, a database toolkit for Python. If you’re familiar with SQL, you know that database schemas can change over time, so OpenStack also made use of sqlalchemy-migrate to handle schema migration needs. And it worked . . . until it didn’t. Bugs started piling up, and nothing was getting done about them. At this time, OpenStack was also interested in supporting Python 3, but there was no sign that sqlalchemy-migrate was moving toward Python 3 support. It was clear by that point that sqlalchemy-migrate was effectively dead for our needs and we needed to switch to something else—our needs had outlived the capabilities of the external library. At the time of this writing, OpenStack projects are migrating toward using Alembic instead, a new SQL database migrations tool with Python 3 support. This is happening not without some effort, but fortunately without much pain.

The External Libraries Safety Checklist

All of this builds up to one important question: how can you be sure you won’t fall into this external libraries trap? Unfortunately, you can’t: programmers are people, too, and there’s no way you can know for sure whether a library that’s zealously maintained today will still be in good shape in a few months. However, using such libraries may be worth the risk; it’s just important to carefully assess your situation. At OpenStack, we use the following checklist when choosing whether to use an external library, and I encourage you to do the same.

Python 3 compatibility Even if you’re not targeting Python 3 right now, odds are good that you will somewhere down the line, so it’s a good idea to check that your chosen library is already Python 3–compatible and committed to staying that way.

Active development GitHub and Ohloh usually provide enough information to determine whether a given library is being actively developed by its maintainers.

Active maintenance Even if a library is considered finished (that is, feature complete), the maintainers should be ensuring it remains bug-free. Check the project’s tracking system to see how quickly the maintainers respond to bugs.

Packaged with OS distributions If a library is packaged with major Linux distributions, that means other projects are depending on it—so if something goes wrong, you won’t be the only one complaining. It’s also a good idea to check this if you plan to release your software to the public: your code will be easier to distribute if its dependencies are already installed on the end user’s machine.

API compatibility commitment Nothing’s worse than having your software suddenly break because a library it depends on has changed its entire API. You might want to check whether your chosen library has had anything like this happen in the past.

License You need to make sure that the license is compatible with the software you’re planning to write and that it allows you to do whatever you intend to do with your code in terms of distribution, modification, and execution.

Applying this checklist to dependencies is also a good idea, though that could turn out to be a huge undertaking. As a compromise, if you know your application is going to depend heavily on a particular library, you should apply this checklist to each of that library’s dependencies.

Protecting Your Code with an API Wrapper

No matter what libraries you end up using, you need to treat them as useful devices that could potentially do some serious damage. For safety, libraries should be treated like any physical tool: kept in your tool shed, away from your fragile valuables but available when you actually need them.

No matter how useful an external library might be, be wary of letting it get its hooks into your actual source code. Otherwise, if something goes wrong and you need to switch libraries, you might have to rewrite huge swaths of your program. A better idea is to write your own API—a wrapper that encapsulates your external libraries and keeps them out of your source code. Your program never has to know what external libraries it’s using, only what functionality your API provides. Then, if you need to use a different library, all you have to change is your wrapper. As long as the new library provides the same functionality, you won’t have to touch the rest of your codebase at all. There might be exceptions, but probably not many; most libraries are designed to solve a tightly focused range of problems and can therefore be easily isolated.
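
To make the idea concrete, here is a minimal sketch of such a wrapper. It assumes the external requests library, and the module and function names (http_client, fetch_url) are invented for the example:

# http_client.py -- the only module allowed to import the external library.
import requests


def fetch_url(url, timeout=10):
    """Return the body of url as text.

    The rest of the codebase calls this wrapper only; if requests ever
    has to be replaced, this module is the single place to change.
    """
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text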

Later in Chapter 5, we’ll also look at how you can use entry points to build driver systems that will allow you to treat parts of your projects as modules you can switch out at will.

Package Installation: Getting More from pip

The pip project offers a really simple way to handle package and external library installations. It is actively developed, well maintained, and included with Python starting at version 3.4. It can install or uninstall packages from the Python Packaging Index (PyPI), a tarball, or a Wheel archive (we’ll discuss these in Chapter 5).

Its usage is simple:

$ pip install --user voluptuous
Downloading/unpacking voluptuous
  Downloading voluptuous-0.8.3.tar.gz
  Storing download in cache at ./.cache/pip/https%3A%2F%2Fpypi.python.org%2Fpa
ckages%2Fsource%2Fv%2Fvoluptuous%2Fvoluptuous-0.8.3.tar.gz
  Running setup.py egg_info for package voluptuous

Requirement already satisfied (use --upgrade to upgrade): distribute in /usr/
lib/python2.7/dist-packages (from voluptuous)
Installing collected packages: voluptuous
  Running setup.py install for voluptuous

Successfully installed voluptuous
Cleaning up...

pip install can install any package by looking it up on PyPI, the distribution index where anyone can upload a package for distribution and installation by others.

You can also provide a --user option that makes pip install the package in your home directory. This avoids polluting your operating system directories with packages installed system-wide.

You can list the packages you already have installed using the pip freeze command, like so:

$ pip freeze
Babel==1.3
Jinja2==2.7.1
commando==0.3.4
--snip--

Uninstalling packages is also supported by pip, using the uninstall command:

$ pip uninstall pika-pool
Uninstalling pika-pool-0.1.3:
  /usr/local/lib/python2.7/site-packages/pika_pool-0.1.3.dist-info/
DESCRIPTION.rst
  /usr/local/lib/python2.7/site-packages/pika_pool-0.1.3.dist-info/INSTALLER
  /usr/local/lib/python2.7/site-packages/pika_pool-0.1.3.dist-info/METADATA

--snip--
Proceed (y/n)? y
  Successfully uninstalled pika-pool-0.1.3

One very valuable feature of pip is its ability to install a package without copying the package’s files. The typical use case for this feature is when you’re actively working on a package and want to avoid the long and boring process of reinstalling it each time you need to test a change. This can be achieved by using the -e <directory> flag:

$ pip install -e .
Obtaining file:///Users/jd/Source/daiquiri
Installing collected packages: daiquiri
  Running setup.py develop for daiquiri
Successfully installed daiquiri

Here, pip does not copy the files from the local source directory but places a special file, called an egg-link, in your distribution path. For example:

$ cat /usr/local/lib/python2.7/site-packages/daiquiri.egg-link
/Users/jd/Source/daiquiri

The egg-link file contains the path to add to sys.path to look for packages. The result can be easily checked by running the following command:

$ python -c "import sys; print('/Users/jd/Source/daiquiri' in sys.path)"
True

The -e option of pip install is also helpful for deploying code from repositories of various version control systems: git, Mercurial, Subversion, and even Bazaar are supported. For example, you can install a library directly from a git repository by passing its address as a URL after the -e option:

$ pip install -e git+https://github.com/jd/daiquiri.git#egg=daiquiri
Obtaining daiquiri from git+https://github.com/jd/daiquiri.git#egg=daiquiri
  Cloning https://github.com/jd/daiquiri.git to ./src/daiquiri
Installing collected packages: daiquiri
  Running setup.py develop for daiquiri
Successfully installed daiquiri

For the installation to work correctly, you need to provide the package egg name by adding #egg= at the end of the URL. Then, pip just uses git clone to clone the repository inside a src/<eggname> directory and creates an egg-link file pointing to that same cloned directory.

This mechanism is extremely handy when depending on unreleased versions of libraries or when working in a continuous testing system. However, since there is no versioning behind it, the -e option can also be very nasty. You cannot know in advance that the next commit in this remote repository is not going to break everything.
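
One way to reduce that risk, if you are certain you need to install from a repository, is to pin the URL to a specific tag or commit rather than the branch head (the tag shown here is hypothetical):

$ pip install -e git+https://github.com/jd/daiquiri.git@1.3.0#egg=daiquiri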

Finally, all other installation tools are being deprecated in favor of pip, so you can confidently treat it as your one-stop shop for all your package management needs.

Using and Choosing Frameworks

Python has a variety of frameworks available for various kinds of Python applications: if you’re writing a web application, you could use Django, Pylons, TurboGears, Tornado, Zope, or Plone; if you’re looking for an event-driven framework, you could use Twisted or Circuits; and so on.

The main difference between frameworks and external libraries is that applications use frameworks by building on top of them: your code will extend the framework rather than vice versa. Unlike a library, which is basically an add-on you can bring in to give your code some extra oomph, a framework forms the chassis of your code: everything you do builds on that chassis in some way. This can be a double-edged sword. There are plenty of upsides to using frameworks, such as rapid prototyping and development, but there are also some noteworthy downsides, such as lock-in. You need to take these considerations into account when you decide whether to use a framework.

The recommendations for what to check when choosing the right framework for your Python application are largely the same as those described in “The External Libraries Safety Checklist” on page 23—which makes sense, as frameworks are distributed as bundles of Python libraries. Sometimes frameworks also include tools for creating, running, and deploying applications, but that doesn’t change the criteria you should apply. We’ve established that replacing an external library after you’ve already written code that makes use of it is a pain, but replacing a framework is a thousand times worse, usually requiring a complete rewrite of your program from the ground up.

To give an example, the Twisted framework mentioned earlier still doesn’t have full Python 3 support: if you wrote a program using Twisted a few years back and wanted to update it to run on Python 3, you’d be out of luck. Either you’d have to rewrite your entire program to use a different framework, or you’d have to wait until someone finally gets around to upgrading Twisted with full Python 3 support.

Some frameworks are lighter than others. For example, Django has its own built-in ORM functionality; Flask, on the other hand, has nothing of the sort. The less a framework tries to do for you, the fewer problems you’ll have with it in the future. However, each feature a framework lacks is another problem for you to solve, either by writing your own code or going through the hassle of handpicking another library to handle it. It’s your choice which scenario you’d rather deal with, but choose wisely: migrating away from a framework when things go sour can be a Herculean task, and even with all its other features, there’s nothing in Python that can help you with that.

Doug Hellmann, Python Core Developer, on Python Libraries

Doug Hellmann is a senior developer at DreamHost and a fellow contributor to the OpenStack project. He launched the website Python Module of the Week (http://www.pymotw.com/) and has written an excellent book called The Python Standard Library by Example. He is also a Python core developer. I’ve asked Doug a few questions about the Standard Library and designing libraries and applications around it.

When you start writing a Python application from scratch, what’s your first move?

The steps for writing an application from scratch are similar to hacking an existing application, in the abstract, but the details change.

When I change existing code, I start by figuring out how it works and where my changes would need to go. I may use some debugging techniques: adding logging or print statements, or using pdb, and running the app with test data to make sure I understand what it’s doing. I usually make the change and test it by hand, then add any automated tests before contributing a patch.

I take the same exploratory approach when I create a new application—create some code and run it by hand, and then once I have the basic functionality working, I write tests to make sure I’ve covered all of the edge cases. Creating the tests may also lead to some refactoring to make the code easier to work with.

That was definitely the case with smiley [a tool for spying on your Python programs and recording their activities]. I started by experimenting with Python’s trace API, using some throwaway scripts, before building the real application. Originally, I planned to have one piece to instrument and collect data from another running application, and another to collect the data sent over the network and save it. While adding a couple of reporting features, I realized that the processing for replaying the collected data was almost identical to the processing for collecting it in the first place. I refactored a few classes and was able to create a base class for the data collection, database access, and report generator. Making those classes conform to the same API allowed me to easily create a version of the data collection app that wrote directly to the database instead of sending information over the network.

While designing an app, I think about how the user interface works, but for libraries, I focus on how a developer will use the API. It can also be easier to write the tests for programs that will use the new library first, then the library code. I usually create a series of example programs in the form of tests and then build the library to work that way.

I’ve also found that writing documentation for a library before writing any code helps me think through the features and workflows without committing to the implementation details, and it lets me record the choices I made in the design so the reader understands not just how to use the library but the expectations I had while creating it.

What’s the process for getting a module into the Python Standard Library?

The full process and guidelines for submitting a module into the standard library can be found in the Python Developer’s Guide at https://docs.python.org/devguide/stdlibchanges.html.

Before a module can be added, the submitter needs to prove that it’s stable and widely useful. The module should provide something that is either hard to implement correctly on your own or so useful that many developers have created their own variations. The API should be clear, and any module dependencies should be inside the Standard Library only.

The first step would be to run the idea of introducing the module into the standard library by the community via the python-ideas list to informally gauge the level of interest. Assuming the response is positive, the next step is to create a Python Enhancement Proposal (PEP), which should include the motivation for adding the module and implementation details of how the transition will happen.

Because package management and discovery tools have become so reliable, especially pip and the PyPI, it may be more practical to maintain a new library outside of the Python Standard Library. A separate release allows for more frequent updates with new features and bug fixes, which can be especially important for libraries addressing new technologies or APIs.

What are the top three modules from the Standard Library that you wish people knew more about?

One really useful tool from the Standard Library is the abc module. I use the abc module to define the APIs for dynamically loaded extensions as abstract base classes, to help extension authors understand which methods of the API are required and which are optional. Abstract base classes are built into some other OOP [object-oriented programming] languages, but I’ve found a lot of Python programmers don’t know we have them as well.

The binary search algorithm in the bisect module is a good example of a useful feature that’s often implemented incorrectly, which makes it a great fit for the Standard Library. I especially like the fact that it can search sparse lists where the search value may not be included in the data.

There are some useful data structures in the collections module that aren’t used as often as they could be. I like to use namedtuple for creating small, class-like data structures that need to hold data without any associated logic. It’s very easy to convert from a namedtuple to a regular class if logic does need to be added later, since namedtuple supports accessing attributes by name. Another interesting data structure from the module is ChainMap, which makes a good stackable namespace. ChainMap can be used to create contexts for rendering templates or managing configuration settings from different sources with clearly defined precedence.
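
Here is a minimal sketch of the ChainMap pattern Doug describes, with invented configuration sources layered from most to least specific:

from collections import ChainMap

defaults = {"host": "localhost", "port": 8080, "debug": False}
config_file = {"port": 9090}
command_line = {"debug": True}

# Lookups scan the maps from left to right, so command_line wins over
# config_file, which in turn wins over defaults.
settings = ChainMap(command_line, config_file, defaults)
print(settings["port"])   # 9090
print(settings["debug"])  # True
print(settings["host"])   # localhost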

A lot of projects, including OpenStack and external libraries, roll their own abstractions on top of the Standard Library, like for date/time handling, for example. In your opinion, should programmers stick to the Standard Library, roll their own functions, switch to some external library, or start sending patches to Python?

All of the above! I prefer to avoid reinventing the wheel, so I advocate strongly for contributing fixes and enhancements upstream to projects that can be used as dependencies. On the other hand, sometimes it makes sense to create another abstraction and maintain that code separately, either within an application or as a new library.

The timeutils module, used in your example, is a fairly thin wrapper around Python’s datetime module. Most of the functions are short and simple, but creating a module with the most common operations ensures they’re handled consistently throughout all projects. Because a lot of the functions are application specific, in the sense that they enforce decisions about things like timestamp format strings or what “now” means, they are not good candidates for patches to Python’s library or to be released as a general purpose library and adopted by other projects.

In contrast, I have been working to move the API services in OpenStack away from the WSGI [Web Server Gateway Interface] framework created in the early days of the project and onto a third-party web development framework. There are a lot of options for creating WSGI applications in Python, and while we may need to enhance one to make it completely suitable for OpenStack’s API servers, contributing those reusable changes upstream is preferable to maintaining a “private” framework.

What would your advice be to developers hesitating between major Python versions?

The number of third-party libraries supporting Python 3 has reached critical mass. It’s easier than ever to build new libraries and applications for Python 3, and thanks to the compatibility features added to 3.3, maintaining support for Python 2.7 is also easier. The major Linux distributions are working on shipping releases with Python 3 installed by default. Anyone starting a new project in Python should look seriously at Python 3, unless they have a dependency that hasn’t been ported. At this point, though, libraries that don’t run on Python 3 could almost be classified as “unmaintained.”

What are the best ways to branch code out from an application into a library in terms of design, planning ahead, migration, etc.?

Applications are collections of “glue code” holding libraries together for a specific purpose. Designing your application with the features to achieve that purpose as a library first and then building the application ensures that code is properly organized into logical units, which in turn makes testing simpler. It also means the features of an application are accessible through the library and can be remixed to create other applications. If you don’t take this approach, you risk the features of the application being tightly bound to the user interface, which makes them harder to modify and reuse.

What advice would you give to people planning to design their own Python libraries?

I always recommend designing libraries and APIs from the top down, applying design criteria such as the Single Responsibility Principle (SRP) at each layer. Think about what the caller will want to do with the library and create an API that supports those features. Think about what values can be stored in an instance and used by the methods versus what needs to be passed to each method every time. Finally, think about the implementation and whether the underlying code should be organized differently than the code of the public API.

SQLAlchemy is an excellent example of applying those guidelines. The declarative ORM [object relational mapping], data mapping, and expression generation layers are all separate. A developer can decide the right level of abstraction for entering the API and using the library based on their needs rather than constraints imposed by the library’s design.

What are the most common programming errors you encounter while reading Python developers’ code?

One area where Python’s idioms are significantly different from other languages is in looping and iteration. For example, one of the most common anti-patterns I see is the use of a for loop to filter a list by first appending items to a new list and then processing the result in a second loop (possibly after passing the list as an argument to a function). I almost always suggest converting filtering loops like these into generator expressions, which are more efficient and easier to understand. It’s also common to see lists being combined so their contents can be processed together in some way, rather than using itertools.chain().
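
A small sketch of the two suggestions in that answer, with made-up data:

import itertools

numbers = [1, 5, 12, 7, 20, 3]

# Anti-pattern: build a temporary list just to iterate over it again.
big = []
for n in numbers:
    if n > 6:
        big.append(n)
for n in big:
    print(n)

# Preferred: a generator expression filters lazily, with no intermediate list.
for n in (n for n in numbers if n > 6):
    print(n)

# itertools.chain() processes several iterables as one, without concatenating them.
for n in itertools.chain(numbers, [42, 8]):
    print(n)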

There are other, more subtle things I often suggest in code reviews, like using a dict() as a lookup table instead of a long if:then:else block, making sure functions always return the same type of object (for example, an empty list instead of None), reducing the number of arguments a function requires by combining related values into an object with either a tuple or a new class, and defining classes to use in public APIs instead of relying on dictionaries.

What’s your take on frameworks?

Frameworks are like any other kind of tool. They can help, but you need to take care when choosing one to make sure that it’s right for the job at hand.

Pulling out the common parts of your app into a framework helps you focus your development efforts on the unique aspects of an application. Frameworks also provide a lot of bootstrapping code, for doing things like running in development mode and writing a test suite, that helps you bring an application to a useful state more quickly. They also encourage consistency in the implementation of the application, which means you end up with code that is easier to understand and more reusable.

There are some potential pitfalls too, though. The decision to use a particular framework usually implies something about the design of the application itself. Selecting the wrong framework can make an application harder to implement if those design constraints do not align naturally with the application’s requirements. You may end up fighting with the framework if you try to use patterns or idioms that differ from what it recommends.
