Chapter 23. Module Packages

So far, when we’ve imported modules, we’ve been loading files. This represents typical module usage, and it’s probably the technique you’ll use for most imports you’ll code early on in your Python career. However, the module import story is a bit richer than I have thus far implied.

In addition to a module name, an import can name a directory path. A directory of Python code is said to be a package, so such imports are known as package imports. In effect, a package import turns a directory on your computer into another Python namespace, with attributes corresponding to the subdirectories and module files that the directory contains.

This is a somewhat advanced feature, but the hierarchy it provides turns out to be handy for organizing the files in a large system and tends to simplify module search path settings. As we’ll see, package imports are also sometimes required to resolve import ambiguities when multiple program files of the same name are installed on a single machine.

Because it is relevant to code in packages only, we’ll also introduce Python’s recent relative imports model and syntax here. As we’ll see, this model modifies search paths and extends the from statement for imports within packages.

Package Import Basics

So, how do package imports work? In the place where you have been naming a simple file in your import statements, you can instead list a path of names separated by periods:

import dir1.dir2.mod

The same goes for from statements:

from dir1.dir2.mod import x

The “dotted” path in these statements is assumed to correspond to a path through the directory hierarchy on your machine, leading to the file mod.py (or similar; the extension may vary). That is, the preceding statements indicate that on your machine there is a directory dir1, which has a subdirectory dir2, which contains a module file mod.py (or similar).

Furthermore, these imports imply that dir1 resides within some container directory dir0, which is a component of the Python module search path. In other words, the two import statements imply a directory structure that looks something like this (shown with DOS backslash separators):

dir0\dir1\dir2\mod.py               # Or mod.pyc, mod.so, etc.

The container directory dir0 needs to be added to your module search path (unless it’s the home directory of the top-level file), exactly as if dir1 were a simple module file.

More generally, the leftmost component in a package import path is still relative to a directory included in the sys.path module search path list we met in Chapter 21. From there down, though, the import statements in your script give the directory paths leading to the modules explicitly.
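To make the layout concrete, here is a minimal sketch that builds such a tree in a temporary location and imports through it. The names basics_dir1 and basics_dir2 are hypothetical stand-ins for dir1 and dir2, chosen only to avoid clashing with anything already on your machine; the empty __init__.py files it creates are package markers, explained later in this chapter.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build dir0/basics_dir1/basics_dir2/mod.py in a throwaway location.
dir0 = Path(tempfile.mkdtemp())
dir2 = dir0 / "basics_dir1" / "basics_dir2"
dir2.mkdir(parents=True)
(dir0 / "basics_dir1" / "__init__.py").write_text("")   # package marker files
(dir2 / "__init__.py").write_text("")
(dir2 / "mod.py").write_text("x = 42\n")

# Only the container goes on the search path; the dotted path supplies the rest.
sys.path.insert(0, str(dir0))
importlib.invalidate_caches()

import basics_dir1.basics_dir2.mod
from basics_dir1.basics_dir2.mod import x

print(basics_dir1.basics_dir2.mod.__file__)   # a path ending in mod.py, below dir0
print(x)
```

Notice that the platform-specific part of the path appears only in the sys.path entry; the import statements themselves name directories with dots.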

Packages and Search Path Settings

If you use this feature, keep in mind that the directory paths in your import statements can only be variables separated by periods. You cannot use any platform-specific path syntax in your import statements, such as C:\dir1, My Documents.dir2, or ../dir1; these do not work syntactically. Instead, use platform-specific syntax in your module search path settings to name the container directories.

For instance, in the prior example, dir0—the directory name you add to your module search path—can be an arbitrarily long and platform-specific directory path leading up to dir1. Instead of using an invalid statement like this:

import C:\mycode\dir1\dir2\mod      # Error: illegal syntax

add C:\mycode to your PYTHONPATH variable or a .pth file (assuming it is not the program’s home directory, in which case this step is not necessary), and say this in your script:

import dir1.dir2.mod

In effect, entries on the module search path provide platform-specific directory path prefixes, which lead to the leftmost names in import statements. import statements provide directory path tails in a platform-neutral fashion.[52]
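If you'd like to verify the syntax rule for yourself without touching your files, one option is to ask Python to parse each statement without running it. The following is a small sketch using the built-in compile function; the file name "<demo>" is just a label for error messages, and nothing is actually imported.

```python
candidates = [
    r"import C:\mycode\dir1\dir2\mod",   # Windows path syntax
    "import ../dir1",                     # Unix-style relative path
    "import dir1.dir2.mod",               # dotted path: the only legal form
]

results = {}
for source in candidates:
    try:
        compile(source, "<demo>", "exec")   # parse only; nothing is imported
        results[source] = "ok"
    except SyntaxError:
        results[source] = "SyntaxError"

for source, outcome in results.items():
    print(f"{source!r:40} {outcome}")
```

Only the dotted form gets past the parser; the directory prefix itself belongs in your search path settings.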

Package __init__.py Files

If you choose to use package imports, there is one more constraint you must follow: each directory named within the path of a package import statement must contain a file named __init__.py, or your package imports will fail. That is, in the example we’ve been using, both dir1 and dir2 must contain a file called __init__.py; the container directory dir0 does not require such a file because it’s not listed in the import statement itself. More formally, for a directory structure such as this:

dir0\dir1\dir2\mod.py

and an import statement of the form:

import dir1.dir2.mod

the following rules apply:

  • dir1 and dir2 both must contain an __init__.py file.

  • dir0, the container, does not require an __init__.py file; this file will simply be ignored if present.

  • dir0, not dir0\dir1, must be listed on the module search path (i.e., it must be the home directory, or be listed in your PYTHONPATH, etc.).

The net effect is that this example’s directory structure should be as follows, with indentation designating directory nesting:

dir0                               # Container on module search path
    dir1
        __init__.py
        dir2
            __init__.py
            mod.py

The __init__.py files can contain Python code, just like normal module files. They are partly present as a declaration to Python, however, and can be completely empty. As declarations, these files serve to prevent directories with common names from unintentionally hiding true modules that appear later on the module search path. Without this safeguard, Python might pick a directory that has nothing to do with your code, just because it appears in an earlier directory on the search path.

More generally, the __init__.py file serves as a hook for package-initialization-time actions, generates a module namespace for a directory, and implements the behavior of from * (i.e., from .. import *) statements when used with directory imports:

Package initialization

The first time Python imports through a directory, it automatically runs all the code in the directory’s __init__.py file. Because of that, these files are a natural place to put code to initialize the state required by files in a package. For instance, a package might use its initialization file to create required data files, open connections to databases, and so on. Typically, __init__.py files are not meant to be useful if executed directly; they are run automatically when a package is first accessed.

Module namespace initialization

In the package import model, the directory paths in your script become real nested object paths after an import. For instance, in the preceding example, after the import the expression dir1.dir2 works and returns a module object whose namespace contains all the names assigned by dir2’s __init__.py file. Such files provide a namespace for module objects created for directories, which have no real associated module files.

from * statement behavior

As an advanced feature, you can use __all__ lists in __init__.py files to define what is exported when a directory is imported with the from * statement form. In an __init__.py file, the __all__ list is taken to be the list of submodule names that should be imported when from * is used on the package (directory) name. If __all__ is not set, the from * statement does not automatically load submodules nested in the directory; instead, it loads just names defined by assignments in the directory’s __init__.py file, including any submodules explicitly imported by code in this file. For instance, the statement from submodule import X in a directory’s __init__.py makes the name X available in that directory’s namespace. (We’ll see additional roles for __all__ in Chapter 24.)

You can also simply leave these files empty, if their roles are beyond your needs (and frankly, they are often empty in practice). They must exist, though, for your directory imports to work at all.
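The from * role is probably the least obvious of the three, so here is a minimal sketch of it. It assumes a hypothetical package named allpkg with two submodules, alpha and beta, and an __init__.py whose __all__ list names only alpha. The from * statement is run with exec so that we can inspect exactly which names it copies.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "allpkg"                       # hypothetical package name
pkg.mkdir()
(pkg / "__init__.py").write_text(
    "__all__ = ['alpha']\n"                 # submodules exported by from *
    "setting = 'from init'\n"               # assigned, but not listed in __all__
)
(pkg / "alpha.py").write_text("name = 'alpha'\n")
(pkg / "beta.py").write_text("name = 'beta'\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

namespace = {}
exec("from allpkg import *", namespace)     # run from * in a scratch namespace

copied = sorted(key for key in namespace if key != "__builtins__")
print(copied)                               # names copied by from *
print("allpkg.beta" in sys.modules)         # was the unlisted submodule loaded?
```

In this sketch, only alpha is loaded and copied: beta is never imported, and setting is skipped because it is absent from __all__.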

Note

Don’t confuse package __init__.py files with the class __init__ constructor methods we’ll meet in the next part of the book. The former are files of code run when imports first step through a package directory, while the latter are called when an instance is created. Both have initialization roles, but they are otherwise very different.

Package Import Example

Let’s actually code the example we’ve been talking about to show how initialization files and paths come into play. The following three files are coded in a directory dir1 and its subdirectory dir2—comments give the path names of these files:

# dir1\__init__.py
print('dir1 init')
x = 1

# dir1\dir2\__init__.py
print('dir2 init')
y = 2

# dir1\dir2\mod.py
print('in mod.py')
z = 3

Here, dir1 will be either a subdirectory of the one we’re working in (i.e., the home directory), or a subdirectory of a directory that is listed on the module search path (technically, on sys.path). Either way, dir1’s container does not need an __init__.py file.

import statements run each directory’s initialization file the first time that directory is traversed, as Python descends the path; print statements are included here to trace their execution. As with module files, an already imported directory may be passed to reload to force reexecution of that single item. As shown here, reload accepts a dotted pathname to reload nested directories and files:

% python
>>> import dir1.dir2.mod      # First imports run init files
dir1 init
dir2 init
in mod.py
>>>
>>> import dir1.dir2.mod      # Later imports do not
>>>
>>> from imp import reload    # Needed in 3.0
>>> reload(dir1)
dir1 init
<module 'dir1' from 'dir1\__init__.pyc'>
>>>
>>> reload(dir1.dir2)
dir2 init
<module 'dir1.dir2' from 'dir1\dir2\__init__.pyc'>
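In more recent Python 3.X releases, reload is also available as importlib.reload, and the imp module has since been removed from the standard library, so the import in the session above may need adjusting on your version. The following sketch, which uses the hypothetical package names reload_dir1 and reload_dir2, counts how many times each initialization file runs; the counter line is an illustration device, not something package files normally contain.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
dir1 = root / "reload_dir1"
dir2 = dir1 / "reload_dir2"
dir2.mkdir(parents=True)

# Each init file bumps a counter kept in its own module namespace.
counter = "runs = globals().get('runs', 0) + 1\n"
(dir1 / "__init__.py").write_text(counter)
(dir2 / "__init__.py").write_text(counter)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import reload_dir1.reload_dir2            # first import runs both init files
import reload_dir1.reload_dir2            # later imports do not
after_import = (reload_dir1.runs, reload_dir1.reload_dir2.runs)

importlib.reload(reload_dir1.reload_dir2) # reruns the nested init file only
after_reload = (reload_dir1.runs, reload_dir1.reload_dir2.runs)

print(after_import, after_reload)
```

Because a reload reruns code in the existing module namespace, the counter survives and shows that only the single item passed to reload was reexecuted.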

Once imported, the path in your import statement becomes a nested object path in your script. Here, mod is an object nested in the object dir2, which in turn is nested in the object dir1:

>>> dir1
<module 'dir1' from 'dir1\__init__.pyc'>
>>> dir1.dir2
<module 'dir1.dir2' from 'dir1\dir2\__init__.pyc'>
>>> dir1.dir2.mod
<module 'dir1.dir2.mod' from 'dir1\dir2\mod.pyc'>

In fact, each directory name in the path becomes a variable assigned to a module object whose namespace is initialized by all the assignments in that directory’s __init__.py file. dir1.x refers to the variable x assigned in dir1\__init__.py, much as mod.z refers to the variable z assigned in mod.py:

>>> dir1.x
1
>>> dir1.dir2.y
2
>>> dir1.dir2.mod.z
3
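If you'd rather not create these files by hand, the whole example can be approximated in a single script. This is only a sketch: it writes the three files to a temporary directory, uses the hypothetical names trace_dir1 and trace_dir2 in place of dir1 and dir2, and captures the trace messages so the order of initialization can be inspected.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

dir0 = Path(tempfile.mkdtemp())             # container: goes on sys.path
dir1 = dir0 / "trace_dir1"
dir2 = dir1 / "trace_dir2"
dir2.mkdir(parents=True)
(dir1 / "__init__.py").write_text("print('dir1 init')\nx = 1\n")
(dir2 / "__init__.py").write_text("print('dir2 init')\ny = 2\n")
(dir2 / "mod.py").write_text("print('in mod.py')\nz = 3\n")

sys.path.insert(0, str(dir0))
importlib.invalidate_caches()

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    import trace_dir1.trace_dir2.mod        # first import runs init files
    import trace_dir1.trace_dir2.mod        # repeated import prints nothing

trace = buffer.getvalue().splitlines()
print(trace)
print(trace_dir1.x, trace_dir1.trace_dir2.y, trace_dir1.trace_dir2.mod.z)
```

The captured trace lists each message once, in the order Python descends the path, and the three variables are reachable through the nested module objects.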

from Versus import with Packages

import statements can be somewhat inconvenient to use with packages, because you may have to retype the paths frequently in your program. In the prior section’s example, for instance, you must retype and rerun the full path from dir1 each time you want to reach z. If you try to access dir2 or mod directly, you’ll get an error:

>>> dir2.mod
NameError: name 'dir2' is not defined
>>> mod.z
NameError: name 'mod' is not defined

It’s often more convenient, therefore, to use the from statement with packages to avoid retyping the paths at each access. Perhaps more importantly, if you ever restructure your directory tree, the from statement requires just one path update in your code, whereas imports may require many. The import as extension, discussed formally in the next chapter, can also help here by providing a shorter synonym for the full path:

% python
>>> from dir1.dir2 import mod      # Code path here only
dir1 init
dir2 init
in mod.py
>>> mod.z                          # Don't repeat path
3
>>> from dir1.dir2.mod import z
>>> z
3
>>> import dir1.dir2.mod as mod    # Use shorter name (see Chapter 24)
>>> mod.z
3
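The difference between these forms comes down to which name each statement assigns in the importer's scope. As a rough sketch, we can run each statement in its own empty namespace and list the names it creates; the package names bind_dir1 and bind_dir2 are hypothetical and built on the fly.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
dir2 = root / "bind_dir1" / "bind_dir2"
dir2.mkdir(parents=True)
(root / "bind_dir1" / "__init__.py").write_text("")
(dir2 / "__init__.py").write_text("")
(dir2 / "mod.py").write_text("z = 3\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

def names_bound(statement):
    """Run one import statement in a fresh namespace; return the names it assigns."""
    namespace = {}
    exec(statement, namespace)
    return sorted(key for key in namespace if key != "__builtins__")

bound_import = names_bound("import bind_dir1.bind_dir2.mod")
bound_from = names_bound("from bind_dir1.bind_dir2 import mod")
bound_as = names_bound("import bind_dir1.bind_dir2.mod as mod")

print(bound_import)   # only the leftmost package name
print(bound_from)     # just the module's own name
print(bound_as)       # the synonym given after "as"
```

The plain import assigns only the leftmost name, which is why the full path must be repeated at each access, while the from and as forms assign a short name directly.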

Why Use Package Imports?

If you’re new to Python, make sure that you’ve mastered simple modules before stepping up to packages, as they are a somewhat advanced feature. They do serve useful roles, though, especially in larger programs: they make imports more informative, serve as an organizational tool, simplify your module search path, and can resolve ambiguities.

First of all, because package imports give some directory information in program files, they both make it easier to locate your files and serve as an organizational tool. Without package paths, you must often resort to consulting the module search path to find files. Moreover, if you organize your files into subdirectories for functional areas, package imports make it more obvious what role a module plays, and so make your code more readable. For example, a normal import of a file in a directory somewhere on the module search path, like this:

import utilities

offers much less information than an import that includes the path:

import database.client.utilities

Package imports can also greatly simplify your PYTHONPATH and .pth file search path settings. In fact, if you use explicit package imports for all your cross-directory imports, and you make those package imports relative to a common root directory where all your Python code is stored, you really only need a single entry on your search path: the common root. Finally, package imports serve to resolve ambiguities by making explicit exactly which files you want to import. The next section explores this role in more detail.

A Tale of Three Systems

The only time package imports are actually required is to resolve ambiguities that may arise when multiple programs with same-named files are installed on a single machine. This is something of an install issue, but it can also become a concern in general practice. Let’s turn to a hypothetical scenario to illustrate.

Suppose that a programmer develops a Python program that contains a file called utilities.py for common utility code and a top-level file named main.py that users launch to start the program. All over this program, its files say import utilities to load and use the common code. When the program is shipped, it arrives as a single .tar or .zip file containing all the program’s files, and when it is installed, it unpacks all its files into a single directory named system1 on the target machine:

system1
    utilities.py        # Common utility functions, classes
    main.py             # Launch this to start the program
    other.py            # Import utilities to load my tools

Now, suppose that a second programmer develops a different program with files also called utilities.py and main.py, and again uses import utilities throughout the program to load the common code file. When this second system is fetched and installed on the same computer as the first system, its files will unpack into a new directory called system2 somewhere on the receiving machine (ensuring that they do not overwrite same-named files from the first system):

system2
    utilities.py        # Common utilities
    main.py             # Launch this to run
    other.py            # Imports utilities

So far, there’s no problem: both systems can coexist and run on the same machine. In fact, you won’t even need to configure the module search path to use these programs on your computer. Because Python always searches the home directory first (that is, the directory containing the top-level file), imports in either system’s files will automatically see all the files in that system’s directory. For instance, if you click on system1\main.py, all imports will search system1 first. Similarly, if you launch system2\main.py, system2 will be searched first instead. Remember, module search path settings are only needed to import across directory boundaries.

However, suppose that after you’ve installed these two programs on your machine, you decide that you’d like to use some of the code in each of the utilities.py files in a system of your own. It’s common utility code, after all, and Python code by nature wants to be reused. In this case, you want to be able to say the following from code that you’re writing in a third directory to load one of the two files:

import utilities
utilities.func('spam')

Now the problem starts to materialize. To make this work at all, you’ll have to set the module search path to include the directories containing the utilities.py files. But which directory do you put first in the path—system1 or system2?

The problem is the linear nature of the search path. It is always scanned from left to right, so no matter how long you ponder this dilemma, you will always get utilities.py from the directory listed first (leftmost) on the search path. As is, you’ll never be able to import it from the other directory at all. You could try changing sys.path within your script before each import operation, but that’s both extra work and highly error prone. By default, you’re stuck.
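To see the dilemma in action, here is a small sketch that creates two flat directories, each holding a same-named file, and lists both on the search path. The file is called clash_utilities.py here (a hypothetical name, picked so it cannot collide with a real utilities module on your machine), and each copy records which directory it came from.

```python
import importlib
import sys
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())
for name in ("system1", "system2"):
    folder = base / name
    folder.mkdir()
    # Same file name in both directories; only the recorded origin differs.
    (folder / "clash_utilities.py").write_text(f"origin = {name!r}\n")

# Both directories go on the path; system1 is leftmost.
sys.path[:0] = [str(base / "system1"), str(base / "system2")]
importlib.invalidate_caches()

import clash_utilities
print(clash_utilities.origin)               # the leftmost directory wins
```

With this path order, the copy in system2 is simply unreachable by a simple import, no matter how the statement is written.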

This is the issue that packages actually fix. Rather than installing programs as flat lists of files in standalone directories, you can package and install them as subdirectories under a common root. For instance, you might organize all the code in this example as an install hierarchy that looks like this:

root
    system1
        __init__.py
        utilities.py
        main.py
        other.py
    system2
        __init__.py
        utilities.py
        main.py
        other.py
    system3                    # Here or elsewhere
        __init__.py             # Your new code here
        myfile.py

Now, add just the common root directory to your search path. If your code’s imports are all relative to this common root, you can import either system’s utility file with a package import—the enclosing directory name makes the path (and hence, the module reference) unique. In fact, you can import both utility files in the same module, as long as you use an import statement and repeat the full path each time you reference the utility modules:

import system1.utilities
import system2.utilities
system1.utilities.function('spam')
system2.utilities.function('eggs')

The names of the enclosing directories here make the module references unique.
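The package-based fix can be sketched the same way. The following builds a common root with two package directories, named tale_system1 and tale_system2 here purely for illustration, and adds only the root to the search path. The function named function in each utilities.py is a made-up stand-in that reports which system it belongs to.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())             # the common root
for name in ("tale_system1", "tale_system2"):
    pkg = root / name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")    # makes the directory a package
    (pkg / "utilities.py").write_text(
        "def function(arg):\n"
        f"    return ({name!r}, arg)\n"
    )

sys.path.insert(0, str(root))               # a single search path entry
importlib.invalidate_caches()

import tale_system1.utilities
import tale_system2.utilities

first = tale_system1.utilities.function('spam')
second = tale_system2.utilities.function('eggs')
print(first, second)
```

Both same-named files are now usable side by side, because the enclosing directory name is part of each module's path.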

Note that you have to use import instead of from with packages only if you need to access the same attribute in two or more paths. If the name of the called function here were different in each path, from statements could be used to avoid repeating the full package path whenever you call one of the functions, as described earlier.

Also, notice in the install hierarchy shown earlier that __init__.py files were added to the system1 and system2 directories to make this work, but not to the root directory. Only directories listed within import statements in your code require these files; as you’ll recall, they are run automatically the first time the Python process imports through a package directory.

Technically, in this case the system3 directory doesn’t have to be under root—just the packages of code from which you will import. However, because you never know when your own modules might be useful in other programs, you might as well place them under the common root directory as well to avoid similar name-collision problems in the future.

Finally, notice that both of the two original systems’ imports will keep working unchanged. Because their home directories are searched first, the addition of the common root on the search path is irrelevant to code in system1 and system2; they can keep saying just import utilities and expect to find their own files. Moreover, if you’re careful to unpack all your Python systems under a common root like this, path configuration becomes simple: you’ll only need to add the common root directory, once.

Package Relative Imports

The coverage of package imports so far has focused mostly on importing package files from outside the package. Within the package itself, imports of package files can use the same path syntax as outside imports, but they can also make use of special intra-package search rules to simplify import statements. That is, rather than listing package import paths, imports within the package can be relative to the package.

The way this works is version-dependent today: Python 2.6 implicitly searches package directories first on imports, while 3.0 requires explicit relative import syntax. This 3.0 change can enhance code readability, by making same-package imports more obvious. If you’re starting out in Python with version 3.0, your focus in this section will likely be on its new import syntax. If you’ve used other Python packages in the past, though, you’ll probably also be interested in how the 3.0 model differs.

Changes in Python 3.0

The way import operations in packages work has changed slightly in Python 3.0. This change applies only to imports within files located in the package directories we’ve been studying in this chapter; imports in other files work as before. For imports in packages, though, Python 3.0 introduces two changes:

  • It modifies the module import search path semantics to skip the package’s own directory by default. Imports check only other components of the search path. These are known as “absolute” imports.

  • It extends the syntax of from statements to allow them to explicitly request that imports search the package’s directory only. This is known as “relative” import syntax.

These changes are fully present in Python 3.0. The new from statement relative syntax is also available in Python 2.6, but the default search path change must be enabled as an option. It’s currently scheduled to be added in the 2.7 release[53]—this change is being phased in this way because the search path portion is not backward compatible with earlier Pythons.

The impact of this change is that in 3.0 (and optionally in 2.6), you must generally use special from syntax to import modules located in the same package as the importer, unless you spell out a complete path from a package root. Without this syntax, your package is not automatically searched.

Relative Import Basics

In Python 3.0 and 2.6, from statements can now use leading dots (“.”) to specify that they require modules located within the same package (known as package relative imports), instead of modules located elsewhere on the module import search path (called absolute imports). That is:

  • In both Python 3.0 and 2.6, you can use leading dots in from statements to indicate that imports should be relative to the containing package—such imports will search for modules inside the package only and will not look for same-named modules located elsewhere on the import search path (sys.path). The net effect is that package modules override outside modules.

  • In Python 2.6, normal imports in a package’s code (without leading dots) currently default to a relative-then-absolute search path order—that is, they search the package’s own directory first. However, in Python 3.0, imports within a package are absolute by default—in the absence of any special dot syntax, imports skip the containing package itself and look elsewhere on the sys.path search path.

For example, in both Python 3.0 and 2.6, a statement of the form:

from . import spam                        # Relative to this package

instructs Python to import a module named spam located in the same package directory as the file in which this statement appears. Similarly, this statement:

from .spam import name

means “from a module named spam located in the same package as the file that contains this statement, import the variable name.”

The behavior of a statement without the leading dot depends on which version of Python you use. In 2.6, such an import will still default to the current relative-then-absolute search path order (i.e., searching the package’s directory first), unless a statement of the following form is included in the importing file:

from __future__ import absolute_import    # Required until 2.7?

If present, this statement enables the Python 3.0 absolute-by-default search path change, described in the next paragraph.

In 3.0, an import without a leading dot always causes Python to skip the relative components of the module import search path and look instead in the absolute directories that sys.path contains. For instance, in 3.0’s model, a statement of the following form will always find a string module somewhere on sys.path, instead of a module of the same name in the package:

import string                             # Skip this package's version

Without the from __future__ statement in 2.6, if there’s a string module in the package, it will be imported instead. To get the same behavior in 3.0 and in 2.6 when the absolute import change is enabled, run a statement of the following form to force a relative import:

from . import string                      # Searches this package only

This works in both Python 2.6 and 3.0 today. The only difference in the 3.0 model is that it is required in order to load a module that is located in the same package directory as the file in which this appears, when the module is given with a simple name.

Note that leading dots can be used to force relative imports only with the from statement, not with the import statement. In Python 3.0, the import modname statement is always absolute, skipping the containing package’s directory. In 2.6, this statement form still performs relative imports by default (i.e., the package’s directory is searched first) unless the from __future__ statement shown earlier is used; the absolute-by-default behavior was originally planned for Python 2.7 as well, but it became the default only in the 3.X line. from statements without leading dots behave the same as import statements: absolute in 3.0 (skipping the package directory), and relative-then-absolute in 2.6 (searching the package directory first).

Other dot-based relative reference patterns are possible, too. Within a module file located in a package directory named mypkg, the following alternative import forms work as described:

from .string import name1, name2            # Imports names from mypkg.string
from . import string                        # Imports mypkg.string
from .. import string                       # Imports string sibling of mypkg

To understand these latter forms better, we need to understand the rationale behind this change.

Why Relative Imports?

This feature is designed to allow scripts to resolve ambiguities that can arise when a same-named file appears in multiple places on the module search path. Consider the following package directory:

mypkg
    __init__.py
    main.py
    string.py

This defines a package named mypkg containing modules named mypkg.main and mypkg.string. Now, suppose that the main module tries to import a module named string. In Python 2.6 and earlier, Python will first look in the mypkg directory to perform a relative import. It will find and import the string.py file located there, assigning it to the name string in the mypkg.main module’s namespace.

It could be, though, that the intent of this import was to load the Python standard library’s string module instead. Unfortunately, in these versions of Python, there’s no straightforward way to ignore mypkg.string and look for the standard library’s string module located on the module search path. Moreover, we cannot resolve this with package import paths, because we cannot depend on any extra package directory structure above the standard library being present on every machine.

In other words, imports in packages can be ambiguous—within a package, it’s not clear whether an import spam statement refers to a module within or outside the package. More accurately, a local module or package can hide another hanging directly off of sys.path, whether intentionally or not.

In practice, Python users can avoid reusing the names of standard library modules they need for modules of their own (if you need the standard string, don’t name a new module string!). But this doesn’t help if a package accidentally hides a standard module; moreover, Python might add a new standard library module in the future that has the same name as a module of your own. Code that relies on implicit relative imports is also harder to understand, because the reader may be confused about which module is intended to be used. It’s better if the resolution can be made explicit in code.

The relative imports solution in 3.0

To address this dilemma, imports run within packages have changed in Python 3.0 (and as an option in 2.6) to be absolute. Under this model, an import statement of the following form in our example file mypkg/main.py will always find a string outside the package, via an absolute import search of sys.path:

import string                          # Imports string outside package

A from import without leading-dot syntax is considered absolute as well:

from string import name                # Imports name from string outside package

If you really want to import a module from your package without giving its full path from the package root, though, relative imports are still possible by using the dot syntax in the from statement:

from . import string                   # Imports mypkg.string (relative)

This form imports the string module relative to the current package only and is the relative equivalent to the prior import example’s absolute form; when this special relative syntax is used, the package’s directory is the only directory searched.

We can also copy specific names from a module with relative syntax:

from .string import name1, name2       # Imports names from mypkg.string

This statement again refers to the string module relative to the current package. If this code appears in our mypkg.main module, for example, it will import name1 and name2 from mypkg.string.
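Here is a sketch of the 3.X behavior just described, runnable under a Python 3.X interpreter. It creates a package that, like mypkg, contains its own string.py, then lets the package's main module import string both ways. The package is named mypkg_demo (hypothetical), and the contents of its string.py are invented for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "mypkg_demo"                   # hypothetical stand-in for mypkg
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "string.py").write_text("name1 = 'local name1'\nname2 = 'local name2'\n")
(pkg / "main.py").write_text(
    "import string                          # absolute: found via sys.path\n"
    "from . import string as local_string   # relative: this package only\n"
    "from .string import name1, name2       # relative: copy names out\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import mypkg_demo.main as main

print(main.string.__name__)                 # the module found by the absolute import
print(main.local_string.__name__)           # the module found by the relative import
print(main.name1, main.name2)
```

The plain import reaches the standard library's string module, while both dotted forms stay inside the package, even though the two modules share a name.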

In effect, the “.” in a relative import is taken to stand for the package directory containing the file in which the import appears. An additional leading dot performs the relative import starting from the parent of the current package. For example, this statement:

from .. import spam                    # Imports a sibling of mypkg

will load a sibling of mypkg—i.e., the spam module located in the package’s own container directory, next to mypkg. More generally, code located in some module A.B.C can do any of these:

from . import D                        # Imports A.B.D     (. means A.B)
from .. import E                       # Imports A.E       (.. means A)

from .D import X                       # Imports A.B.D.X   (. means A.B)
from ..E import X                      # Imports A.E.X     (.. means A)
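The A.B.C case can be tried directly, too. The sketch below lays out packages A and A.B in a temporary directory, with modules D and E that each assign a variable X whose value names its own location (values made up for the demonstration). Module C then uses all four statement forms, with as clauses added so the two X names don't overwrite each other.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_b = root / "A" / "B"
pkg_b.mkdir(parents=True)
(root / "A" / "__init__.py").write_text("")
(root / "A" / "E.py").write_text("X = 'A.E.X'\n")
(pkg_b / "__init__.py").write_text("")
(pkg_b / "D.py").write_text("X = 'A.B.D.X'\n")
(pkg_b / "C.py").write_text(
    "from . import D              # . means A.B\n"
    "from .. import E             # .. means A\n"
    "from .D import X as DX       # A.B.D.X\n"
    "from ..E import X as EX      # A.E.X\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import A.B.C

print(A.B.C.D.__name__, A.B.C.E.__name__)
print(A.B.C.DX, A.B.C.EX)
```

Each leading dot moves the starting point of the search up one package level from the file containing the statement.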

Relative imports versus absolute package paths

Alternatively, a file can sometimes name its own package explicitly in an absolute import statement. For example, in the following, mypkg will be found in an absolute directory on sys.path:

from mypkg import string                    # Imports mypkg.string (absolute)

However, this relies on both the configuration and the order of the module search path settings, while relative import dot syntax does not. In fact, this form requires that the directory immediately containing mypkg be included in the module search path. In general, absolute import statements must list all the directories below the package’s root entry in sys.path when naming packages explicitly like this:

from system.section.mypkg import string     # system container on sys.path only

In large or deep packages, that could be much more work than a dot:

from . import string                        # Relative import syntax

With this latter form, the containing package is searched automatically, regardless of the search path settings.

The Scope of Relative Imports

Relative imports can seem a bit perplexing on first encounter, but it helps if you remember a few key points about them:

  • Relative imports apply to imports within packages only. Keep in mind that this feature’s module search path change applies only to import statements within module files located in a package. Normal imports coded outside package files still work exactly as described earlier, automatically searching the directory containing the top-level script first.

  • Relative imports apply to the from statement only. Also remember that this feature’s new syntax applies only to from statements, not import statements. It’s detected by the fact that the module name in a from begins with one or more dots (periods). Module names that contain dots but don’t have a leading dot are package imports, not relative imports.

  • The terminology is ambiguous. Frankly, the terminology used to describe this feature is probably more confusing than it needs to be. Really, all imports are relative to something. Outside a package, imports are still relative to directories listed on the sys.path module search path. As we learned in Chapter 21, this path includes the program’s container directory, PYTHONPATH settings, path file settings, and standard libraries. When working interactively, the program container directory is simply the current working directory.

    For imports made inside packages, 2.6 augments this behavior by searching the package itself first. In the 3.0 model, all that really changes is that normal “absolute” import syntax skips the package directory, but special “relative” import syntax causes it to be searched first and only. When we talk about 3.0 imports as being “absolute,” what we really mean is that they are relative to the directories on sys.path, but not the package itself. Conversely, when we speak of “relative” imports, we mean they are relative to the package directory only. Some sys.path entries could, of course, be absolute or relative paths too. (And I could probably make up something more confusing, but it would be a stretch!)

In other words, “package relative imports” in 3.0 really just boil down to a removal of 2.6’s special search path behavior for packages, along with the addition of special from syntax to explicitly request relative behavior. If you wrote your package imports in the past to not depend on 2.6’s special implicit relative lookup (e.g., by always spelling out full paths from a package root), this change is largely a moot point. If you didn’t, you’ll need to update your package files to use the new from syntax for local package files.
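The second point above, that leading dots are legal only in from statements, is a matter of syntax, so it can be checked without any package at all. The following sketch simply asks Python to compile a few statements; the compiles helper is hypothetical, and nothing is actually imported:

```python
def compiles(source):
    """Return True if source is syntactically valid Python, else False."""
    try:
        compile(source, '<demo>', 'exec')
        return True
    except SyntaxError:
        return False

results = {
    'from . import string':     compiles('from . import string'),
    'from .string import name': compiles('from .string import name'),
    'import pkg.string':        compiles('import pkg.string'),
    'import .string':           compiles('import .string'),
}

for statement, ok in results.items():
    print('%-26s %s' % (statement, 'valid syntax' if ok else 'SyntaxError'))
```

Only the last statement is rejected: an import statement may contain dots between names (a package import), but it may not begin with one.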

Module Lookup Rules Summary

With packages and relative imports, the module search story in Python 3.0 in its entirety can be summarized as follows:

  • Simple module names (e.g., A) are looked up by searching each directory on the sys.path list, from left to right. This list is constructed from both system defaults and user-configurable settings.

  • Packages are simply directories of Python modules with a special __init__.py file, which enables A.B.C directory path syntax in imports. In an import of A.B.C, for example, the directory named A is located relative to the normal module import search of sys.path, B is another package subdirectory within A, and C is a module or other importable item within B.

  • Within a package’s files, normal import statements use the same sys.path search rule as imports elsewhere. Imports in packages using from statements and leading dots, however, are relative to the package; that is, only the package directory is checked, and the normal sys.path lookup is not used. In from . import A, for example, the module search is restricted to the directory containing the file in which this statement appears.
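The left-to-right rule in the first point can be observed directly. The following is a sketch only, assuming a recent Python 3.X in which importlib.machinery.PathFinder (the finder that handles sys.path-style lists) offers find_spec; the module name dupmod and the two temporary directories are made up for the test:

```python
import os
import tempfile
from importlib.machinery import PathFinder

with tempfile.TemporaryDirectory() as first, \
     tempfile.TemporaryDirectory() as second:
    # Put a module of the same (hypothetical) name in both directories
    for folder in (first, second):
        with open(os.path.join(folder, 'dupmod.py'), 'w') as f:
            f.write('WHERE = %r\n' % folder)

    # Locate, but do not import, the module along two differently ordered paths
    spec_a = PathFinder.find_spec('dupmod', [first, second])
    spec_b = PathFinder.find_spec('dupmod', [second, first])

    first_listed_wins_a = os.path.samefile(os.path.dirname(spec_a.origin), first)
    first_listed_wins_b = os.path.samefile(os.path.dirname(spec_b.origin), second)

print(first_listed_wins_a, first_listed_wins_b)
```

In both searches the copy in the leftmost directory is selected, which is why the order of entries on sys.path matters whenever two files share a name.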

Relative Imports in Action

But enough theory: let’s run some quick tests to demonstrate the concepts behind relative imports.

Imports outside packages

First of all, as mentioned previously, this feature does not impact imports outside a package. Thus, the following finds the standard library string module as expected:

C:\test> c:\Python30\python
>>> import string
>>> string
<module 'string' from 'c:\Python30\lib\string.py'>

But if we add a module of the same name in the directory we’re working in, it is selected instead, because the first entry on the module search path is the current working directory (CWD):

# test\string.py
print('string' * 8)

C:\test> c:\Python30\python
>>> import string
stringstringstringstringstringstringstringstring
>>> string
<module 'string' from 'string.py'>

In other words, normal imports are still relative to the “home” directory (the top-level script’s container, or the directory you’re working in). In fact, relative import syntax is not even allowed in code that is not in a file being used as part of a package:

>>> from . import string
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Attempted relative import in non-package
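The error text just shown is what Python 3.0 prints; later releases changed both the exception type and the wording (recent 3.X versions raise ImportError here rather than ValueError). The following sketch therefore catches several exception types instead of assuming one. It uses exec with a bare namespace as a stand-in for top-level code that belongs to no package, which is an assumption of this demo rather than how you would normally run such a statement:

```python
import warnings

namespace = {'__name__': '__main__'}      # mimics top-level code with no package
error_type = None

with warnings.catch_warnings():
    warnings.simplefilter('ignore')       # some releases warn before failing
    try:
        exec('from . import string', namespace)
    except (ImportError, ValueError, SystemError) as exc:
        error_type = type(exc).__name__
        print(error_type, '=>', exc)

print('relative import rejected:', error_type is not None)
```

Whatever the exception is called in your Python, the outcome is the same: relative import syntax is refused outside a package.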

In this and all examples in this section, code entered at the interactive prompt behaves the same as it would if run in a top-level script, because the first entry on sys.path is either the interactive working directory or the directory containing the top-level file. The only difference is that the start of sys.path is an absolute directory, not an empty string:

# test\main.py
import string
print(string)

C:\test> C:\python30\python main.py                   # Same results in 2.6
stringstringstringstringstringstringstringstring
<module 'string' from 'C:\test\string.py'>
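If you would rather not leave a stray string.py in a real working directory, the same shadowing effect can be checked by locating the module without importing it. This is a sketch under assumptions: it requires a recent Python 3.X with importlib.machinery.PathFinder.find_spec, and it uses a temporary directory as a stand-in for the home directory at the front of the search path:

```python
import os
import sys
import tempfile
from importlib.machinery import PathFinder

with tempfile.TemporaryDirectory() as home:
    # Stand-in for the home directory, holding a same-named local module
    with open(os.path.join(home, 'string.py'), 'w') as f:
        f.write("print('string' * 8)\n")

    normal = PathFinder.find_spec('string', sys.path)              # usual search
    shadowed = PathFinder.find_spec('string', [home] + sys.path)   # home dir first

    normal_origin = normal.origin if normal is not None else None
    shadowed_origin = shadowed.origin
    local_wins = os.path.samefile(os.path.dirname(shadowed_origin), home)

print('normal search finds:  ', normal_origin)
print('home-first search finds:', shadowed_origin)
```

Because the module is only located and never loaded, this avoids both running the local file's print and any interference from a string module that the interpreter may already have imported.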

Imports within packages

Now, let’s get rid of the local string module we coded in the CWD and build a package directory there with two modules, including the required but empty test\pkg\__init__.py file (which I’ll omit here):

C:\test> del string*
C:\test> mkdir pkg

# test\pkg\spam.py
import eggs                    # <== Works in 2.6 but not 3.0!
print(eggs.X)

# test\pkg\eggs.py
X = 99999
import string
print(string)

The first file in this package tries to import the second with a normal import statement. Because this is taken to be relative in 2.6 but absolute in 3.0, it fails in the latter. That is, 2.6 searches the containing package first, but 3.0 does not. This is the incompatible behavior you have to be aware of in 3.0:

C:\test> c:\Python26\python
>>> import pkg.spam
<module 'string' from 'c:\Python26\lib\string.pyc'>
99999

C:\test> c:\Python30\python
>>> import pkg.spam
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pkg\spam.py", line 1, in <module>
    import eggs
ImportError: No module named eggs

To make this work in both 2.6 and 3.0, change the first file to use the special relative import syntax, so that its import searches the package directory in 3.0, too:

# test\pkg\spam.py
from . import eggs             # <== Use package relative import in 2.6 or 3.0
print(eggs.X)

# test\pkg\eggs.py
X = 99999
import string
print(string)

C:\test> c:\Python26\python
>>> import pkg.spam
<module 'string' from 'c:\Python26\lib\string.pyc'>
99999

C:\test> c:\Python30\python
>>> import pkg.spam
<module 'string' from 'c:\Python30\lib\string.py'>
99999
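The before-and-after behavior can also be reproduced in a single script. The following sketch assumes a Python 3.X interpreter and that no top-level module named eggs happens to be installed; the package names pkg_plain and pkg_dots are made up so that both variants can coexist in one temporary directory:

```python
import importlib
import os
import sys
import tempfile

def write(path, text=''):
    """Create a small source file (hypothetical helper for this demo)."""
    with open(path, 'w') as f:
        f.write(text)

with tempfile.TemporaryDirectory() as root:
    # Two packages that differ only in how spam.py imports its sibling eggs.py
    for name, import_line in (('pkg_plain', 'import eggs\n'),
                              ('pkg_dots', 'from . import eggs\n')):
        folder = os.path.join(root, name)
        os.mkdir(folder)
        write(os.path.join(folder, '__init__.py'))
        write(os.path.join(folder, 'eggs.py'), 'X = 99999\n')
        write(os.path.join(folder, 'spam.py'), import_line)

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        try:
            importlib.import_module('pkg_plain.spam')
            plain_ok = True
        except ImportError:
            plain_ok = False          # 3.X: the package itself is not searched
        dots = importlib.import_module('pkg_dots.spam')
        dots_value = dots.eggs.X
    finally:
        sys.path.remove(root)

print('plain import worked:', plain_ok)
print('relative import got X =', dots_value)
```

Under 3.X the plain import fails because eggs is not on sys.path, while the dotted form finds it in the package directory.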

Imports are still relative to the CWD

Notice in the preceding example that the package modules still have access to standard library modules like string. Really, their imports are still relative to the entries on the module search path, even if those entries are relative themselves. If you add a string module to the CWD again, imports in a package will find it there instead of in the standard library. Although you can skip the package directory with an absolute import in 3.0, you still can’t skip the home directory of the program that imports the package:

# test\string.py
print('string' * 8)

# test\pkg\spam.py
from . import eggs
print(eggs.X)

# test\pkg\eggs.py
X = 99999
import string                  # <== Gets string in CWD, not Python lib!
print(string)

C:\test> c:\Python30\python    # Same result in 2.6
>>> import pkg.spam
stringstringstringstringstringstringstringstring
<module 'string' from 'string.py'>
99999

Selecting modules with relative and absolute imports

To show how this applies to imports of standard library modules, reset the package one more time. Get rid of the local string module, and define a new one inside the package itself:

C:\test> del string*

# test\pkg\spam.py
import string                  # <== Relative in 2.6, absolute in 3.0
print(string)

# test\pkg\string.py
print('Ni' * 8)

Now, which version of the string module you get depends on which Python you use. As before, 3.0 interprets the import in the first file as absolute and skips the package, but 2.6 does not:

C:\test> c:\Python30\python
>>> import pkg.spam
<module 'string' from 'c:\Python30\lib\string.py'>

C:\test> c:\Python26\python
>>> import pkg.spam
NiNiNiNiNiNiNiNi
<module 'pkg.string' from 'pkg\string.py'>

Using relative import syntax in 3.0 forces the package to be searched again, as it is in 2.6—by using absolute or relative import syntax in 3.0, you can either skip or select the package directory explicitly. In fact, this is the use case that the 3.0 model addresses:

# test\pkg\spam.py
from . import string           # <== Relative in both 2.6 and 3.0
print(string)

# test\pkg\string.py
print('Ni' * 8)

C:\test> c:\Python30\python
>>> import pkg.spam
NiNiNiNiNiNiNiNi
<module 'pkg.string' from 'pkg\string.py'>

C:\test> c:\Python26\python
>>> import pkg.spam
NiNiNiNiNiNiNiNi
<module 'pkg.string' from 'pkg\string.py'>
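Both selections can be made side by side in one file. This sketch assumes Python 3.X rules and a made-up package named pkg_select, whose spam.py asks for string twice, once absolutely and once relatively:

```python
import importlib
import os
import sys
import tempfile

def write(path, text=''):
    """Create a small source file (hypothetical helper for this demo)."""
    with open(path, 'w') as f:
        f.write(text)

with tempfile.TemporaryDirectory() as root:
    folder = os.path.join(root, 'pkg_select')
    os.mkdir(folder)
    write(os.path.join(folder, '__init__.py'))
    write(os.path.join(folder, 'string.py'), "NI = 'Ni' * 8\n")
    write(os.path.join(folder, 'spam.py'),
          "import string as absolute\n"            # skips the package in 3.X
          "from . import string as relative\n")    # searches the package only

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        spam = importlib.import_module('pkg_select.spam')
    finally:
        sys.path.remove(root)

absolute_name = spam.absolute.__name__
relative_name = spam.relative.__name__
print(absolute_name, relative_name)
print(hasattr(spam.absolute, 'ascii_letters'), spam.relative.NI)
```

The two names refer to different modules: the absolute import yields the standard library's string, and the relative import yields the package's own file, whose full name includes the package.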

It’s important to note that relative import syntax is really a binding declaration, not just a preference. If we delete the string.py file in this example, the relative import in spam.py fails in both 3.0 and 2.6, instead of falling back on the standard library’s version of this module (or any other):

# test\pkg\spam.py
from . import string           # <== Fails if no string.py here!

C:\test> C:\python30\python
>>> import pkg.spam
...text omitted...
ImportError: cannot import name string

Modules referenced by relative imports must exist in the package directory.
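The lack of a fallback is easy to confirm in a script, too. The following sketch assumes a recent Python 3.X and a made-up package named pkg_missing that deliberately contains no string.py; the exact wording of the error message varies by release, so the code records it rather than assuming it:

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    folder = os.path.join(root, 'pkg_missing')
    os.mkdir(folder)
    with open(os.path.join(folder, '__init__.py'), 'w'):
        pass                                       # empty package marker
    with open(os.path.join(folder, 'spam.py'), 'w') as f:
        f.write('from . import string\n')          # no string.py in this package

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    failed, message = False, ''
    try:
        importlib.import_module('pkg_missing.spam')
    except ImportError as exc:
        failed, message = True, str(exc)
    finally:
        sys.path.remove(root)

print('relative import failed:', failed)
print(message)
```

Even though a perfectly good string module exists in the standard library, the relative import does not reach for it.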

Imports are still relative to the CWD (again)

Although absolute imports let you skip package modules, they still rely on other components of sys.path. For one last test, let’s define two string modules of our own. In the following, there is one module by that name in the CWD, one in the package, and another in the standard library:

# test\string.py
print('string' * 8)

# test\pkg\spam.py
from . import string           # <== Relative in both 2.6 and 3.0
print(string)

# test\pkg\string.py
print('Ni' * 8)

When we import the string module with relative import syntax, we get the version in the package, as desired:

C:\test> c:\Python30\python    # Same result in 2.6
>>> import pkg.spam
NiNiNiNiNiNiNiNi
<module 'pkg.string' from 'pkg\string.py'>

When absolute syntax is used, though, the module we get varies per version again. 2.6 interprets this as relative to the package, but 3.0 makes it “absolute,” which in this case really just means it skips the package and loads the version relative to the CWD (not the version in the standard library):

# test\string.py
print('string' * 8)

# test\pkg\spam.py
import string                  # <== Relative in 2.6, "absolute" in 3.0: CWD!
print(string)

# test\pkg\string.py
print('Ni' * 8)

C:\test> c:\Python30\python
>>> import pkg.spam
stringstringstringstringstringstringstringstring
<module 'string' from 'string.py'>

C:\test> c:\Python26\python
>>> import pkg.spam
NiNiNiNiNiNiNiNi
<module 'pkg.string' from 'pkg\string.pyc'>

As you can see, although packages can explicitly request modules within their own directories, their imports are otherwise still relative to the rest of the normal module search path. In this case, a file in the program using the package hides the standard library module the package may want. All that the change in 3.0 really accomplishes is allowing package code to select files either inside or outside the package (i.e., relatively or absolutely). Because import resolution can depend on an enclosing context that may not be foreseen, absolute imports in 3.0 are not a guarantee of finding a module in the standard library.
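The three-way contest just described can be staged without touching a real standard library. In the following sketch, which assumes Python 3.X rules, everything is hypothetical: a module named strdemo plays the part of string, a temporary home directory plays the program's container, and a temporary lib directory listed after it plays the standard library:

```python
import importlib
import os
import sys
import tempfile

def write(path, text=''):
    """Create a small source file (hypothetical helper for this demo)."""
    with open(path, 'w') as f:
        f.write(text)

with tempfile.TemporaryDirectory() as root:
    home = os.path.join(root, 'home')          # stand-in for the program's home
    lib = os.path.join(root, 'lib')            # stand-in for the standard library
    pkg = os.path.join(home, 'pkg_three')
    os.makedirs(pkg)
    os.mkdir(lib)
    write(os.path.join(home, 'strdemo.py'), "WHERE = 'home'\n")
    write(os.path.join(lib, 'strdemo.py'), "WHERE = 'lib'\n")
    write(os.path.join(pkg, '__init__.py'))
    write(os.path.join(pkg, 'strdemo.py'), "WHERE = 'package'\n")
    write(os.path.join(pkg, 'spam.py'),
          "import strdemo as absolute\n"
          "from . import strdemo as relative\n")

    sys.path[:0] = [home, lib]                 # home is searched before lib
    importlib.invalidate_caches()
    try:
        spam = importlib.import_module('pkg_three.spam')
    finally:
        sys.path.remove(home)
        sys.path.remove(lib)

absolute_where = spam.absolute.WHERE
relative_where = spam.relative.WHERE
print(absolute_where, relative_where)
```

The absolute import skips the package but stops at the home directory, never reaching the library stand-in; only the relative import selects the package's own copy.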

Experiment with these examples on your own for more insight. In practice, this is not usually as ad hoc as it might seem: you can generally structure your imports, search paths, and module names to work the way you wish during development. You should keep in mind, though, that imports in larger systems may depend upon context of use, and the module import protocol is part of a successful library’s design.

Note

Now that you’ve learned about package-relative imports, also keep in mind that they may not always be your best option. Absolute package imports, relative to a directory on sys.path, are still sometimes preferred over both implicit package-relative imports in Python 2, and explicit package-relative import syntax in both Python 2 and 3.

Package-relative import syntax and Python 3.0’s new absolute import search rules at least require relative imports from a package to be made explicit, and thus easier to understand and maintain. Files that use imports with dots, though, are implicitly bound to a package directory and cannot be used elsewhere without code changes.

Naturally, the extent to which this may impact your modules can vary per package; absolute imports may also require changes when directories are reorganized.

Chapter Summary

This chapter introduced Python’s package import model—an optional but useful way to explicitly list part of the directory path leading up to your modules. Package imports are still relative to a directory on your module import search path, but rather than relying on Python to traverse the search path manually, your script gives the rest of the path to the module explicitly.

As we’ve seen, packages not only make imports more meaningful in larger systems, but also simplify import search path settings (if all cross-directory imports are relative to a common root directory) and resolve ambiguities when there is more than one module of the same name (including the name of the enclosing directory in a package import helps distinguish between them).

Because it’s relevant only to code in packages, we also explored the newer relative import model here—a way for imports in package files to select modules in the same package using leading dots in a from, instead of relying on an older implicit package search rule.

In the next chapter, we will survey a handful of more advanced module-related topics, such as the __name__ usage mode variable. As usual, though, we’ll close out this chapter with a short quiz to test what you’ve learned here.

Test Your Knowledge: Quiz

  1. What is the purpose of an __init__.py file in a module package directory?

  2. How can you avoid repeating the full package path every time you reference a package’s content?

  3. Which directories require __init__.py files?

  4. When must you use import instead of from with packages?

  5. What is the difference between from mypkg import spam and from . import spam?

Test Your Knowledge: Answers

  1. The __init__.py file serves to declare and initialize a module package; Python automatically runs its code the first time you import through a directory in a process. Its assigned variables become the attributes of the module object created in memory to correspond to that directory. It is also not optional—you can’t import through a directory with package syntax unless it contains this file.

  2. Use the from statement with a package to copy names out of the package directly, or use the as extension with the import statement to rename the path to a shorter synonym. In both cases, the path is listed in only one place, in the from or import statement.

  3. Each directory listed in an import or from statement must contain an __init__.py file. Other directories, including the directory containing the leftmost component of a package path, do not need to include this file.

  4. You must use import instead of from with packages only if you need to access the same name defined in more than one path. With import, the path makes the references unique, but from allows only one version of any given name.

  5. from mypkg import spam is an absolute import: the search for mypkg skips the package directory and the module is located in an absolute directory in sys.path. A statement from . import spam, on the other hand, is a relative import: spam is looked up relative to the package in which this statement is contained only, and sys.path is not searched.
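Answers 1 and 2 can be checked with a short script. This is a sketch only, assuming a recent Python 3.X; the directory name quiz_dir1 is made up (standing in for dir1) and the package tree is built in a temporary directory:

```python
import importlib
import os
import sys
import tempfile

def write(path, text=''):
    """Create a small source file (hypothetical helper for this demo)."""
    with open(path, 'w') as f:
        f.write(text)

with tempfile.TemporaryDirectory() as root:
    outer = os.path.join(root, 'quiz_dir1')
    inner = os.path.join(outer, 'dir2')
    os.makedirs(inner)
    write(os.path.join(outer, '__init__.py'), "LOADED = 'dir1 init ran'\n")
    write(os.path.join(inner, '__init__.py'))
    write(os.path.join(inner, 'mod.py'), "x = 1\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        import quiz_dir1.dir2.mod as mod       # path listed once, short name after
        from quiz_dir1.dir2.mod import x       # or copy the name out directly
        import quiz_dir1
    finally:
        sys.path.remove(root)

init_attr = quiz_dir1.LOADED                   # assigned in __init__.py
print(mod.x, x, init_attr)
```

The name assigned in the outer __init__.py shows up as an attribute of the package object, and both the as extension and the from statement let later code avoid repeating the full path.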



[52] The dot path syntax was chosen partly for platform neutrality, but also because paths in import statements become real nested object paths. This syntax also means that you get odd error messages if you forget to omit the .py in your import statements. For example, import mod.py is assumed to be a directory path import: it loads mod.py, then tries to load a mod\py.py, and ultimately issues a potentially confusing “No module named py” error message.

[53] Yes, there will be a 2.7 release, and possibly 2.8 and later releases, in parallel with new releases in the 3.X line. As described in the Preface, both the Python 2 and Python 3 lines are expected to be fully supported for years to come, to accommodate the large existing Python 2 user and code bases.
