3
DOCUMENTATION AND GOOD API PRACTICE

In this chapter, we’ll discuss documentation; specifically, how to automate the trickier and more tedious aspects of documenting your project with Sphinx. While you will still have to write the documentation yourself, Sphinx will simplify your task. As it is common to provide features using a Python library, we’ll also look at how to manage and document your public API changes. Because your API will have to evolve as you make changes to its features, it’s rare to get everything built perfectly from the outset, but I’ll show you a few things you can do to ensure your API is as user-friendly as possible.

We’ll end this chapter with an interview with Christophe de Vienne, author of the Web Services Made Easy framework, in which he discusses best practices for developing and maintaining APIs.

Documenting with Sphinx

Documentation is one of the most important parts of writing software. Unfortunately, a lot of projects don’t provide proper documentation. Writing documentation is seen as complicated and daunting, but it doesn’t have to be: with the tools available to Python programmers, documenting your code can be just as easy as writing it.

One of the biggest reasons for sparse or nonexistent documentation is that many people assume the only way to document code is by hand. Even with multiple people on a project, this means one or more of your team will end up having to juggle contributing code with maintaining documentation—and if you ask any developer which job they’d prefer, you can be sure they’ll say they’d rather write software than write about software.

Sometimes the documentation process is completely separate from the development process, meaning that the documentation is written by people who did not write the actual code. Furthermore, any documentation produced this way is likely to be out-of-date: it’s almost impossible for manual documentation to keep up with the pace of development, regardless of who handles it.

Here’s the bottom line: the more degrees of separation between your code and your documentation, the harder it will be to keep the latter properly maintained. So why keep them separate at all? It’s not only possible to put your documentation directly in the code itself, but it’s also simple to convert that documentation into easy-to-read HTML and PDF files.

The most common format for Python documentation is reStructuredText, or reST for short. It’s a lightweight markup language (like Markdown) that’s as easy to read and write for humans as it is for computers. Sphinx is the most commonly used tool for working with this format; Sphinx can read reST-formatted content and output documentation in a variety of other formats.

I recommend that your project documentation always include the following:

  • The problem your project is intended to solve, in one or two sentences.

  • The license your project is distributed under. If your software is open source, you should also include this information in a header in each code file; just because you’ve uploaded your code to the Internet doesn’t mean that people will know what they’re allowed to do with it.

  • A small example of how your code works.

  • Installation instructions.

  • Links to community support, mailing list, IRC, forums, and so on.

  • A link to your bug tracker system.

  • A link to your source code so that developers can download and start delving into it right away.

You should also include a README.rst file that explains what your project does. This README should be displayed on your GitHub or PyPI project page; both sites know how to handle reST formatting.

NOTE

If you’re using GitHub, you can also add a CONTRIBUTING.rst file that will be displayed when someone submits a pull request. It should provide a checklist for users to follow before they submit the request, including things like whether your code follows PEP 8 and reminders to run the unit tests.

Read the Docs (http://readthedocs.org/) allows you to build and publish your documentation online automatically. Signing up and configuring a project is straightforward. Then Read the Docs searches for your Sphinx configuration file, builds your documentation, and makes it available for your users to access. It’s a great companion to code-hosting sites.

Getting Started with Sphinx and reST

You can get Sphinx from http://www.sphinx-doc.org/. There are installation instructions on the site, but the easiest method is to install with pip install sphinx.

Once Sphinx is installed, run sphinx-quickstart in your project’s top-level directory. This will create the directory structure that Sphinx expects to find, along with two files in the doc/source folder: conf.py, which contains Sphinx’s configuration settings (and is absolutely required for Sphinx to work), and index.rst, which serves as the front page of your documentation. Once you run the quick-start command, you’ll be taken through a series of steps to designate naming conventions, version conventions, and options for other useful tools and standards.

The conf.py file contains a few documented variables, such as the project name, the author, and the theme to use for HTML output. Feel free to edit this file at your convenience.
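As a rough illustration, a pared-down conf.py might look like the following. The project name, author, and version are placeholders rather than values from any real project, and alabaster is simply a theme that is installed alongside Sphinx; the file that sphinx-quickstart generates is longer and more thoroughly commented.

```python
# conf.py: a minimal sketch; every value below is a placeholder.
project = 'foobar'      # hypothetical project name
author = 'Jane Doe'     # hypothetical author
release = '0.1.0'       # full version string shown in the docs

# Extensions are listed by their importable module name.
extensions = []

# Theme used by the HTML builder.
html_theme = 'alabaster'
```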

Once you’ve built your structure and set your defaults, you can build your documentation in HTML by calling sphinx-build with your source directory and output directory as arguments, as shown in Listing 3-1. The command sphinx-build reads the conf.py file from the source directory and parses all the .rst files from this directory. It renders them in HTML in the output directory.

$ sphinx-build doc/source doc/build
  import pkg_resources
Running Sphinx v1.2b1
loading pickled environment... done
No builder selected, using default: html
building [html]: targets for 1 source files that are out of date
updating environment: 0 added, 0 changed, 0 removed
looking for now-outdated files... none found
preparing documents... done
writing output... [100%] index
writing additional files... genindex search
copying static files... done
dumping search index... done
dumping object inventory... done
build succeeded.

Listing 3-1: Building a basic Sphinx HTML document

Now you can open doc/build/index.html in your favorite browser and read your documentation.

NOTE

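If you’re using setuptools or pbr (see Chapter 5) for packaging, Sphinx extends them to support the command setup.py build_sphinx, which will run sphinx-build automatically. The pbr integration of Sphinx has some saner defaults, such as outputting the documentation in the /doc subdirectory. This command belongs to older Sphinx releases; the setuptools integration was later deprecated and then removed, so on a recent Sphinx you should call sphinx-build directly.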

Your documentation begins with the index.rst file, but it doesn’t have to end there: reST supports include directives to include reST files from other reST files, so there’s nothing stopping you from dividing your documentation into multiple files. Don’t worry too much about syntax and semantics to start; reST offers a lot of formatting possibilities, but you’ll have plenty of time to dive into the reference later. The complete reference (http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html) explains how to create titles, bulleted lists, tables, and more.

Sphinx Modules

Sphinx is highly extensible: its basic functionality supports only manual documentation, but it comes with a number of useful modules that enable automatic documentation and other features. For example, sphinx.ext.autodoc extracts reST-formatted docstrings from your modules and generates .rst files for inclusion. This is one of the options sphinx-quickstart will ask if you want to activate. If you didn’t select that option, however, you can still edit your conf.py file and add it as an extension like so:

extensions = ['sphinx.ext.autodoc']

Note that autodoc will not automatically recognize and include your modules. You need to explicitly indicate which modules you want documented by adding something like Listing 3-2 to one of your .rst files.

   .. automodule:: foobar
     :members:
     :undoc-members:
     :show-inheritance:

Listing 3-2: Indicating the modules for autodoc to document

In Listing 3-2, we make three requests, all of which are optional: that all documented members be printed (:members:), that all undocumented members be printed (:undoc-members:), and that inheritance be shown (:show-inheritance:). Also note the following:

  • If you don’t include any directives, Sphinx won’t generate any output.

  • If you only specify :members:, undocumented nodes on your module, class, or method tree will be skipped, even if all their members are documented. For example, if you document the methods of a class but not the class itself, :members: will exclude both the class and its methods. To keep this from happening, you’d have to write a docstring for the class or specify :undoc-members: as well.

  • Your module needs to be where Python can import it. Adding ., .., and/or ../.. to sys.path can help.
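The import-path point in the last item can be sketched as a conf.py fragment. It assumes the layout used in this chapter, with conf.py in doc/source and the importable package two directories up; adjust the relative path if your layout differs.

```python
# conf.py fragment: assumes conf.py lives in doc/source, so the
# project root (where the package can be imported from) is two levels up.
import os
import sys

# Sphinx evaluates conf.py from its own directory, so the relative
# path below is resolved against doc/source.
sys.path.insert(0, os.path.abspath('../..'))
```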

The autodoc extension gives you the power to include most of your documentation in your source code. You can even pick and choose which modules and methods to document—it’s not an “all-or-nothing” solution. By maintaining your documentation directly alongside your source code, you can easily ensure it stays up to date.
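To make this concrete, here is a sketch of what a module documented for autodoc might look like. The module name foobar matches the automodule example, but the Greeter class and its method are invented purely for illustration.

```python
# foobar.py: hypothetical module matching the automodule example.
"""Utilities for greeting people."""


class Greeter(object):
    """Build greetings for a given name.

    :param name: The name of the person to greet.
    :type name: str
    """

    def __init__(self, name):
        self.name = name

    def greet(self, punctuation='!'):
        """Return a greeting string.

        :param punctuation: The character appended to the greeting.
        :type punctuation: str
        :returns: The complete greeting.
        :rtype: str
        """
        return 'Hello, {}{}'.format(self.name, punctuation)
```

Because both the class and the method carry docstrings, :members: alone would be enough to include them in the generated pages.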

Automating the Table of Contents with autosummary

If you’re writing a Python library, you’ll usually want to format your API documentation with a table of contents containing links to individual pages for each module.

The sphinx.ext.autosummary module was created specifically to handle this common use case. First, you need to enable it in your conf.py by adding the following line:

extensions = ['sphinx.ext.autosummary']

Then, you can add something like the following to an .rst file to automatically generate a table of contents for the specified modules:

.. autosummary::

   mymodule
   mymodule.submodule

This will create files called generated/mymodule.rst and generated/mymodule.submodule.rst containing the autodoc directives described earlier. Using this same format, you can specify which parts of your module API you want included in your documentation.
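As far as I know, the stub pages are only written when the autosummary directive carries a :toctree: option naming the output directory (generated in this case) and stub generation is switched on. A conf.py fragment along these lines should cover the configuration side; autodoc is listed explicitly here only for clarity, since autosummary builds on it.

```python
# conf.py fragment: a sketch of the settings autosummary relies on.
extensions = [
    'sphinx.ext.autodoc',       # autosummary stubs contain autodoc directives
    'sphinx.ext.autosummary',
]

# Ask Sphinx to write the stub .rst files during the build.
autosummary_generate = True
```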

NOTE

The sphinx-apidoc command can automatically create these files for you; check out the Sphinx documentation to find out more.

Automating Testing with doctest

Another useful feature of Sphinx is the ability to run doctest on your examples automatically when you build your documentation. The standard Python doctest module searches your documentation for code snippets and tests whether they accurately reflect what your code does. Every paragraph starting with the primary prompt >>> is treated as a code snippet to test. For example, if you wanted to document the standard print function from Python, you could write this documentation snippet and doctest would check the result:

To print something to the standard output, use the :py:func:`print`
function:

>>> print("foobar")
foobar

Having such examples in your documentation lets users understand your API. However, it’s easy to put off and eventually forget to update your examples as your API evolves. Fortunately, doctest helps make sure this doesn’t happen. If your documentation includes a step-by-step tutorial, doctest will help you keep it up to date throughout development by testing every line it can.
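The same mechanism works outside Sphinx, which is a convenient way to see what doctest actually checks. The sketch below uses only the standard doctest module; the add function and the run_examples helper are made up for this illustration.

```python
import doctest


def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    >>> add('foo', 'bar')
    'foobar'
    """
    return a + b


def run_examples(func):
    """Run the doctest examples in func's docstring and return the results."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # The examples need the function itself to be visible by name.
    globs = {func.__name__: func}
    for test in finder.find(func, name=func.__name__, globs=globs):
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_examples(add)
# results.failed == 0 and results.attempted == 2
```

If the body of add were changed so that an example no longer matched, results.failed would become nonzero, which is exactly the signal that the documentation has drifted from the code.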

You can also use doctest for documentation-driven development (DDD): write your documentation and examples first and then write code to match your documentation. Taking advantage of this feature is as simple as running sphinx-build with the special doctest builder, like this:

$ sphinx-build -b doctest doc/source doc/build
Running Sphinx v1.2b1
loading pickled environment... done
building [doctest]: targets for 1 source files that are out of date
updating environment: 0 added, 0 changed, 0 removed
looking for now-outdated files... none found
running tests...

Document: index
---------------
1 items passed all tests:
   1 tests in default
1 tests in 1 items.
1 passed and 0 failed.
Test passed.

Doctest summary
===============
    1 test
    0 failures in tests
    0 failures in setup code
    0 failures in cleanup code
build succeeded.

When using the doctest builder, Sphinx reads the usual .rst files and executes code examples that are contained in those files.
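The doctest builder comes from the sphinx.ext.doctest extension, so it has to be enabled in conf.py before the command above will work. A minimal fragment might look like this; the foobar import in the setup code is a placeholder for your own package.

```python
# conf.py fragment: enable the doctest builder.
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.doctest',
]

# Code run before each tested document; 'foobar' is a hypothetical package.
doctest_global_setup = 'import foobar'
```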

Sphinx also provides a bevy of other features, either out of the box or through extension modules, including these:

  • Linking between projects

  • HTML themes

  • Diagrams and formulas

  • Output to Texinfo and EPUB format

  • Linking to external documentation

You might not need all this functionality right away, but if you ever need it in the future, it’s good to know about in advance. Again, check out the full Sphinx documentation to find out more.
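As one example of these features, linking to another project's documentation is handled by the sphinx.ext.intersphinx extension. The fragment below is a sketch that points at the Python standard library documentation; the key 'python' is just a label you choose.

```python
# conf.py fragment: link to objects documented in another project.
extensions = ['sphinx.ext.intersphinx']

# Maps a label to (base URL, inventory location); None means
# "fetch objects.inv from the base URL".
intersphinx_mapping = {
    'python': ('https://docs.python.org/3', None),
}
```

With this in place, a role such as :py:func:`print` resolves to the corresponding page in the Python documentation.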

Writing a Sphinx Extension

Sometimes off-the-shelf solutions just aren’t enough and you need to create custom tools to deal with a situation.

Say you’re writing an HTTP REST API. Sphinx will only document the Python side of your API, forcing you to write your REST API documentation by hand, with all the problems that entails. The creators of Web Services Made Easy (WSME) (interviewed at the end of this chapter) have come up with a solution: a Sphinx extension called sphinxcontrib-pecanwsme that analyzes docstrings and actual Python code to generate REST API documentation automatically.

NOTE

For other HTTP frameworks, such as Flask, Bottle, and Tornado, you can use sphinxcontrib.httpdomain.

My point is that whenever you know you could extract information from your code to build documentation, you should, and you should also automate the process. This is better than trying to maintain manually written documentation, especially when you can leverage auto-publication tools such as Read the Docs.

We’ll examine the sphinxcontrib-pecanwsme extension as an example of writing your own Sphinx extension. The first step is to write a module—preferably as a submodule of sphinxcontrib, as long as your module is generic enough—and pick a name for it. Sphinx requires this module to have one predefined function called setup(app), which contains the methods you’ll use to connect your code to Sphinx events and directives. The full list of methods is available in the Sphinx extension API at http://www.sphinx-doc.org/en/master/extdev/appapi.html.

For example, the sphinxcontrib-pecanwsme extension includes a single directive called rest-controller, added using the setup(app) function. This added directive needs a fully qualified controller class name to generate documentation for, as shown in Listing 3-3.

def setup(app):
    app.add_directive('rest-controller', RESTControllerDirective)

Listing 3-3: Code from sphinxcontrib.pecanwsme.rest.setup that adds the rest-controller directive

The add_directive method in Listing 3-3 registers the rest-controller directive and delegates its handling to the RESTControllerDirective class. This RESTControllerDirective class exposes certain attributes that indicate how the directive treats content, whether it has arguments, and so on. The class also implements a run() method that actually extracts the documentation from your code and returns parsed data to Sphinx.

The repository at https://bitbucket.org/birkenfeld/sphinx-contrib/src/ has many small modules that can help you develop your own extensions.

NOTE

Even though Sphinx is written in Python and targets it by default, extensions are available that allow it to support other languages as well. You can use Sphinx to document your project in full, even if it uses multiple languages at once.

As another example, in one of my projects named Gnocchi—a database for storing and indexing time series data at a large scale—I’ve used a custom Sphinx extension to autogenerate documentation. Gnocchi provides a REST API, and usually to document such an API, projects will manually write examples of what an API request and its response should look like. Unfortunately, this approach is error prone and out of sync with reality.

Using the unit-testing code available to test the Gnocchi API, we built a Sphinx extension to run Gnocchi and generate an .rst file containing HTTP requests and responses run against a real Gnocchi server. In this way, we ensure the documentation is up to date: the server responses are not manually crafted, and if a manually written request fails, then the documentation process fails, and we know that we must fix the documentation.

Including that code in the book would be too verbose, but you can check the sources of Gnocchi online and look at the gnocchi.gendoc module to get an idea of how it works.

Managing Changes to Your APIs

Well-documented code is a sign to other developers that the code is suitable to be imported and used to build something else. When building a library and exporting an API for other developers to use, for example, you want to provide the reassurance of solid documentation.

This section will cover best practices for public APIs. These will be exposed to users of your library or application, and while you can do whatever you like with internal APIs, public APIs should be handled with care.

To distinguish between public and private APIs, the Python convention is to prefix the symbol for a private API with an underscore: foo is public, but _bar is private. You should use this convention both to recognize whether another API is public or private and to name your own APIs. In contrast to other languages, such as Java, Python does not enforce any restriction on accessing code marked as private or public. The naming conventions are just to facilitate understanding among programmers.

Numbering API Versions

When properly constructed, the version number of an API can give users a great deal of information. Python has no particular system or convention in place for numbering API versions, but we can take inspiration from Unix platforms, which use a complex management system for libraries with fine-grained version identifiers.

Generally, your version numbering should reflect changes in the API that will impact users. For example, when the API has a major change, the major version number might change from 1 to 2. When only a few new API calls are added, the minor version number might go from 2.2 to 2.3. If a change only involves bug fixes, the patch number might bump from 2.2.0 to 2.2.1. A good example of how to use version numbering is the Python requests library (https://pypi.python.org/pypi/requests/). This library increments its API numbers based on the number of changes in each new version and the impact the changes might have on consuming programs.

Version numbers hint to developers that they should look at changes between two releases of a library, but alone they are not enough to fully guide a developer: you must provide detailed documentation to describe those changes.
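The numbering scheme described above can be sketched as a small helper. This is only an illustration of the major/minor/patch idea, assuming plain three-part version strings; the function name bump is made up and it is not a standard tool.

```python
def bump(version, change):
    """Return the next version string for a 'major.minor.patch' version.

    change is one of 'major' (incompatible API change), 'minor'
    (new API calls added), or 'patch' (bug fixes only).
    """
    major, minor, patch = (int(part) for part in version.split('.'))
    if change == 'major':
        return '{}.0.0'.format(major + 1)
    if change == 'minor':
        return '{}.{}.0'.format(major, minor + 1)
    if change == 'patch':
        return '{}.{}.{}'.format(major, minor, patch + 1)
    raise ValueError('unknown change type: {!r}'.format(change))
```

For instance, bump('2.2.0', 'patch') gives '2.2.1', while bump('2.2.0', 'minor') gives '2.3.0'.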

Documenting Your API Changes

Whenever you make changes to an API, the first and most important thing to do is to heavily document them so that a consumer of your code can get a quick overview of what’s changing. Your document should cover the following:

  • New elements of the new interface

  • Elements of the old interface that are deprecated

  • Instructions on how to migrate to the new interface

You should also make sure that you don’t remove the old interface right away. I recommend keeping the old interface until it becomes too much trouble to do so. If you have marked it as deprecated, users will know not to use it.

Listing 3-4 is an example of good API change documentation for code that provides a representation of a car object that can turn in any direction. For whatever reason, the developers decided to deprecate the turn_left method and instead provide a generic turn method that can take the direction as an argument.

class Car(object):

    def turn_left(self):
        """Turn the car left.

        .. deprecated:: 1.1
           Use :func:`turn` instead with the direction argument set to left
        """
        self.turn(direction='left')

    def turn(self, direction):
        """Turn the car in some direction.

        :param direction: The direction to turn to.
        :type direction: str
        """
        # Write actual code for the turn function here instead
        pass

Listing 3-4: An example of API change documentation for a car object

The triple quotes here, """, indicate the start and end of the docstrings, which will be displayed when the user calls help(Car.turn_left) in the interpreter or pulled into the documentation by an external tool such as Sphinx. The deprecation of the Car.turn_left method is indicated by .. deprecated:: 1.1, where 1.1 refers to the first version released that ships this code as deprecated.

Using this deprecation method and making it visible via Sphinx clearly tells users that the function should not be used and gives them direct access to the new function along with an explanation of how to migrate old code.

Figure 3-1 shows Sphinx documentation that explains some deprecated functions.

Figure 3-1: Explanation of some deprecated functions

The downside of this approach is that it relies on developers reading your changelog or documentation when they upgrade to a newer version of your Python package. However, there is a solution for that: mark your deprecated functions with the warnings module.

Marking Deprecated Functions with the warnings Module

Though deprecated functions should be marked well enough in documentation that users will not attempt to call them, Python also provides the warnings module, which allows your code to issue various kinds of warnings when a deprecated function is called. These warnings, DeprecationWarning and PendingDeprecationWarning, can be used to tell the developer that a function they’re calling is deprecated or going to be deprecated, respectively.

NOTE

For those who work with C, this is a handy counterpart to the __attribute__ ((deprecated)) GCC extension.

To go back to the car object example in Listing 3-4, we can use this to warn users when they are attempting to call deprecated functions, as shown in Listing 3-5.

import warnings

class Car(object):
    def turn_left(self):
        """Turn the car left.

        .. deprecated:: 1.1
           Use :func:`turn` instead with the direction argument set to "left".
        """
        warnings.warn("turn_left is deprecated; use turn instead",
                      DeprecationWarning)
        self.turn(direction='left')

    def turn(self, direction):
        """Turn the car in some direction.

        :param direction: The direction to turn to.
        :type direction: str
        """
        # Write actual code here instead
        pass

Listing 3-5: A documented change to the car object API using the warnings module

Here, the turn_left function has been deprecated. By adding the warnings.warn line, we can write our own warning message. Now, if any code should call the turn_left function, a warning will appear that looks like this:

>>> Car().turn_left()
__main__:8: DeprecationWarning: turn_left is deprecated; use turn instead

Since Python 2.7 and 3.2, DeprecationWarning is ignored by default, so these warnings are usually not printed; Python 3.7 and later make an exception for warnings triggered directly by code in __main__, as in the interactive session above. To see those warnings printed everywhere, you need to pass the -W option to the Python executable. The option -W all will print all warnings to stderr. See the Python man page for more information on the possible values for -W.

When running test suites, developers can run Python with the -W error option, which will raise an error every time an obsolete function is called. Developers using your library can readily find exactly where their code needs to be fixed. Listing 3-6 shows how Python transforms warnings into fatal exceptions when Python is called with the -W error option.

>>> import warnings
>>> warnings.warn("This is deprecated", DeprecationWarning)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
DeprecationWarning: This is deprecated

Listing 3-6: Running Python with the -W error option and getting a deprecation error
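The same conversion can be done from inside a program, scoped to a single block, using the filter functions of the warnings module. The following is a sketch; old_api is a made-up function standing in for any deprecated call.

```python
import warnings


def old_api():
    """Hypothetical deprecated function."""
    warnings.warn("old_api is deprecated", DeprecationWarning, stacklevel=2)


caught = None
with warnings.catch_warnings():
    # Same effect as -W error, but only inside this block; the
    # previous filters are restored when the block exits.
    warnings.simplefilter("error")
    try:
        old_api()
    except DeprecationWarning as exc:
        caught = str(exc)
```

After the block runs, caught holds the warning message, because the warning was raised as an exception instead of being printed or ignored.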

Warnings are usually missed at runtime, and running a production system with the -W error option is rarely a good idea. Running the test suite of a Python application with the -W error option, on the other hand, can be a good way to catch warnings and fix them early on.
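Library authors can also check from their own test suite that a deprecated function really does warn. Here is a sketch using unittest's assertWarns; the Car class is a trimmed-down version of the earlier listing, and turn returns its argument only so that the test has something to check.

```python
import unittest
import warnings


class Car(object):
    def turn_left(self):
        warnings.warn("turn_left is deprecated; use turn instead",
                      DeprecationWarning)
        return self.turn(direction='left')

    def turn(self, direction):
        # Placeholder body: return the direction so it can be asserted on.
        return direction


class TestCarDeprecation(unittest.TestCase):
    def test_turn_left_warns(self):
        # Fails if no DeprecationWarning is emitted inside the block.
        with self.assertWarns(DeprecationWarning):
            result = Car().turn_left()
        self.assertEqual(result, 'left')
```

Running this with python -m unittest will fail as soon as someone removes the warning by accident.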

However, manually writing all those warnings, docstring updates, and so on can become tedious, so the debtcollector library has been created to help automate some of that. The debtcollector library provides a few decorators that you can use with your functions to make sure the correct warnings are emitted and the docstring is updated correctly. Listing 3-7 shows how you can, with a simple decorator, indicate that a function has been moved to some other place.

from debtcollector import moves

class Car(object):
    @moves.moved_method('turn', version='1.1')
    def turn_left(self):
        """Turn the car left."""
        return self.turn(direction='left')

    def turn(self, direction):
        """Turn the car in some direction.

        :param direction: The direction to turn to.
        :type direction: str
        """

        # Write actual code here instead
        pass

Listing 3-7: An API change automated with debtcollector

Here we’re using the moves module from debtcollector, whose moved_method decorator makes turn_left emit a DeprecationWarning whenever it’s called.
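To get a feel for what such a decorator does, here is a hand-rolled sketch. It is not debtcollector's implementation; the decorator name moved, the message wording, and the docstring suffix are all assumptions made for illustration.

```python
import functools
import warnings


def moved(new_name, version):
    """Hypothetical decorator: warn that a callable moved to new_name."""
    def decorator(func):
        message = "{}() is deprecated since version {}; use {}() instead".format(
            func.__name__, version, new_name)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller's line
            # rather than to this wrapper.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        # Keep the docstring in step with the warning.
        wrapper.__doc__ = (func.__doc__ or '') + (
            "\n\n.. deprecated:: {}\n   Use :func:`{}` instead.\n".format(
                version, new_name))
        return wrapper
    return decorator


class Car(object):
    @moved('turn', version='1.1')
    def turn_left(self):
        """Turn the car left."""
        return self.turn(direction='left')

    def turn(self, direction):
        """Turn the car in some direction."""
        # Placeholder body: return the direction for demonstration.
        return direction
```

Both the runtime warning and the deprecation note in the docstring now come from a single line above the method, which is the main appeal of the decorator approach.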

Summary

Sphinx is the de facto standard for documenting Python projects. It supports a wide variety of syntax, and it is easy to add new syntax or features if your project has particular needs. Sphinx can also automate tasks such as generating indexes or extracting documentation from your code, making it easy to maintain documentation in the long run.

Documenting changes to your API is critical, especially when you deprecate functionality, so that users are not caught unawares. Ways to document deprecations include the Sphinx deprecated directive and the warnings module, and the debtcollector library can automate maintaining this documentation.

Christophe de Vienne on Developing APIs

Christophe is a Python developer and the author of the WSME (Web Services Made Easy) framework, which allows developers to define web services in a Pythonic way and supports a wide variety of APIs, allowing it to be plugged into many other web frameworks.

What mistakes do developers tend to make when designing a Python API?

There are a few common mistakes I avoid when designing a Python API by following these rules:

  • Don’t make it too complicated. Keep it simple. Complicated APIs are hard to understand and hard to document. While the actual library functionality doesn’t have to be simple as well, it’s smart to make it simple so users can’t easily make mistakes. For example, the library is very simple and intuitive, but it does complex things behind the scenes. The urllib API, by contrast, is almost as complicated as the things it does, making it hard to use.

  • Make the magic visible. When your API does things that your documentation doesn’t explain, your end users will want to crack open your code and see what’s going on under the hood. It’s okay if you’ve got some magic happening behind the scenes, but your end users should never see anything unexpected happening up front, or they could become confused or rely on a behavior that may change.

  • Don’t forget use cases. When you’re so focused on writing code, it’s easy to forget to think about how your library will actually be used. Thinking up good use cases makes it easier to design an API.

  • Write unit tests. TDD (test-driven development) is a very efficient way to write libraries, especially in Python, because it forces the developer to assume the role of the end user from the very beginning, which leads the developer to design for usability. It’s the only approach I know of that allows a programmer to completely rewrite a library, as a last resort.

What aspects of Python may affect how easy it is to design a library API?

Python has no built-in way to define which sections of the API are public and which are private, which can be both a problem and an advantage.

It’s a problem because it can lead the developer to not fully consider which parts of their API are public and which parts should remain private. But with a little discipline, documentation, and (if needed) tools like zope.interface, it doesn’t stay a problem for long.

It’s an advantage when it makes it quicker and easier to refactor APIs while keeping compatibility with previous versions.

What do you consider when thinking about your API’s evolution, deprecation, and removal?

There are several criteria I weigh when making any decision regarding API development:

  • How difficult will it be for users of the library to adapt their code? Considering that there are people relying on your API, any change you make has to be worth the effort needed to adopt it. This rule is intended to prevent incompatible changes to the parts of the API that are in common use. That said, one of the advantages of Python is that it’s relatively easy to refactor code to adopt an API change.

  • How easy will it be to maintain my API? Simplifying the implementation, cleaning up the codebase, making the API easier to use, having more complete unit tests, making the API easier to understand at first glance . . . all of these things will make your life as a maintainer easier.

  • How can I keep my API consistent when applying a change? If all the functions in your API follow a similar pattern (such as requiring the same parameter in the first position), make sure new functions follow that pattern as well. Also, doing too many things at once is a great way to end up doing none of them right: keep your API focused on what it’s meant to do.

  • How will users benefit from the change? Last but not least, always consider the users’ point of view.

What advice do you have regarding API documentation in Python?

Good documentation makes it easy for newcomers to adopt your library. Neglecting it will drive away a lot of potential users—not just beginners, either. The problem is, documenting is difficult, so it gets neglected all the time!

  • Document early and include your documentation build in continuous integration. With the Read the Docs tool for creating and hosting documentation, there’s no excuse for not having documentation built and published (at least for open source software).

  • Use docstrings to document classes and functions in your API. If you follow the PEP 257 (https://www.python.org/dev/peps/pep-0257/) guidelines, developers won’t have to read your source to understand what your API does. Generate HTML documentation from your docstrings—and don’t limit it to the API reference.

  • Give practical examples throughout. Have at least one “startup guide” that will show newcomers how to build a working example. The first page of the documentation should give a quick overview of your API’s basic and representative use case.

  • Document the evolution of your API in detail, version by version. Version control system (VCS) logs are not enough!

  • Make your documentation accessible and, if possible, comfortable to read. Your users need to be able to find it easily and get the information they need without feeling like they’re being tortured. Publishing your documentation through PyPI is one way to achieve this; publishing on Read the Docs is also a good idea, since users will expect to find your documentation there.

  • Finally, choose a theme that is both efficient and attractive. I chose the “Cloud” Sphinx theme for WSME, but there are plenty of other themes out there to choose from. You don’t have to be a web expert to produce nice-looking documentation.
