Generating documentation with sphinx

Documentation is king when it comes to supporting consumers of your code and convincing newcomers that it actually makes sense to buy in and use your package. For most people, a documentation website is the first place they go to learn about the package. It is, by definition, assumed to be the single source of truth on the code in its current version. 

The role of documentation is usually threefold:

  • Explain how to install your package and what the general requirements are (for example, which Python versions are supported)
  • Show how to use the package (preferably with a quick example showing its immediate value)
  • Express the general idea and philosophy of the package

A documentation website does benefit from having tutorials, example cases, and a roadmap. With that being said, the core of any documentation website is, obviously, documentation—lists of all functions, classes, and modules, for instance, with explanations of what they do, how to use them, and which variables to pass. 

Now, it may sound like a large task on its own (and it generally is), but there are tools to make this bearable—especially the code documentation part. Remember how we tasked you with writing docstrings in Chapter 3, Functions? Now is the payday—those docstrings can (and will, in a minute!) be used to form documentation. Cool, huh? In order to generate a static website with documentation, we'll use sphinx—a Python package and a tool that is designed to build documentation.

Let's give it a try. First, go to your package's root directory in a terminal. Assuming sphinx is installed, run the following command:

sphinx-quickstart ./docs --ext-autodoc --ext-coverage

Here, we pass a docs folder as the place to store everything related to documentation (there will be a lot of files, so you'd better separate them from the package itself). The two parameters that we've passed tell sphinx to enable two built-in extensions for our project: autodoc and coverage. The first is the piece of code that will utilize your docstrings. The other calculates the overall documentation coverage of your code (that is, the share of functions, modules, and classes that have docstrings). Don't confuse this with test coverage.
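To make "documentation coverage" concrete, here is a rough sketch of the idea in plain Python. It is not how sphinx.ext.coverage is implemented; docstring_coverage is our own illustrative helper, and it only counts the public functions and classes defined in a single module:

```python
import inspect
import types


def docstring_coverage(module):
    """Return (documented, total) for public functions and classes in module."""
    documented = 0
    total = 0
    for name, obj in vars(module).items():
        # Names starting with an underscore are treated as private and skipped.
        if name.startswith("_"):
            continue
        if not (inspect.isfunction(obj) or inspect.isclass(obj)):
            continue
        # Ignore objects that were merely imported into the module.
        if getattr(obj, "__module__", None) != module.__name__:
            continue
        total += 1
        if (obj.__doc__ or "").strip():
            documented += 1
    return documented, total
```

You could call it on any imported module and divide the two numbers to get a percentage.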

Next, sphinx-quickstart will ask you a series of questions. The default values are pretty good, so there's no need to change them. Besides, everything can be changed later, or you can always delete the docs folder and re-run the script, if you want.

Once that's all done, the script will generate all the settings necessary for the tool to run. Now you can run the tool manually, like this (replacing directories as you wish):

$ sphinx-build -b html <sourcedir> <builddir>
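If you would rather trigger the build from Python (for example, from a helper script), the same command can be assembled with the standard library. This is only a sketch: it relies on Sphinx also being runnable as python -m sphinx, the function names are ours, and the default directories assume the docs/source and docs/build/html layout used in this section:

```python
import subprocess
import sys


def sphinx_build_command(source_dir="docs/source",
                         build_dir="docs/build/html",
                         builder="html"):
    """Return the argument list for `sphinx-build -b <builder> <source> <build>`."""
    # Using the current interpreter ensures the Sphinx from this environment runs.
    return [sys.executable, "-m", "sphinx", "-b", builder, source_dir, build_dir]


def build_docs(**kwargs):
    """Run the build; raises CalledProcessError if Sphinx exits with an error."""
    subprocess.run(sphinx_build_command(**kwargs), check=True)
```

Calling build_docs() from the repository root would then behave like the command above, provided sphinx is installed.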

If you agreed to create a Makefile, sphinx-quickstart has already written the directories you picked into it. Here is how to run it:

$ cd docs
$ make html

It should be noted that in the wikiwwii package, we copied and pasted part of the code from docs/Makefile to the Makefile in the root, to avoid going in and out of the folder.

We're focusing on web page documentation (HTML), but Sphinx can also generate documentation in other formats, including PDF, JSON, and LaTeX. 

If everything goes as expected, a (mostly empty) documentation package will be generated under docs/build/html. You can open the files in the browser, or spin a simple server (we use VS Code's Live Server plugin). The index (root) file is generated from a corresponding file, docs/source/index.rst. In order to add more content, just edit this file. For example, let's add a small introduction, which will then be shown at the beginning of the web page:

`wikiwwii` is a package that collects and processes data on WWII battles from Wikipedia. The list of all battles is taken from Battles_.

The trailing underscore in Battles_ is rst syntax for a hyperlink reference. At the bottom of the page, we define the target it points to, which is the Wikipedia URL:

.. _Battles: https://en.wikipedia.org/wiki/List_of_World_War_II_battles

Once the file is saved, we re-generate the page and the text should appear.
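If you don't use an editor plugin to preview the result, Python's standard library can serve the generated files. The following is a minimal sketch; make_docs_server is our own helper name, and the default path assumes the docs/build/html output folder mentioned earlier:

```python
import functools
import http.server


def make_docs_server(html_dir="docs/build/html", port=8000):
    """Create (but do not start) a local HTTP server for the built pages."""
    # The `directory` argument makes the handler serve files from html_dir
    # instead of the current working directory.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=html_dir
    )
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


# To start serving, call serve_forever() and stop it with Ctrl+C:
# make_docs_server().serve_forever()
```

With the server running, the documentation should be reachable at http://127.0.0.1:8000 in your browser.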

By default, sphinx uses the reStructuredText (rst) format. If you prefer Markdown (which is a similar but simpler and more limited format), follow these instructions: https://www.sphinx-doc.org/en/master/usage/markdown.html.

Now, let's focus on our main task—showing our Python documentation on this site. To do that, we first need to do a few more tweaks:

  1. First, open the conf.py file and uncomment the code after -- Path setup --. This path needs to point at the root directory of your repository, where your Python package lives. Here is how it will look in our case:

     import os
     import sys
     sys.path.insert(0, os.path.abspath('../..'))

  2. Next, go to the extensions section and add one more extension: sphinx.ext.napoleon. It only does one thing: it lets sphinx parse docstrings that are not written in rst, such as NumPy or Google style. It is a subjective choice, but we're not fans of rst-style docstrings. Here is how the section will look afterward:

     extensions = [
         'sphinx.ext.autodoc',
         'sphinx.ext.coverage',
         'sphinx.ext.napoleon',
     ]

  3. Finally, let's add an autodoc directive to our web pages. Here is the one for the geocoding file (yes, we'll need to add one per file, but it actually makes sense; you wouldn't want to get everything on the same page):

     .. automodule:: wikiwwii.parse.geocode
        :members:

Upon building, this should result in a list of functions from that file, such as geocode_location and extract_latlon. Before we move on, there are a few things worth mentioning regarding the docstrings and the code itself:

  1. First of all, while sphinx won't fail on bad docstring formatting, it won't be able to format them nicely either. Please check your docstrings and format them in one of the styles that Napoleon supports: NumPy or Google style.
  2. As you might have noticed, there is no documentation for some functions, such as _get_dom from the collect.fronts module, even though we do have a small docstring for that function! That's because its name starts with an underscore, which, according to a widespread convention, means that the function is private (not meant to be used directly by module consumers) and sphinx respects that. It is a great feature, allowing you to declutter your documentation. Use it wisely!
  3. In the original version of the code for wikiwwii, we passed a few dictionaries as default arguments for some functions. While code-wise this is okay, sphinx tries to print their values in the documentation and the result is hard to read. The choice is completely subjective and optional, but to keep the documentation a little cleaner, you can (and this is what we did for this package) pass a placeholder default value, such as None or 'default', and replace it with the true default inside the function when the argument equals the placeholder. Just make sure to make that clear in the documentation. A side benefit of that solution is that Git won't see any difference in documentation files if you change those defaults.
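The three points above can be combined in one small sketch. None of the names below come from wikiwwii; fetch_page, _normalize, and DEFAULT_HEADERS are hypothetical and only illustrate a Google-style docstring, a private helper that autodoc skips by default, and a None placeholder that stands in for a dictionary default:

```python
# Hypothetical module-level default; it never shows up in the rendered signature.
DEFAULT_HEADERS = {"User-Agent": "docs-example"}


def _normalize(name):
    """Strip surrounding whitespace (private: skipped by autodoc by default)."""
    return name.strip()


def fetch_page(name, headers=None):
    """Describe the request for a Wikipedia page.

    Args:
        name (str): Page title, for example ``"Battle of Kursk"``.
        headers (dict, optional): HTTP headers to send. If ``None``
            (the default), ``DEFAULT_HEADERS`` is used.

    Returns:
        dict: The normalized page title and the headers that would be sent.
    """
    if headers is None:
        # Copy, so that callers cannot mutate the shared default.
        headers = dict(DEFAULT_HEADERS)
    return {"page": _normalize(name), "headers": headers}
```

In the generated page, the signature would read fetch_page(name, headers=None) rather than spelling out the whole dictionary, and the docstring tells the reader what None means.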

As our package consists of two subpackages, we created a separate file for each of them and added all the corresponding autodoc directives there. Then we linked the root page to both subpackage pages via toctree (see the following code):

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   collect
   parse

And that, essentially, is it! The re-built documentation is now actually useful. Of course, it is missing the installation part, tutorials, and examples, but the first step has been taken. Here is a screenshot of our version 1.0 documentation web page:

In a few lines, we have built a simple web page with auto-generated documentation, including a place for basic information and working search and index functionality. You can add more text, a logo, charts, and more to the documentation in the same way. You can even change the look of the site by swapping in one of many themes (https://sphinx-themes.org/). You can always customize the templates to your taste, as well.
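Switching the theme is a one-line change in conf.py. As a small example, the fragment below picks classic, one of the themes bundled with sphinx (the quickstart default is alabaster); themes from the gallery above usually need to be installed as separate packages first:

```python
# docs/source/conf.py (fragment)

# Name of a theme bundled with sphinx; no extra installation needed.
html_theme = "classic"
```

After changing it, rebuild the documentation to see the new look.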

Any time you add new features to the package, you can regenerate the documentation—with one script! Once built, this documentation can be copied (for example, using the same CI process that we followed before) to any host (such as an AWS S3 bucket) or readthedocs.org—a service that hosts public documentation for free (and monetizes by injecting ads). 

Now that a documentation page has been set up, let's move on to the last topic we'll cover in this chapter—the useful trick of working with packages in editable mode.
