Around Chapter 4 we moved from just having everything in one folder to a more structured tree, and we thought it might be of interest to outline the moving parts.
You can find our code for this chapter at github.com/cosmicpython/code/tree/appendix_project_structure.
git clone https://github.com/cosmicpython/code.git && cd code
git checkout appendix_project_structure
Example B-1 shows the folder structure:
Project tree
.
├── Dockerfile
├── Makefile
├── README.md
├── docker-compose.yml
├── license.txt
├── mypy.ini
├── requirements.txt
├── src
│   ├── allocation
│   │   ├── __init__.py
│   │   ├── adapters
│   │   │   ├── __init__.py
│   │   │   ├── orm.py
│   │   │   └── repository.py
│   │   ├── config.py
│   │   ├── domain
│   │   │   ├── __init__.py
│   │   │   └── model.py
│   │   ├── entrypoints
│   │   │   ├── __init__.py
│   │   │   └── flask_app.py
│   │   └── service_layer
│   │       ├── __init__.py
│   │       └── services.py
│   └── setup.py
└── tests
    ├── conftest.py
    ├── e2e
    │   └── test_api.py
    ├── integration
    │   ├── test_orm.py
    │   └── test_repository.py
    ├── pytest.ini
    └── unit
        ├── test_allocate.py
        ├── test_batches.py
        └── test_services.py
Our docker-compose.yml and our Dockerfile are the main bits of configuration for the containers that run our app, and can also run the tests (for CI). A more complex project might have several Dockerfiles, although we’ve found that minimising the number of images is usually a good idea.1
A Makefile2 provides the entrypoint for all the typical commands a developer (or a CI server) might want to run during their normal workflow: make build, make test, and so on. This is optional; you could just use docker-compose and pytest directly, but if nothing else it’s nice to have all the “common commands” in a list somewhere, and unlike documentation, a Makefile is code, so it has less tendency to go out of date.
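As a rough illustration, a minimal Makefile for this kind of project might look something like the following. This is a hypothetical sketch, not the book’s actual file; the target names and the exact docker-compose invocations are assumptions you’d adapt to your own project:

```makefile
# Hypothetical sketch -- adapt target names and flags to your project.
all: build up test

build:
	docker-compose build

up:
	docker-compose up -d

# Run the whole test suite inside the app container
test: up
	docker-compose run --rm --no-deps --entrypoint=pytest app /tests/unit /tests/integration /tests/e2e

down:
	docker-compose down --remove-orphans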
All the actual source code for our app, including the domain model, the Flask app, and infrastructure code, lives in a Python package inside src,3 which we install using pip install -e and the setup.py file. This makes imports easy. Currently the structure within this module is totally flat, but for a more complex project you’d expect to grow a folder hierarchy including domain_model/, infrastructure/, services/, and api/.
Tests live in their own folder, with subfolders to distinguish different test types, and allow you to run them separately. We can keep shared fixtures (conftest.py) in the main tests folder, and nest more specific ones if we wish. This is also the place to keep pytest.ini.
The pytest docs are really good on test layout and importability.
Let’s look at a few of these in more detail.
The basic problem we’re trying to solve here is that we need different config settings for:
Running code or tests directly from your own dev machine, perhaps talking to mapped ports from docker containers
Running on the containers themselves, with “real” ports and hostnames
And different settings for different container environments, dev, staging, prod, and so on.
Configuration through environment variables as suggested by the 12-factor manifesto will solve this problem, but concretely, how do we implement it in our code and our containers?
Whenever our application code needs access to some config, it’s going to get it from a file called config.py. Here are a couple of examples from our app:
Sample config functions (src/allocation/config.py)
import os


def get_postgres_uri():
    host = os.environ.get('DB_HOST', 'localhost')
    port = 54321 if host == 'localhost' else 5432
    password = os.environ.get('DB_PASSWORD', 'abc123')
    user, db_name = 'allocation', 'allocation'
    return f"postgresql://{user}:{password}@{host}:{port}/{db_name}"


def get_api_url():
    host = os.environ.get('API_HOST', 'localhost')
    port = 5005 if host == 'localhost' else 80
    return f"http://{host}:{port}"
We use functions for getting the current config, rather than constants available at import time, because that allows client code to modify os.environ if it needs to.
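To see why that matters, here’s a small self-contained sketch (with get_postgres_uri() reproduced from config.py above): because the function reads os.environ at call time rather than at import time, a test can point the app at a different database just by setting an environment variable before calling it.

```python
import os

def get_postgres_uri():
    # Same logic as src/allocation/config.py: read the environment at call time
    host = os.environ.get('DB_HOST', 'localhost')
    port = 54321 if host == 'localhost' else 5432
    password = os.environ.get('DB_PASSWORD', 'abc123')
    user, db_name = 'allocation', 'allocation'
    return f"postgresql://{user}:{password}@{host}:{port}/{db_name}"

# With no env vars set, we get the local-dev defaults...
os.environ.pop('DB_HOST', None)
os.environ.pop('DB_PASSWORD', None)
assert get_postgres_uri() == "postgresql://allocation:abc123@localhost:54321/allocation"

# ...but client code (a test, say) can modify os.environ before calling it
os.environ['DB_HOST'] = 'postgres'
os.environ['DB_PASSWORD'] = 'secret'
assert get_postgres_uri() == "postgresql://allocation:secret@postgres:5432/allocation"
```

Had we used module-level constants instead, the values would be frozen at first import, and tests would have to reload the module to change them.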
config.py also defines some default settings, designed to work when running the code from the developer’s local machine.4
There is an elegant Python package called environ-config which is worth looking at if you get tired of hand-rolling your own environment-based config functions.
Don’t let this config module become a dumping ground, full of things only vaguely related to config, that is then imported all over the place. Keep things immutable and modify them only via environment variables. If you decide to use a bootstrap script, you can make it the only place (other than tests) that config is imported.
We use a lightweight docker container orchestration tool called docker-compose. Its main configuration is via a YAML file (sigh5):
Docker-Compose config file (docker-compose.yml)
version: "3"
services:

  app:
    build:
      context: .
      dockerfile: Dockerfile
    depends_on:
      - postgres
    environment:
      - DB_HOST=postgres
      - DB_PASSWORD=abc123
      - API_HOST=app
      - PYTHONDONTWRITEBYTECODE=1
    volumes:
      - ./src:/src
      - ./tests:/tests
    ports:
      - "5005:80"

  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=allocation
      - POSTGRES_PASSWORD=abc123
    ports:
      - "54321:5432"
In the docker-compose file, we define the different “services” (containers) that we need for our app. Usually one main image contains all our code, and we can use it to run our API, our tests, or any other service that needs access to the domain model.
You’ll probably have some other infrastructure services like a database. In production you may not use containers for this, you might have a cloud provider instead, but docker-compose gives us a way of producing a similar service for dev or CI.
The environment stanza lets you set the environment variables for your containers, the hostnames and ports as seen from inside the docker cluster. If you have enough containers that this information starts to be duplicated in these sections, you can use env_file instead. We usually call ours container.env.
Inside a cluster, docker-compose sets up networking such that containers are available to each other via hostnames named after their service name.
Protip: if you’re mounting volumes to share source folders between your local dev machine and the container, the PYTHONDONTWRITEBYTECODE env var tells Python to not write .pyc files, and that will save you from having millions of root-owned files sprinkled all over your local filesystem, being all annoying to delete, and causing weird Python compiler errors besides.
Mounting our source and test code as volumes means we don’t need to rebuild our containers every time we make a code change.
And the ports section allows us to expose the ports from inside the containers to the outside world6--these correspond to the default ports we set in config.py. Inside Docker, other containers are available through hostnames named after their service name. Outside Docker, they are available on localhost, at the port defined in the ports section.
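To make that correspondence concrete, here’s get_api_url() reproduced from config.py above: with no API_HOST set (running on your dev machine), it targets the host-mapped port from the "5005:80" mapping; with API_HOST=app set (as docker-compose does for the app container), it targets the service hostname and the container’s real port.

```python
import os

def get_api_url():
    # Same logic as src/allocation/config.py
    host = os.environ.get('API_HOST', 'localhost')
    port = 5005 if host == 'localhost' else 80
    return f"http://{host}:{port}"

# Outside Docker (no API_HOST set): talk to the mapped port from "5005:80"
os.environ.pop('API_HOST', None)
assert get_api_url() == "http://localhost:5005"

# Inside the cluster, docker-compose sets API_HOST=app: service hostname, real port
os.environ['API_HOST'] = 'app'
assert get_api_url() == "http://app:80"
```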
All our application code (everything except tests really) lives inside an src folder:
Subfolders define top-level module names. You can have multiple if you like.
And setup.py is the file you need to make it pip-installable, shown next.
pip-installable modules in 3 lines (src/setup.py)
from setuptools import setup

setup(
    name='allocation',
    version='0.1',
    packages=['allocation'],
)
packages= specifies the names of subfolders that you want to install as top-level modules. The name entry is just cosmetic, but it’s required. For a package that’s never actually going to hit PyPI, that’s all you need.
Dockerfiles are going to be very project-specific, but here are a few key stages you’ll expect to see:
Our Dockerfile (Dockerfile)
FROM python:3.8-alpine

RUN apk add --no-cache --virtual .build-deps gcc postgresql-dev musl-dev python3-dev
RUN apk add libpq

COPY requirements.txt /tmp/
RUN pip install -r /tmp/requirements.txt

RUN apk del --no-cache .build-deps

RUN mkdir -p /src
COPY src/ /src/
RUN pip install -e /src
COPY tests/ /tests/

WORKDIR /src
ENV FLASK_APP=allocation/entrypoints/flask_app.py FLASK_DEBUG=1 PYTHONUNBUFFERED=1
CMD flask run --host=0.0.0.0 --port=80
Installing system-level dependencies.
Installing our Python dependencies (you may want to split out your dev from prod dependencies; we haven’t here, for simplicity).
Copying and installing our source.
Optionally configuring a default startup command (you’ll probably override this a lot from the command line).
One thing to note is that we install things in the order of how frequently they are likely to change. This allows us to maximize docker build cache reuse. I can’t tell you how much pain and frustration lies behind this lesson. For this, and many more Python Dockerfile improvement tips, check out Production-ready Docker packaging.
Our tests are kept alongside everything else, as in Example B-7:
Tests folder tree
└── tests
    ├── conftest.py
    ├── e2e
    │   └── test_api.py
    ├── integration
    │   ├── test_orm.py
    │   └── test_repository.py
    ├── pytest.ini
    └── unit
        ├── test_allocate.py
        ├── test_batches.py
        └── test_services.py
Nothing particularly clever here, just some separation of different test types that you’re likely to want to run separately, and some files for common fixtures, config and so on.
There’s no src folder or setup.py in the tests folders because we’ve not usually found we need to make tests pip-installable, but if you have difficulties with import paths, you might find it helps.
Those are our basic building blocks:
Source code in an src folder, pip-installable using setup.py
Some docker config for spinning up a local cluster that mirrors production as far as possible
Configuration via environment variables, centralised in a Python file called config.py, with defaults allowing things to run outside containers.
And a Makefile for useful command-line, um, commands.
We doubt that anyone will end up with exactly the same solutions we did, but we hope you find some inspiration here.
1 Splitting out images for prod and test is sometimes a good idea, but we’ve tended to find that going further and trying to split out different images for different types of application code (eg web api vs pubsub client) usually ends up being more trouble than it’s worth; the cost in terms of complexity and longer rebuild/CI times is too high. YMMV.
2 A pure-Python alternative to Makefiles is Invoke; worth checking out if everyone in your team knows Python (or at least knows it better than Bash!)
3 More on src folders: https://hynek.me/articles/testing-packaging/
4 This gives us a local dev setup that “just works” (as much as possible). You may prefer to fail hard on missing env vars instead, particularly if any of the defaults would be insecure in production.
5 Harry hates YAML. He says he can never remember the syntax or how it’s supposed to indent.
6 On a CI server you may not be able to expose arbitrary ports reliably, but it’s only a convenience for local dev. You can find ways of making these port mappings optional, eg with docker-compose.override.yml.