Code

Even before we get to questions about containers versus VM images, we should look at the code itself.

Building the Code

Developers naturally pay a lot of attention to their code. As a result, we have great tools at our disposal to build, house, and deploy code. There are some important rules to follow, though. These are mostly about making sure that you know exactly what goes into the code on the instance. It is vital to establish a strong “chain of custody” that stretches from the developer through to the production instance. It must be impossible for an unauthorized party to sneak code into your system.

It starts at the desktop. Developers should work on code within a version control system. There’s simply no excuse not to use version control today. Only the code goes into version control, though. Version control doesn’t handle third-party libraries or dependencies very well.

Developers must be able to build the system, run tests, and run at least a portion of the system locally. That means build tools have to download dependencies from somewhere to the dev box. The default would be to download libraries from the Internet. (The standard joke for Maven users is that Maven downloads half of the Internet to run a build.)

Downloading dependencies from the Internet is convenient but not safe. It’s far too easy for one of those dependencies to be silently replaced, either through a man-in-the-middle attack or by compromising the upstream repository. Even if you download dependencies from the Net to start with, you should plan on moving to a private repository as soon as possible. Only put libraries into the repository when their digital signatures match published information from the upstream provider.
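As a sketch of that admission check, the script below verifies a downloaded artifact against a publisher’s checksum before it would be pushed to the private repository. The file name is hypothetical, and for the sake of a self-contained demo the “published” checksum is computed locally; in real use you would copy it from the upstream project’s release page or a signed checksum file.

```shell
# Hypothetical gatekeeper for the private repository. Reject any artifact
# whose SHA-256 does not match the value the upstream provider published.

# Simulate a downloaded artifact so this sketch is self-contained.
printf 'library bytes' > commons-lang3-3.12.0.jar

# The publisher's advertised SHA-256. (Computed locally here only to make
# the demo runnable; in practice this comes from the upstream release page.)
EXPECTED=$(sha256sum commons-lang3-3.12.0.jar | awk '{print $1}')

ACTUAL=$(sha256sum commons-lang3-3.12.0.jar | awk '{print $1}')
if [ "$ACTUAL" = "$EXPECTED" ]; then
  echo "checksum OK: safe to publish to the private repository"
else
  echo "checksum MISMATCH: reject the artifact" >&2
fi
```

Stronger still is verifying a detached GPG signature from the provider, which proves who produced the artifact, not just that the bytes survived the download intact.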

Don’t forget about plugins to the build system, either. A colleague who asked not to be named described an attempt to subvert his company’s product in order to attack one of its enterprise customers. That attack was introduced via a compromised Jenkins plugin.

Developers should not do production builds from their own machines. Developer boxes are hopelessly polluted. We install all kinds of junk on these systems. We play games and visit sketchy websites. Our browsers get loaded up with slimy toolbars and bogus “search enhancers” like any other human user does. Only make production builds on a CI server, and have it put the binary into a safe repository that nobody else can write into.
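One way to enforce that rule is to make the release script itself refuse to run anywhere but the CI server. The guard below is a sketch; the `CI` environment variable is a common convention (Jenkins, GitHub Actions, and others set one), but the exact variable name depends on your CI system.

```shell
# Hypothetical guard at the top of a release script: production builds
# happen only on the CI server, never on a developer workstation.
require_ci() {
  if [ -z "${CI:-}" ]; then
    echo "refusing: production builds run only on the CI server"
    return 1
  fi
  echo "CI build: proceeding"
}

# Demonstrate both paths in subshells:
( CI="";        require_ci ) || true   # developer box: refused
( CI="jenkins"; require_ci )           # CI server: allowed
```

The same script would then publish the binary to a repository that only the CI server’s credentials can write to, closing the chain of custody.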

Immutable and Disposable Infrastructure

Configuration management tools like Chef, Puppet, and Ansible are all about applying changes to running machines. They use scripts, playbooks, or recipes (each has its own jargon) to transition the machine from one state to a new state. After each set of changes, the machine should be fully described by the latest scripts, as shown in the figure.

[Figure: layers of stucco (layers_of_stucco.png)]

The “layers of stucco” approach has two big challenges. First, it’s easy for side effects to creep in that are the result of, but not described by, the recipes. For example, suppose a Chef recipe uses RPM to install version 12.04 of a third-party package. That package has a post-install script that changes some TCP tuning parameters. A month later, Chef installs a newer version of the RPM, but the new RPM’s post-install changes a subset of the original parameters. Now the machine has a state that cannot be re-created by either the original or the new recipes. That state is the result of the history of the changes.
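The toy simulation below makes that history-dependence concrete. It stands in a temp file for the machine’s sysctl state; the parameter names and values are illustrative, not taken from any real package.

```shell
# Toy simulation of the "layers of stucco" problem. Two "post-install"
# scripts append TCP tuning parameters to a stand-in for the machine's
# sysctl configuration.
CONF=$(mktemp)

# Version 12.04's post-install sets two parameters.
echo "net.ipv4.tcp_fin_timeout=30" >> "$CONF"
echo "net.core.somaxconn=1024"     >> "$CONF"

# The newer version's post-install resets only one of them.
echo "net.ipv4.tcp_fin_timeout=15" >> "$CONF"

# The effective state now mixes both histories: somaxconn=1024 came from
# the old package and is described by neither recipe alone.
cat "$CONF"
```

Replaying only the newest recipe on a fresh machine would never produce this state; the machine is an artifact of its entire change history.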

The second challenge comes from broken machines or scripts that only partially worked. These leave the machine in an undefined state. The configuration management tools put a lot of effort into converging unknown machine states into known machine states, but they aren’t always successful.

The DevOps and cloud community say that it’s more reliable to always start from a known base image, apply a fixed set of changes, and then never attempt to patch or update that machine. Instead, when a change is needed, create a new image starting from the base again, as shown in the figure.

[Figure: start from a known state (start_from_known_state.png)]

This is often described as “immutable infrastructure.” Machines don’t change once they’ve been deployed. Take a container as an example. The container’s “file system” is a binary image from a repository. It holds the code that runs on the instance. When it’s time to deploy new code, we don’t patch up the container; we just build a new one instead. We launch it and throw away the old one.
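The build-launch-discard cycle can be sketched without any container tooling at all. In this toy version, a directory stands in for an image and a symlink stands in for the running instance; all paths and version numbers are made up for illustration.

```shell
# Toy sketch of immutable, disposable deployment: never patch a deployed
# "machine" (here just a directory). Build fresh from the base, swap the
# pointer, and throw the old one away.
set -e
WORK=$(mktemp -d)

# The known base image.
mkdir -p "$WORK/base"
echo "os-files v1" > "$WORK/base/os"

build_image() {   # known base + fixed set of changes = new image
  version=$1
  cp -r "$WORK/base" "$WORK/image-$version"
  echo "app code $version" > "$WORK/image-$version/app"
}

# Release 1: build an image and "launch" it via the current pointer.
build_image 1
ln -sfn "$WORK/image-1" "$WORK/current"

# Release 2: we never patch image-1. Build a new image, swap, discard.
build_image 2
ln -sfn "$WORK/image-2" "$WORK/current"
rm -rf "$WORK/image-1"

cat "$WORK/current/app"   # prints: app code 2
```

With real containers the mechanics differ (a registry holds the images, an orchestrator does the swap), but the discipline is identical: changes produce new images, never edits to running ones.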

That notion of disposability puts the emphasis in the right place. The important part is that we can throw away the environment, piece by piece or as a whole, and start over.
