Configuration

Every piece of production-class software has scads of configurable properties containing hostnames, port numbers, filesystem locations, ID numbers, magic keys, usernames, passwords, and lottery numbers. Get any of these properties wrong and the system is broken. Even if the system seems to work most of the time, it could break at 1 a.m. when Daylight Saving Time kicks in.

Configuration suffers from hidden linkages and high complexity, two of the biggest factors leading to operator error. This puts the system at risk because configuration is part of the system’s user interface. It’s the interface used by one of its most overlooked constituencies: the developers and operators who support it. Let’s look at some design guidelines for handling instance-level configuration.

Configuration Files

The configuration “starter kit” is a file or set of files the instance reads at startup. Configuration files may be buried deep in the directory structure of the codebase, possibly in multiple directories. Some of them represent basic application plumbing like API routes. Others need to change per environment.

Because the same software runs on several instances, some configuration properties should probably vary per machine. Keep these properties in separate places so nobody ever has to ask, “Are those supposed to be different?”

We don’t want our instance binaries to change per environment, but we do want their properties to change. That means the code should look outside the deployment directory to find per-environment configurations.
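One way to picture this is a loader that reads defaults shipped with the code and then overlays values from a directory outside the deployment. The sketch below is only illustrative: the `APP_CONFIG_DIR` variable and the file names `defaults.ini` and `environment.ini` are assumptions made for the example, not conventions from any particular platform.

```python
import os
from configparser import ConfigParser
from pathlib import Path


def load_config(deploy_dir, env=None):
    """Read packaged defaults, then overlay per-environment values.

    APP_CONFIG_DIR, defaults.ini, and environment.ini are hypothetical
    names chosen for this sketch.
    """
    env = os.environ if env is None else env
    parser = ConfigParser()

    # Plumbing-level defaults ship inside the deployment directory.
    parser.read(Path(deploy_dir) / "defaults.ini")

    # Per-environment values live outside the deployment directory,
    # so the deployed artifact itself never changes per environment.
    external_dir = env.get("APP_CONFIG_DIR")
    if external_dir:
        # Values read later override values read earlier;
        # a missing file is silently skipped by ConfigParser.read.
        parser.read(Path(external_dir) / "environment.ini")

    return parser
```

With this arrangement the same build can be promoted from test to production unchanged; only the contents of the external directory differ.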

These files contain the most sensitive information in the entire enterprise: production database passwords. They need to be protected from tampering and prying eyes. That leads us to another great reason to keep per-environment configuration out of the source tree: version control. Sooner or later, you’ll accidentally commit a production password to version control. At the time of writing, GitHub shows 288,093 commits with the title “Removed password.” Tomorrow that number will be higher.

That’s not to say you should keep configurations out of version control altogether. Just keep them in a different repository than the source code. Lock it down to only the people who should have access, and make sure you have controls (i.e., processes, procedures, and people following up on them) to grant and revoke access to those configurations.

Configuration with Disposable Infrastructure

In image-based environments like EC2 or a container platform, configuration files can’t change per instance, because every instance starts from the same image. Frankly, some of the instances will be there and gone so fast that it doesn’t make any sense to apply static configs. There we need to find another way to provide a new instance with details about its mission in life. The two approaches are to inject configuration at startup or to use a configuration service.

Injecting configuration works by providing environment variables or a text blob. For example, EC2 allows “user data” to be passed to a new virtual machine as a blob of text. To use the user data, some code in the image must already know how to read and parse it (for example, it might be in properties format, but it might be JSON or YAML, too). Heroku prefers environment variables. So the application code does need some awareness of its targeted deployment environment.
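As a rough sketch of what that awareness might look like, the code below merges a text blob (tried as JSON first, then as properties format) with prefixed environment variables. The `APP_` prefix, the function names, and the rule that environment variables win over the blob are all assumptions made for illustration. How the blob is actually retrieved from the platform is left out.

```python
import json
import os


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def parse_user_data(blob):
    """Try JSON first; fall back to properties format."""
    try:
        data = json.loads(blob)
    except json.JSONDecodeError:
        return parse_properties(blob)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    # Stringify values so both formats yield the same shape.
    return {str(k): str(v) for k, v in data.items()}


def injected_config(env=None, user_data="", prefix="APP_"):
    """Merge the injected blob with prefixed environment variables.

    The APP_ prefix is a hypothetical convention; here environment
    variables override values from the blob.
    """
    env = os.environ if env is None else env
    config = parse_user_data(user_data) if user_data.strip() else {}
    for name, value in env.items():
        if name.startswith(prefix):
            config[name[len(prefix):].lower()] = value
    return config
```

Keeping the parsing in one small module limits how much of the application needs to know which injection mechanism the platform uses.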

The other way to get configuration into an image is via a configuration service. In this form, the instance code reaches out to a well-known location to ask for its configuration. ZooKeeper and etcd are both popular choices for a configuration service. Because this builds a hard dependency on the config service, any downtime is immediately a “Severity 1” problem. Instances cannot start up when the config service is not available, yet by definition we’re in an environment where instances start and stop frequently.

Be very careful here. ZooKeeper and etcd—and any other configuration service, for that matter—are complex pieces of distributed systems software. They must have a well-planned network topology to maximize availability, and they must be managed very carefully for capacity. ZooKeeper is scalable but not elastic, and adding and removing nodes is disruptive. In other words, these services require a high degree of operational maturity and carry some noticeable overhead. It’s not worth introducing them to support just one application. Only use them as part of a broader strategy for your organization. Most small teams are better off using injected config.
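The hard startup dependency on a configuration service can be made concrete with a small sketch. Here `fetch` stands in for whatever client call retrieves the configuration; it is a hypothetical callable, not the API of ZooKeeper, etcd, or any other product. The retry count and delay are likewise arbitrary assumptions.

```python
import time


class ConfigServiceUnavailable(Exception):
    """Raised when the instance cannot obtain its configuration."""


def fetch_config_at_startup(fetch, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call the hypothetical `fetch` callable, retrying on network errors.

    If every attempt fails, the instance cannot start: this is the
    hard dependency on the configuration service.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError as exc:  # ConnectionError and timeouts are OSErrors
            last_error = exc
            if attempt < attempts:
                # Wait a little longer before each retry.
                sleep(delay_seconds * attempt)
    raise ConfigServiceUnavailable(
        f"no configuration after {attempts} attempts"
    ) from last_error
```

Retries only soften brief outages. If the service stays down, every new instance still fails to start.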
