Chapter 3. Selecting a Feature Management Platform

Feature flags are not a new idea in software development. However, the increasing pace of delivery has shifted the technique from a rarely used tool to a requirement in modern applications. As flags are used more often and the number of flags in a codebase grows, teams need a scalable, enterprise-grade feature management platform. In this chapter, I discuss important requirements and considerations for your feature management system. Whether you decide to build your own or opt for a third-party feature management service, you should ensure that it is well designed.

Feature Management

When teams embark on a journey of feature management, they often go through similar stages of development:

Stage 0: Config file

This is typically where teams start with feature flags. Developers create a flat file of values that is read at initialization time to provide configuration values for an application. However, values often can’t be changed after application startup and are static, meaning it’s not possible to offer different settings to different users. Flag values proliferate until they become difficult to manage in a single file, and the files run the risk of losing synchronization across different deployments.
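As a rough illustration of Stage 0, the following sketch reads a flat file once at startup. The JSON format, the flag names, and the `load_flags` and `is_enabled` helpers are assumptions made for this example, not something the chapter prescribes.

```python
import json
import os
import tempfile

# Hypothetical flag file contents; the flag names are invented for illustration.
FLAG_FILE_CONTENTS = '{"new_checkout": true, "beta_search": false}'


def load_flags(path):
    """Read all flag values once, at initialization time."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Write the sample file to a temporary location so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    tmp.write(FLAG_FILE_CONTENTS)
    flag_path = tmp.name

FLAGS = load_flags(flag_path)  # Loaded once at startup.
os.remove(flag_path)


def is_enabled(flag_name):
    # Every user gets the same answer, and the value cannot change
    # until the application restarts and rereads the file.
    return FLAGS.get(flag_name, False)
```

Note that `is_enabled` takes no user argument at all, which is exactly the limitation described above.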

Stage 1: Database

Developers often decide to move flag values into an application-wide database. A database can be queried during runtime and updates can be read without the need to reinitialize the system. Simple database systems still usually lack the ability to customize behavior on a per-user basis.
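A minimal sketch of Stage 1 might look like the following. It assumes an in-memory SQLite database and a hypothetical `flags` table; a real deployment would use whatever shared database the application already has.

```python
import sqlite3

# In-memory database standing in for the application-wide database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flags (name TEXT PRIMARY KEY, enabled INTEGER NOT NULL)")
conn.execute("INSERT INTO flags (name, enabled) VALUES (?, ?)", ("new_checkout", 0))
conn.commit()


def is_enabled(name, default=False):
    """Query the flag at runtime, so updates are seen without a restart."""
    row = conn.execute(
        "SELECT enabled FROM flags WHERE name = ?", (name,)
    ).fetchone()
    return bool(row[0]) if row is not None else default


before = is_enabled("new_checkout")

# Flip the flag while the "application" keeps running.
conn.execute("UPDATE flags SET enabled = 1 WHERE name = ?", ("new_checkout",))
conn.commit()

after = is_enabled("new_checkout")
```

The value changes between the two calls without reinitializing anything, but the lookup still returns one answer for all users.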

Stage 2: Database with context

As teams scale, they typically add more context about their flags: who is the owner or maintainer of a flag, what part of the code it is used for, and so on. Often, a rudimentary UI is added to give team members visibility into what exists in the system. Many teams investigate open source tools or dedicate engineering time at this stage to make the system more useful and reliable.
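The added context can be as simple as a few extra fields per flag. The record layout below is a sketch; the field names, flag names, and owners are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlagRecord:
    name: str
    enabled: bool
    owner: str        # Who maintains the flag.
    code_area: str    # What part of the code the flag is used for.
    description: str = ""


# Hypothetical registry contents, invented for illustration.
REGISTRY = [
    FlagRecord("new_checkout", True, "alice", "payments", "New checkout flow"),
    FlagRecord("beta_search", False, "bob", "search"),
    FlagRecord("dark_mode", True, "alice", "web-ui"),
]


def flags_owned_by(owner):
    """Return the names of all flags a person maintains, sorted for display."""
    return sorted(record.name for record in REGISTRY if record.owner == owner)
```

A rudimentary UI at this stage is often little more than a view over queries like `flags_owned_by`.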

Stage 3: Feature management service

Feature flags become such an important part of your team’s process that they require a dedicated service to manage them. Scale is typically the greatest driver, both the number of flags being managed and the number of times they must be evaluated each day. This is the stage at which the service goes from a developer tool to a mission-critical business service. A robust feature management platform will solve problems like the following:

  • Distributing information globally and propagating flag rule updates quickly

  • Guaranteeing system redundancy and being able to survive failures with a predictable outcome

  • Ensuring that the appropriate set of people have access to manage flags, and maintaining an audit log of changes

  • Separating different teams, projects, and environments and supporting multiple programming languages and frameworks

  • Targeting specific users or segments with customizable rules
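To make the last item concrete, here is one possible shape for customizable targeting rules. The rule format, attribute names, and `evaluate` function are assumptions for illustration and do not reflect any particular platform's API.

```python
def evaluate(flag, user):
    """Return the variation of the first rule whose conditions all match."""
    for rule in flag["rules"]:
        # A rule matches only if every listed attribute has an allowed value.
        if all(user.get(attr) in allowed for attr, allowed in rule["match"].items()):
            return rule["serve"]
    return flag["fallthrough"]


# Hypothetical flag definition: one rule for an individual user,
# one rule for a segment, and a fallthrough for everyone else.
FLAG = {
    "rules": [
        {"match": {"key": {"user-42"}}, "serve": True},
        {"match": {"plan": {"enterprise"}, "country": {"DE", "FR"}}, "serve": True},
    ],
    "fallthrough": False,
}
```

For example, `evaluate(FLAG, {"key": "user-7", "plan": "enterprise", "country": "DE"})` matches the segment rule, whereas a user on a different plan falls through to the default variation.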

Feature Management Is Mission Critical

Most homegrown systems never mature past stage 2 and can quickly become a liability instead of an asset. For those organizations that are interested in integrating feature management deeply into their development process, I have a few recommendations to help you be successful.

Design for Scale

Teams designing a feature management system need to consider how to maintain the source of truth for flags, how that information is delivered quickly to where it’s needed (to servers and even to end users’ devices scattered around the world), and how that information is updated when states change.

It’s critical that whenever the application evaluates a flag, it always receives the same answer regardless of the server, datacenter, or even the continent where the application resides. If one request is served false and another one true for the same user in the same session, users get a confusing and inconsistent experience.
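One way to get the same answer everywhere, sketched here under assumptions the chapter does not spell out, is to make evaluation a pure function of the flag and the user. The example assumes a hypothetical percentage-based rule and hashes the flag key and user key into a stable bucket, so any server holding the same rule computes the same result without coordination. The function names and the `rollout_percent` parameter are invented for illustration.

```python
import hashlib


def bucket(flag_key, user_key):
    """Map a (flag, user) pair to a stable integer in the range 0 to 99."""
    digest = hashlib.sha256(f"{flag_key}:{user_key}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) % 100


def is_enabled(flag_key, user_key, rollout_percent):
    # No randomness and no per-server state: the answer depends only on
    # the inputs, so every server agrees as long as the rule is the same.
    return bucket(flag_key, user_key) < rollout_percent
```

This only addresses evaluation; the servers still need the same rule data, which is the distribution problem discussed next.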

Polling Versus Streaming

In any networked system there are two methods to distribute information. Polling is the method by which the endpoints (clients or servers) periodically ask for updates. Streaming, the second method, is when the central authority pushes the new values to all the endpoints as they change.

Both options have pros and cons. However, in a poll-based system you are faced with an unattractive trade-off: either you poll infrequently and run the risk of different parts of your application having different flag states, or you poll very frequently and shoulder high costs in system load, network bandwidth, and the necessary infrastructure to support the high demands.
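A quick back-of-the-envelope calculation shows the load side of this trade-off. The client count and polling intervals below are made-up numbers chosen only to illustrate the arithmetic.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def polls_per_day(clients, interval_seconds):
    """Requests the flag service must absorb per day, whether or not anything changed."""
    return clients * SECONDS_PER_DAY // interval_seconds


# Hypothetical fleet of 10,000 clients at three polling intervals.
for interval in (5, 60, 300):
    print(f"every {interval:>3}s: {polls_per_day(10_000, interval):,} requests/day")
```

With these assumed numbers, polling every 5 seconds costs 172,800,000 requests per day, while polling every 5 minutes costs 2,880,000 but leaves a much longer window in which clients can disagree.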

A streaming architecture, on the other hand, offers speed advantages and consistency guarantees. Streaming is a better fit for large-scale and distributed systems. In this design, each client maintains a long-running connection to the feature management system, which instantly sends down any changes as they occur to all clients.
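The push model can be sketched in a few lines. In this simplified, in-process example, direct method calls stand in for the long-running network connection; the `FlagService` and `StreamingClient` classes are hypothetical and ignore networking, ordering, and failure handling.

```python
class FlagService:
    """Central authority that pushes each change to every connected client."""

    def __init__(self):
        self._flags = {}
        self._clients = []

    def connect(self, client):
        self._clients.append(client)
        client.receive_snapshot(dict(self._flags))  # Initial full state.

    def set_flag(self, name, value):
        self._flags[name] = value
        for client in self._clients:  # Messages flow only when something changes.
            client.receive_update(name, value)


class StreamingClient:
    """Keeps a local cache that the service updates by pushing changes."""

    def __init__(self):
        self._cache = {}

    def receive_snapshot(self, flags):
        self._cache = flags

    def receive_update(self, name, value):
        self._cache[name] = value

    def is_enabled(self, name, default=False):
        return self._cache.get(name, default)  # Local read, no network call.


service = FlagService()
service.set_flag("new_checkout", False)

clients = [StreamingClient() for _ in range(3)]
for client in clients:
    service.connect(client)

service.set_flag("new_checkout", True)  # One change, pushed to all clients.
```

After the final call, every client's cache already holds the new value, and flag evaluation never leaves the process.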

Table 3-1. Polling

Pros:

  • Simple

  • Easily cached

Cons:

  • Inefficient. All clients need to connect momentarily, regardless of whether there is a change.

  • Changes require roughly twice the polling interval to propagate to all clients.

  • Because of long polling intervals, the system could create a “split brain” situation, in which both new flag and old flag states exist at the same time.

Table 3-2. Streaming

Pros:

  • Efficient at scale. Each client receives messages only when necessary.

  • Fast propagation. Changes can be pushed out to clients in real time.

Cons:

  • Requires the central service to maintain connections for every client.

  • Assumes a reliable network.

Design for Failure

Feature management systems have become a mission-critical component in the production application stack. In many ways, they act like the central nervous system of your application. Businesses now rely on feature flags to maintain the state of applications and control which features (or feature versions) users will experience. If these systems are not designed properly, their failures can be catastrophic. Your application should be designed so that it continues to function if the feature management system fails, for whatever reason.

In practice, this means designing multiple layers of redundancy. When you write code you must consider what should happen if the feature flag system fails to respond. Most feature flag APIs include the ability to specify a default option—what is to be served if no other information is available. Ensure that you have a default option and that your defaults are safe and sane.
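The default-value pattern might look like this in application code. The `variation` helper, the `FlagServiceUnavailable` exception, and the always-failing `fetch_flag` stand-in are hypothetical names used only for this sketch.

```python
class FlagServiceUnavailable(Exception):
    """Raised when the flag service cannot be reached (hypothetical)."""


def fetch_flag(name):
    """Stand-in for a remote lookup; in this sketch it always fails."""
    raise FlagServiceUnavailable(name)


def variation(name, default, fetch=fetch_flag):
    """Return the flag value, or the caller's default if the lookup fails."""
    try:
        return fetch(name)
    except FlagServiceUnavailable:
        return default


# The caller decides what is safe: here, a failure keeps the old checkout flow.
use_new_checkout = variation("new_checkout", default=False)
```

Because the default is chosen at each call site, the developer who knows the feature decides what "safe" means for it.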

Your system should be resilient to momentary interruptions, be able to reestablish a connection to your platform, and resynchronize to the true state of the world, all while the application is running.
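One common approach, shown here as a sketch rather than a prescribed design, is to keep serving the last known values while retrying with exponential backoff, and then replace the cache with a full snapshot once the connection returns. The `ResilientClient` class, its parameters, and the simulated `flaky_snapshot` function are assumptions for illustration.

```python
import time


class ResilientClient:
    def __init__(self, fetch_snapshot, sleep=time.sleep, base_delay=0.5, max_delay=30.0):
        self._fetch_snapshot = fetch_snapshot
        self._sleep = sleep  # Injectable so the sketch can run without waiting.
        self._base_delay = base_delay
        self._max_delay = max_delay
        self._cache = {}     # Last known values, served during outages.

    def is_enabled(self, name, default=False):
        return self._cache.get(name, default)

    def resync(self, max_attempts=5):
        """Retry with exponential backoff, then replace the cache wholesale."""
        delay = self._base_delay
        for attempt in range(max_attempts):
            try:
                self._cache = dict(self._fetch_snapshot())
                return True
            except ConnectionError:
                if attempt < max_attempts - 1:
                    self._sleep(delay)
                    delay = min(delay * 2, self._max_delay)
        return False


# Simulated service that fails twice before answering.
attempts = {"count": 0}


def flaky_snapshot():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("flag service unreachable")
    return {"new_checkout": True}


delays = []  # Record the backoff delays instead of actually sleeping.
client = ResilientClient(flaky_snapshot, sleep=delays.append)
recovered = client.resync()
```

In this run the client waits 0.5 and then 1.0 seconds (recorded, not slept) before the third attempt succeeds and the cache is brought back in line with the service.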

Design for Collaboration

The “Mythical Man-Month” is real. The larger the team working on a software project, the greater the communication overhead. It’s true when building software, and it’s equally true when operating a service. When a large team incorporates feature management into its process, there are techniques that teams can use to work together smoothly.

First, just as it is helpful to separate large codebases into smaller units, you can separate your flags into different projects and help the developers avoid the mental overhead of considering hundreds of flags that they don’t need to care about.

When flags are created, they should be assigned an owner or maintainer: someone who understands the context and the purpose of the flag and, even more important, knows when that flag is no longer useful and can be removed. That developer is responsible for the life cycle of the flag, including cleaning it up when it’s no longer needed.

The information contained in flags is valuable. It helps to describe the behavior of the system and diagnose what users are actually experiencing. It is also a window into the development process and helps everyone know where a given feature is in the release cycle. Generally, you’ll want to let everyone view the state of a flag. But often, you will want to limit who can change the state of a flag. You can use role-based access to ensure that the right people have appropriate access to the right flags, or even limit permissions per environment so that only certain people can change the state of production. As earlier examples have made clear, giving nondevelopment teams access to the feature management system can have significant benefits to the business and to the development team. With that said, it’s up to every team to design the right set of rules for its needs.
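A per-environment permission check can be very small. The roles, environments, and `can` function below are hypothetical; each team would define its own set.

```python
# Hypothetical role definitions: everyone can read everywhere,
# but only some roles can change flags in production.
ROLE_PERMISSIONS = {
    "viewer": {"production": {"read"}, "staging": {"read"}},
    "developer": {"production": {"read"}, "staging": {"read", "write"}},
    "release_manager": {"production": {"read", "write"}, "staging": {"read", "write"}},
}


def can(role, action, environment):
    """Return True if the role may perform the action in that environment."""
    return action in ROLE_PERMISSIONS.get(role, {}).get(environment, set())
```

With these assumed roles, `can("developer", "write", "staging")` is allowed while `can("developer", "write", "production")` is not, and an unknown role is denied everything.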

It’s also important to keep an audit trail of changes made to each flag. Track and make visible every change that is made to a flag, whether it’s turning the flag on and off, changing the targeting rules, or updating variations. Record who makes the change, when, and ideally why (comments are great for this). You can also use audit log entries as notifications for the rest of the team via Slack or email.
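As a sketch of what such a trail might record, the following store appends an entry for every change. The `AuditedFlags` and `AuditEntry` names and fields are assumptions for illustration; a production system would also persist the log and forward entries to notification channels.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    flag: str
    actor: str          # Who made the change.
    action: str         # What changed, as "old -> new".
    comment: str        # Why, if the actor said so.
    timestamp: datetime  # When.


class AuditedFlags:
    def __init__(self):
        self._flags = {}
        self.audit_log = []

    def set_flag(self, name, value, actor, comment=""):
        previous = self._flags.get(name)
        self._flags[name] = value
        # Every change is recorded; there is no way to update silently.
        self.audit_log.append(
            AuditEntry(name, actor, f"{previous!r} -> {value!r}", comment,
                       datetime.now(timezone.utc))
        )

    def history(self, name):
        return [entry for entry in self.audit_log if entry.flag == name]


store = AuditedFlags()
store.set_flag("new_checkout", True, actor="alice", comment="Launch to all users")
store.set_flag("new_checkout", False, actor="bob", comment="Rollback: error spike")
```

Calling `store.history("new_checkout")` then returns both entries in order, each with its actor, comment, and UTC timestamp.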

Adoptable

Most modern Software as a Service (SaaS) applications are composed of many different programming technologies, and multiple languages are used to build the end-to-end application experience. You might implement backend code in Java, Python, or Go, whereas the web frontend is likely JavaScript based, and native mobile applications are built for Android and iOS.

Your users don’t recognize that distinction; to them it’s one application. Thus, it’s important that your feature flags work consistently across all of your applications. When a feature is enabled for a user it must be available across platforms, whether that’s in a browser or a native mobile app.

Look for a feature management platform that supports all components of the application, with simple SDKs that present a similar API for your developers and a consistent experience for your users.

Summary

Feature flags have become a mission-critical component of the modern application. Your feature management system must scale and perform to meet the demands of your business. The requirements and considerations outlined in this chapter give you a head start for designing your own or evaluating third-party feature management systems.
