Chapter 2. Fitness Functions

The mechanics of evolutionary architecture cover the tools and techniques developers and architects use to build systems that can evolve. An important gear in that machinery is the protection mechanism called a fitness function, the architectural equivalent of a unit test for the domain part of an application. This chapter defines fitness functions and explains the categories and usage of this important building block.

An evolutionary architecture supports guided, incremental change across multiple dimensions.

As noted in our definition, the word guided indicates that some objective exists that architecture should move toward or exhibit. We borrow a concept from evolutionary computing called fitness functions, which are used in genetic algorithm design to define success.

Evolutionary computing includes a number of mechanisms that allow a solution to gradually emerge via mutation—​small changes in each generation of the software. The evolutionary computing world defines a number of types of mutations. For example, one mutation is called a roulette mutation: if the algorithm utilizes constants, this mutation will choose new numbers as if from a roulette wheel in a casino. For example, suppose a developer is designing a genetic algorithm to solve the traveling salesperson problem to find the shortest route between a number of cities. If the developer notices that smaller numbers supplied by the roulette mutation yield better results, they may build a fitness function to guide the “decision” during mutation. Thus, fitness functions are used to evaluate how close a solution is to ideal.

What Is a Fitness Function?

We borrow this concept of fitness functions from the evolutionary computing world to define an architectural fitness function:

An architectural fitness function is any mechanism that provides an objective integrity assessment of some architectural characteristic(s).

Architectural fitness functions form the primary mechanisms for implementing evolutionary architecture.

As the domain part of our solution evolves, teams have developed a wide variety of tools and techniques to manage integrating new features without breaking existing ones: unit, functional, and user acceptance testing. In fact, most companies bigger than a certain size have an entire department dedicated to managing domain evolution, called quality assurance: ensuring that existing functionality isn’t negatively affected by changes.

Thus, well-functioning teams have mechanisms for managing evolutionary change to the problem domain: adding new features, changing behaviors, and so on. The domain is typically written in a fairly coherent technology stack: Java, .NET, or a host of other platforms. Thus, teams can download and use testing libraries suited to their combination of technology stacks.

Fitness functions are to architecture characteristics as unit tests are to the domain. However, teams cannot download a single tool for the wide variety of validations possible for architecture characteristics. Rather, fitness functions encompass a wide variety of tools in different parts of the ecosystem, depending on the architecture characteristics the team is governing, as illustrated in Figure 2-1.

Figure 2-1. Fitness functions encompass a wide variety of tools and techniques

As shown in Figure 2-1, architects can use many different tools to define fitness functions:

Monitors

DevOps and operational tools such as monitors allow teams to verify concerns such as performance, scalability, and so on.

Code metrics

Architects can embed metrics checks and other verifications within unit tests to validate a wide variety of architecture concerns, including design criteria (many examples follow in Chapter 4).

Chaos engineering

This recently developed branch of engineering practices artificially stresses remote environments by injecting faults to force teams to build resiliency into their systems.

Architecture testing frameworks

In recent years, testing frameworks dedicated to testing architecture structure have appeared, allowing architects to encode a wide variety of validations into automated tests.

Security scanning

Security—​even if supervised by another part of the organization—​affects design decisions that architects make and thus falls under the umbrella of concerns that architects want to govern.

Before we define the categories of fitness functions and other factors, an example will help make the concept less abstract. The component cycle is a common antipattern across all platforms with components. Consider the three components in Figure 2-2.

Figure 2-2. A cycle exists when components have a cyclic dependency

Architects consider the cyclic dependency shown in Figure 2-2 an antipattern because it presents difficulties when a developer tries to reuse one of the components—each of the entangled components must also come along. Thus, in general, architects want to keep the number of cycles low. However, the universe is actively fighting the architect’s desire to prevent this problem via convenience tools. What happens when a developer references a class whose namespace/package they haven’t referenced yet in a modern IDE? It pops up an auto-import dialog to automatically import the necessary package.

Developers are so accustomed to this affordance that they swat it away as a reflex action, never actually paying attention. Most of the time, auto-importing is a great convenience that doesn’t cause any problems. However, once in a while, it creates a component cycle. How do architects prevent this?

Consider the set of packages illustrated in Figure 2-3.

Figure 2-3. Component cycles represented as packages in Java

ArchUnit is a testing tool inspired by (and using some of the facilities of) JUnit, but it’s used to test various architecture features, including validations to check for cycles within a particular scope, as illustrated in Figure 2-3.

An example of how to prevent cycles using ArchUnit appears in Example 2-1.

Example 2-1. Preventing cycles using ArchUnit
import static com.tngtech.archunit.library.dependencies.SlicesRuleDefinition.slices;

import com.tngtech.archunit.core.importer.ClassFileImporter;
import org.junit.jupiter.api.Test;

public class CycleTest {
    @Test
    public void test_for_cycles() {
        slices()
            .matching("com.myapp.(*)..")
            .should().beFreeOfCycles()
            .check(new ClassFileImporter().importPackages("com.myapp"));
    }
}

In this example, the testing tool “understands” cycles. An architect who wants to prevent cycles from gradually appearing in their codebase can wire this testing into a continuous build process and never have to worry about cycles again. We will show more examples of using ArchUnit and similar tools in Chapter 4.

We first define fitness functions more rigorously, and then examine conceptually how they guide the evolution of the architecture.

Don’t mistake the function part of our definition as implying that architects must express all fitness functions in code. Mathematically speaking, a function takes an input from some allowed set of input values and produces an output in some allowed set of output values. In software, we also generally use the term function to refer to something implementable in code. However, as with acceptance criteria in agile software development, the fitness functions for evolutionary architecture may not be implementable in software (e.g., a required manual process for regulatory reasons). An architectural fitness function is an objective measure, but architects may implement that measure in a wide variety of ways.

As discussed in Chapter 1, real-world architecture consists of many different dimensions, including requirements around performance, reliability, security, operability, coding standards, and integration, to name a few. We want a fitness function to represent each requirement for the architecture, requiring us to find (and sometimes create) ways to measure things we want to govern. We’ll look at a few examples and then consider the different kinds of functions more broadly.

Performance requirements make good use of fitness functions. Consider a requirement that all service calls must respond within 100 ms. We can implement a test (i.e., fitness function) that measures the response to a service request and fails if the result is greater than 100 ms. To this end, every new service should have a corresponding performance test added to its test suite (you’ll learn more about triggering fitness functions in Chapter 3). Performance is also a good example of the vast number of ways architects can think about common measures. For example, performance may suggest request/response timing, as measured by a monitoring tool, or another metric such as first contentful paint, a web performance metric reported by tools such as Lighthouse. The purpose of a performance fitness function is not to measure every type of performance but only the types that architects deem important for governance.
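The 100 ms budget described above can be sketched as a simple triggered test. This is a minimal illustration, not a real framework API: the class and method names are ours, and the service call is a stand-in that a real suite would replace with an invocation of the actual endpoint.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Minimal sketch of a triggered performance fitness function.
// The service call is stubbed via a Supplier; in a real test suite it
// would invoke the actual service endpoint.
public class ResponseTimeFitnessFunction {

    static final Duration BUDGET = Duration.ofMillis(100);

    // Measures how long a single call takes.
    static Duration timeCall(Supplier<String> serviceCall) {
        Instant start = Instant.now();
        serviceCall.get();
        return Duration.between(start, Instant.now());
    }

    // The fitness function: pass only if the call fits the budget.
    static boolean withinBudget(Supplier<String> serviceCall) {
        return timeCall(serviceCall).compareTo(BUDGET) <= 0;
    }
}
```

Wired into continuous integration, a check like this fails the build the moment a change pushes the service past its budget.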

Fitness functions also can be used to maintain coding standards. A common code metric is cyclomatic complexity, a measure of function or method complexity available for all structured programming languages. An architect may set a threshold for an upper value, guarded by a unit test running in continuous integration, using one of the many tools available to evaluate that metric.
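A threshold check like the one just described might look like the following sketch. The complexity numbers would normally come from a metrics tool; here they are supplied directly, and the threshold value is an illustrative assumption.

```java
import java.util.Map;

// Sketch of a coding-standard fitness function: fail when any method's
// cyclomatic complexity exceeds an agreed upper bound. The numbers would
// come from a metrics tool in a real pipeline.
public class ComplexityFitnessFunction {

    static final int THRESHOLD = 10;

    static boolean allMethodsWithinThreshold(Map<String, Integer> complexityByMethod) {
        return complexityByMethod.values().stream().allMatch(c -> c <= THRESHOLD);
    }
}
```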

Despite the need, developers cannot always implement some fitness functions completely because of complexity or other constraints. Consider something like a failover for a database from a hard failure. While the recovery itself might be fully automated (and should be), triggering the test itself is likely best done manually. Additionally, it might be far more efficient to determine the success of the test manually, although developers should still encourage scripts and automation.

These examples highlight the myriad forms that fitness functions can take, the immediate response to failure of a fitness function, and even when and how developers might run them. While we can’t necessarily run a single script and say “our architecture currently has a composite fitness score of 42,” we can have precise and unambiguous conversations about the state of the architecture. We can also have informed discussions about the impact a proposed change might have on the architecture’s fitness.

Finally, when we say an evolutionary architecture is guided by the fitness function, we mean we evaluate individual architectural choices against the individual and the system-wide fitness functions to determine the impact of the change. The fitness functions collectively denote what matters to us in our architecture, allowing us to make the kinds of trade-off decisions that are both crucial and vexing during the development of software systems.

You may think, “Wait! We’ve been running code metrics as part of continuous integration for years—​this isn’t new!” You would be correct: the idea of validating parts of software as part of an automated process is as old as automation. However, we formerly considered all the different architecture verification mechanisms as separate—​code quality versus DevOps metrics versus security, and so on. Fitness functions unify many existing concepts into a single mechanism, allowing architects to think in a uniform way about many existing (often ad hoc) “nonfunctional requirements” tests. Collecting important architecture thresholds and requirements as fitness functions allows for a more concrete representation for previously fuzzy, subjective evaluation criteria. We leverage a large number of existing mechanisms to build fitness functions, including traditional testing, monitoring, and other tools. Not all tests are fitness functions, but some tests are—​if the test helps verify the integrity of architectural concerns, we consider it a fitness function.

Categories

Fitness functions exist across a variety of categories related to their scope, cadence, result, invocation, proactivity, and coverage.

Scope: Atomic Versus Holistic

Atomic fitness functions run against a singular context and exercise one particular aspect of the architecture. An excellent example of an atomic fitness function is a unit test that verifies some architectural characteristic, such as modular coupling (we show an example of this type of fitness function in Chapter 4). Thus, some application-level testing falls under the heading of fitness functions, but not all unit tests serve as fitness functions—​only the ones that verify architecture characteristic(s). The example in Figure 2-3 represents an atomic fitness function: it checks only for the presence of cycles between components.

For some architectural characteristics, developers must test more than each architectural dimension in isolation. Holistic fitness functions run against a shared context and exercise a combination of architectural aspects. Developers design holistic fitness functions to ensure that combined features that work atomically don’t break in real-world combinations. For example, imagine an architecture has fitness functions around both security and scalability. One of the key items the security fitness function checks is staleness of data, and a key item for the scalability tests is number of concurrent users within a certain latency range. To achieve scalability, developers implement caching, which allows the atomic scalability fitness function to pass. When caching isn’t turned on, the security fitness function passes. However, when run holistically, enabling caching makes data too stale to pass the security fitness function, and the holistic test fails.
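The caching conflict described above can be sketched in code. Everything here is illustrative: the stand-in "measurements" and thresholds are invented to reproduce the scenario in which each atomic check passes in its preferred configuration, yet no configuration passes both together.

```java
// Sketch of a holistic fitness function: atomic checks for security
// (data staleness) and scalability (latency) each pass in isolation,
// but must also pass together against one shared configuration.
// All numbers are stand-ins for real measurements.
public class HolisticFitnessFunction {

    static final long MAX_STALENESS_SECONDS = 60;
    static final long MAX_LATENCY_MILLIS = 200;

    // With caching on, latency drops but cached data ages.
    static long observedLatencyMillis(boolean cachingEnabled) {
        return cachingEnabled ? 50 : 500;
    }

    static long observedStalenessSeconds(boolean cachingEnabled) {
        return cachingEnabled ? 300 : 0;
    }

    static boolean securityPasses(boolean caching) {
        return observedStalenessSeconds(caching) <= MAX_STALENESS_SECONDS;
    }

    static boolean scalabilityPasses(boolean caching) {
        return observedLatencyMillis(caching) <= MAX_LATENCY_MILLIS;
    }

    // The holistic check runs both concerns against the same configuration.
    static boolean holisticPasses(boolean caching) {
        return securityPasses(caching) && scalabilityPasses(caching);
    }
}
```

Note that `securityPasses(false)` and `scalabilityPasses(true)` each succeed on their own, but `holisticPasses` fails for both configurations, which is exactly the interaction a holistic fitness function exists to surface.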

We obviously cannot test every possible combination of architecture elements, so architects use holistic fitness functions selectively to test important interactions. This selectivity and prioritization also allows architects and developers to assess the difficulty in implementing a particular testing scenario, thus allowing an assessment of how valuable that characteristic is. Frequently, the interactions between architectural concerns determine the quality of the architecture, which holistic fitness functions address.

Cadence: Triggered Versus Continual Versus Temporal

Execution cadence is another distinguishing factor between fitness functions. Triggered fitness functions run based on a particular event, such as a developer executing a unit test, a deployment pipeline running unit tests, or a QA person performing exploratory testing. This encompasses traditional testing, such as unit, functional, and behavior-driven development (BDD) testing, among others.

Continual tests don’t run on a schedule but instead execute constant verification of architectural aspect(s), such as transaction speed. For example, consider a microservices architecture in which the architects want to build a fitness function around transaction time—​how long it takes for a transaction to complete, on average. Building any kind of triggered test provides sparse information about real-world behavior. Thus, architects build a continual fitness function that simulates a transaction in production while all the other real transactions run, often using a technique called synthetic transactions. This allows developers to verify behavior and gather real data about the system “in the wild.”
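A continual check of this kind might be sketched as follows. The plumbing that fires the synthetic transaction alongside real traffic is elided, and the window size and budget are illustrative assumptions; the sketch shows only the verification logic that runs as each synthetic transaction completes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of a continual fitness function: a synthetic transaction runs
// alongside real production traffic, and the rolling average of its
// duration must stay within budget. Numbers are illustrative.
public class SyntheticTransactionMonitor {

    static final long BUDGET_MILLIS = 300;
    static final int WINDOW = 5;

    static final Deque<Long> recentDurations = new ArrayDeque<>();

    // Called each time a synthetic transaction completes.
    static void record(long durationMillis) {
        recentDurations.addLast(durationMillis);
        if (recentDurations.size() > WINDOW) {
            recentDurations.removeFirst();
        }
    }

    // The continual fitness function: rolling average must fit the budget.
    static boolean healthy() {
        return recentDurations.stream().mapToLong(Long::longValue)
                .average().orElse(0) <= BUDGET_MILLIS;
    }
}
```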

Notice that using a monitoring tool does not by itself imply that you have a fitness function, which must have objective outcomes. Rather, adding an alarm that fires when the metric deviates from its objective measure is what converts mere monitoring into a fitness function.

Monitoring-driven development (MDD) is another testing technique gaining popularity. Rather than relying solely on tests to verify system results, MDD uses monitors in production to assess both technical and business health. These continual fitness functions are necessarily more dynamic than standard triggered tests and fall into the broader category called fitness function-driven architecture, discussed in more detail in Chapter 7.

While most fitness functions trigger either on change or continually, in some cases architects may want to build a time component into assessing fitness, leading to a temporal fitness function. For example, if a project uses an encryption library, the architect may want to create a temporal fitness function as a reminder to check if important updates have been performed. Another common use of this type of fitness function is a break upon upgrade test. In platforms like Ruby on Rails, some developers can’t wait for the tantalizing new features coming in the next release, so they add a feature to the current version via a back port, a custom implementation of a future feature. Problems arise when the project finally upgrades to the new version because the back port is often incompatible with the “real” version. Developers use break upon upgrade tests to wrap back-ported features to force re-evaluation when the upgrade occurs.
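A break upon upgrade test can be sketched very simply. The version string, comparison scheme, and target release number below are simplified assumptions; real frameworks may need more careful version parsing.

```java
// Sketch of a break upon upgrade temporal fitness function: the test
// fails once the framework reaches the version that ships the "real"
// feature, forcing re-evaluation of the back port. The target major
// version and the versioning scheme are illustrative assumptions.
public class BackPortGuard {

    // The release expected to include the real implementation.
    static final int FEATURE_ARRIVES_IN_MAJOR = 7;

    static boolean backPortStillSafe(String frameworkVersion) {
        int major = Integer.parseInt(frameworkVersion.split("\\.")[0]);
        return major < FEATURE_ARRIVES_IN_MAJOR;
    }
}
```

A test asserting `backPortStillSafe` passes silently for years, then breaks on the first build after the upgrade, at exactly the moment the back port needs human attention.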

Another common use of a temporal fitness function comes from an important but not urgent requirement that arises on virtually every project eventually. Many developers have experienced the pain of upgrading across more than one major version of a core framework or library their project depends upon—​so many changes occur between major releases that it’s often quite difficult to leap versions. However, upgrading a core framework is time-consuming and not deemed as critical, making it more likely to accidentally slip too far behind. Architects can use a temporal fitness function in conjunction with a tool like Dependabot or Snyk, which track releases, versions, and security patches for software, to create increasingly insistent reminders to upgrade once the corporate criteria (e.g., first patch release) have been met.

Case Study: Triggered or Continuous?

Often the choice of continuous versus triggered fitness function comes down to trade-offs between the approaches. Many developers in distributed systems such as microservices want the same kind of dependency check but on allowed communication between services rather than cycles. Consider the set of services illustrated in Figure 2-4, a more advanced version of the cyclic dependency fitness function shown in Figure 2-3.

Figure 2-4. Set of orchestrated microservices, where communication should not exist between nonorchestrator services

In Figure 2-4, the architect has designed the system so that the orchestrator service contains the state of the workflow. If any of the services communicates with each other, bypassing the orchestrator, the team won’t have accurate information about the workflow state.

In the case of dependency cycles, metrics tools exist to allow architects to do compile-time checks. However, services aren’t constrained to a single platform or technology stack, making it highly unlikely that someone has already built a tool that exactly matches a particular architecture. This is an example of what we alluded to earlier—​often, architects must build their own tools rather than rely on third parties. For this particular system, the architect can build either a continuous or a triggered fitness function.

In the continuous case, the architect must ensure that each of the services provides monitoring information (typically via a particular port) that broadcasts who the service calls during the course of workflows. Either the orchestrator service or a utility service monitors those messages to ensure that illegal communication doesn’t occur. Alternatively, rather than using monitors, the team could use asynchronous message queues, have each domain service publish a message to the queue indicating collaboration messages, and allow the orchestrator to listen to that queue and validate collaborators. This fitness function is continuous because the receiving service can react immediately to disallowed communication. For example, perhaps this fault indicates a security concern or other detrimental side effect.

The benefit of this version of the fitness function is immediate reaction: architects and other interested parties know immediately when governance has been violated. However, this solution adds runtime overhead: monitors and/or message queues require operational resources, and this level of observability may have a negative impact on performance, scalability, and so on.

Alternatively, the team may decide to implement a triggered version of this fitness function. In this case, on a regular cadence, the deployment pipeline calls a fitness function that harvests logfiles and investigates communication to determine if it is all appropriate. We show an implementation of this fitness function in “Communication Governance in Microservices”. The benefit of this fitness function is lack of possible runtime impact—​it runs only when triggered and looks at log records. However, teams shouldn’t use a triggered version for critical governance issues such as security where the time lag may have negative impacts.
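The verification step of the triggered variant might look like the following sketch. The log format ("caller->callee") and the service names are illustrative assumptions, not the format of any real logging tool; in practice, the harvesting code would parse whatever the platform's logs actually emit.

```java
import java.util.List;

// Sketch of a triggered governance fitness function: harvest call
// records from logfiles and verify that every collaboration involves
// the orchestrator. The "caller->callee" log format is an assumption.
public class CommunicationGovernance {

    static final String ORCHESTRATOR = "orchestrator";

    static boolean allCallsLegal(List<String> logLines) {
        return logLines.stream().allMatch(line -> {
            String[] parts = line.split("->");
            return parts[0].equals(ORCHESTRATOR) || parts[1].equals(ORCHESTRATOR);
        });
    }
}
```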

As in all things in software architecture, the decision between triggered and continuous fitness functions will often provide different trade-offs, making this a case-by-case decision.

Result: Static Versus Dynamic

Static fitness functions have a fixed result, such as the binary pass/fail of a unit test. This type encompasses any fitness function that has a predefined desirable value: binary, a number range, set inclusion, and so on. Metrics are often used for fitness functions. For example, an architect may define acceptable ranges for average cyclomatic complexity of methods in the codebase.

Dynamic fitness functions rely on a shifting definition based on extra context, often real-time content. For example, consider a fitness function to verify scalability along with request/response responsiveness for a number of users. As the number of concurrent users rises, the architects will allow responsiveness to degrade slightly, but they don’t want it to degrade past the point where it will become a problem. Thus, a responsiveness fitness function will take into account the number of concurrent users and adjust the evaluation accordingly.

Notice that dynamic and objective do not conflict—​fitness functions must evaluate to an objective outcome, but that evaluation may be based on dynamic information.
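The responsiveness example above can be sketched as follows. The scaling formula and all the numbers are illustrative assumptions; what matters is that the acceptable value is computed from current context rather than fixed, yet the evaluation itself remains objective.

```java
// Sketch of a dynamic fitness function: the acceptable response time is
// a function of current load rather than a fixed constant. The scaling
// formula and numbers are illustrative assumptions.
public class DynamicResponsivenessFitnessFunction {

    // Base budget of 100 ms, relaxed by 1 ms per 100 concurrent users,
    // but never beyond a hard ceiling of 250 ms.
    static long allowedMillis(int concurrentUsers) {
        return Math.min(250, 100 + concurrentUsers / 100);
    }

    static boolean responsive(long observedMillis, int concurrentUsers) {
        return observedMillis <= allowedMillis(concurrentUsers);
    }
}
```

For instance, a 120 ms response fails under light load but passes at 5,000 concurrent users, where the budget has relaxed to 150 ms.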

Invocation: Automated Versus Manual

Architects like automated things—​part of incremental change includes automation, which we delve into deeply in Chapter 3. Thus, it’s not surprising that developers will execute most fitness functions within an automated context: continuous integration, deployment pipelines, and so on. Indeed, developers and DevOps have performed a tremendous amount of work under the auspices of Continuous Delivery to automate many parts of the software development ecosystem previously thought impossible.

However, as much as we’d like to automate every single aspect of software development, some parts of software development resist automation. Sometimes architects cannot automate away a critical dimension within a system, such as legal requirements or exploratory testing, which leads to manual fitness functions. Similarly, a project may have aspirations to become more evolutionary but not yet have appropriate engineering practices in place. For example, perhaps most QA is still manual on a particular project and must remain so for the near future. In both of these cases (and others), we need manual fitness functions that are verified by a person-based process.

The path to better efficiency eliminates as many manual steps as possible, but many projects still require manual procedures. We still define fitness functions for those characteristics and verify them using manual stages in deployment pipelines (covered in more detail in Chapter 3).

Proactivity: Intentional Versus Emergent

While architects will define most fitness functions at project inception as they elucidate the characteristics of the architecture, some fitness functions will emerge during development of the system. Architects never know all the important parts of the architecture at the beginning (the classic unknown unknowns problem we address in Chapter 7), and thus must identify fitness functions as the system evolves. Architects write intentional fitness functions at project inception and as part of a formal governance process, sometimes in collaboration with other architect roles such as enterprise architects.

Fitness functions not only verify the initial assumptions by architects on projects, but they also provide ongoing governance. Thus, it’s common for architects to notice some behavior that would benefit from better governance, leading to an emergent fitness function. Architects should keep a wary eye open for misbehaviors in a project, especially ones that can be verified via fitness functions, and add such functions aggressively.

These two sometimes form a spectrum, beginning as intentional protection for some aspect but evolving into a more nuanced or even different fitness function over time. Just like unit tests, fitness functions become part of the team’s codebase. Thus, as architectural requirements change and evolve, the corresponding fitness functions must change similarly.

Coverage: Domain-Specific Fitness Functions?

We are sometimes asked if some particular problem domains tend toward certain architectural fitness functions. While nothing is impossible in software architecture and you might use the same automated testing framework to implement some fitness functions, generally fitness functions are used only for abstract architectural principles, not for the problem domain. What we see in practice if you use the same test automation tools is a separation of tests. One set of tests will focus on testing domain logic (e.g., traditional unit or end-to-end tests) and another set of tests on fitness functions (e.g., performance or scalability tests).

This separation is practical: it avoids duplicated and misguided effort. Remember, fitness functions are another verification mechanism in projects and are meant to coexist alongside other (domain) verifications. To avoid duplicating efforts, teams are wise to keep fitness functions to pure architecture concerns and allow the other verifications to handle domain issues. For example, consider elasticity, which describes a website’s ability to handle sudden bursts of users. Notice that we can talk about elasticity in purely architectural terms—​the website in question could be a gaming site, a catalog site, or a streaming movie site. Thus, this part of the architecture is governed by a fitness function. In contrast, if a team needed to verify something like a change of address, that requires domain knowledge and would fall to traditional verification mechanisms. Architects can use this as a litmus test to determine where the verification responsibility lies.

Thus, even within common domains (such as finance), it is difficult to predict a standard set of fitness functions. What each team ultimately views as important and valuable varies to an annoyingly wide degree between teams and projects.

Who Writes Fitness Functions?

Fitness functions represent the architectural analog to unit tests and should be treated similarly in terms of development and engineering practices. In general, architects write fitness functions as they determine the objective measures for important architecture characteristics. Both architects and developers maintain the fitness functions, including preserving a passing state at all times—​passing fitness functions are an objective measure of an architecture’s fitness.

Architects must collaborate with developers in the definition and understanding of both the purpose and utility of fitness functions, which add an extra layer of verification to the overall quality of the system. As such, they will occasionally fail as changes violate governance rules—​a good thing! However, developers must understand the purpose of the fitness function so that they can repair the fault and continue the build process. Collaboration between the two roles is critical so that developers don’t view the governance as a burden rather than as a useful constraint that preserves important features.

Tip

Keep knowledge of key and relevant fitness functions alive by posting the results of executing fitness functions somewhere visible or in a shared space so that developers remember to consider them in day-to-day coding.

Where Is My Fitness Function Testing Framework?

For testing the problem domain, developers have a wide variety of platform-specific tools because the domain is purposefully written in a particular platform/technology stack. For example, if the primary language is Java, developers can choose from a wide array of unit, functional, user acceptance, and other testing tools and frameworks. Naturally, architects look for the same level of “turnkey” support for architecture fitness functions—which generally doesn’t exist. We cover a few easy-to-download-and-run fitness function tools in Chapter 4, but such tools are sparse compared to domain testing libraries. This is due mostly to the highly varied nature of fitness functions, as illustrated in Figure 2-1: operational fitness functions require monitoring tools, security fitness functions require scanning tools, quality checks require code-level metrics, and so on. In many cases, a particular tool doesn’t exist for your particular blend of architectural forces. However, as we illustrate in future chapters, architects can use a bit of programming “glue” to compose useful fitness functions with little effort, just not as little as downloading a prebuilt framework.

Outcomes Versus Implementations

It is important for architects to focus on the outcomes—​the objective measures for architecture characteristics—​rather than implementation details. Architects often write fitness functions in technology stacks other than the main domain platform, or utilize DevOps tools or any other convenient process that enables them to objectively measure something of interest. The word function in the term fitness function is an apt metaphor: a function takes inputs and produces outputs without side effects. Similarly, a fitness function measures an outcome—an objective evaluation of some architecture characteristic.

Throughout the book, we show examples of fitness function implementations, but it is important for readers to focus on the outcome and why we measure something rather than how an architect makes a particular measurement.

Although software architects are interested in exploring evolutionary architectures, we aren’t attempting to model biological evolution. Theoretically, we could build an architecture that randomly changed one of its bits (mutation) and redeployed itself. After a few million years, we would likely have a very interesting architecture. However, we don’t have millions of years to wait.

We want our architecture to evolve in a guided way, so we place constraints on different aspects of the architecture to rein in undesirable evolutionary directions. A good example is dog breeding: by selecting the characteristics we want, we can create a vast number of differently shaped canines in a relatively short amount of time.

We can also think about the system-wide fitness function as a collection of fitness functions with each function corresponding to one or more dimensions of the architecture. Using a system-wide fitness function aids our understanding of necessary trade-offs when individual elements of the fitness function conflict with one another. As is common with multiobjective optimization problems, we might find it impossible to optimize all values simultaneously, forcing us to make choices. For example, in the case of architectural fitness functions, issues like performance might conflict with security due to the cost of encryption. This is a classic example of the bane of architects everywhere—​the trade-off. Trade-offs dominate much of an architect’s headaches during the struggle to reconcile opposing forces, such as scalability and performance. However, architects have a perpetual problem of comparing these different characteristics because they fundamentally differ (an apples to oranges comparison) and all stakeholders believe their concern is paramount. System-wide fitness functions allow architects to think about divergent concerns using the same unifying mechanism of fitness functions, capturing and preserving the important architectural characteristics. The relationship between the system-wide fitness function and its constituent smaller fitness functions is illustrated in Figure 2-5.

Figure 2-5. System-wide versus individual fitness functions

The system-wide fitness function is crucial for an architecture to be evolutionary, as we need some basis to allow architects to compare and evaluate architectural characteristics against one another. Unlike with the more directed fitness functions, architects likely will never try to “evaluate” the system-wide fitness function. Rather, it provides guidelines for prioritizing decisions about the architecture in the future. While fitness functions may not help resolve the trade-off, they help architects more clearly understand the forces at play, with objective measures, so that they can reason about the necessary system-wide trade-offs.

A system is never the sum of its parts. It is the product of the interactions of its parts.

Dr. Russell Ackoff

Without guidance, evolutionary architecture becomes simply reactionary architecture. Thus, for architects, a crucial early architectural decision for any system is to define important dimensions such as scalability, performance, security, data schemas, and so on. Conceptually, this allows architects to weigh each fitness function according to how important its dimension is to the system’s overall behavior.

Summary

The original seed of the idea of applying fitness functions to software architecture occurred to Rebecca when she realized she could apply her experience from another technical domain, evolutionary computing, to software. Architects have verified parts of architecture forever, but they haven’t previously unified all the different verification techniques into a single overarching concept. Treating all these different governance tools and techniques as fitness functions allows teams to unify around execution.

We cover more aspects of operationalizing fitness functions in the next chapter.
