Chapter 7. Trusting Applications

Marc Andreessen, a notable Silicon Valley investor, famously declared that “software is eating the world.” In many ways, this statement has never been truer. It is the software running in your datacenter that makes all of the magic happen, and as such, it is no secret that we wish to trust its execution.

Code running on a trusted device will be faithfully executed; this makes a trusted device, which we covered in Chapter 5, a prerequisite for trusting code. However, even with the execution environment secured, there is more work to do before we can trust that the code running on a device is itself trustworthy.

As such, trusting the device is just half of the story. One must also trust the code and the programmers who wrote it. Since the goal is to ensure the integrity of a running application, we must find ways to extend this human trust from the code itself all the way to its actual execution.

Establishing trust in code requires that:

  • The people producing the code are themselves trusted
  • The code was faithfully processed to produce a trustworthy application
  • Trusted applications are faithfully deployed to the infrastructure to be run
  • Trusted applications are continually monitored for attempts to coerce the application with malicious actions

This chapter will discuss approaches to securing each of these steps, with a focus on the inheritance of trust from human to production application.

Understanding the Application Pipeline

The creation, delivery, and execution of code within a computer system is a very sensitive chain of events. These systems are an attractive target for adversaries, since compromising them can grant broad access downstream. Attack vectors exist at every step, and subversion at these stages can be very difficult to detect. Therefore, we must work to ensure that every link of this chain (shown in Figure 7-1) is secured in a way that makes subversion detectable.

This process is analogous to physical supply chain security, in which governments around the world invest heavily. Ensuring that military equipment is securely built and sourced is critical to the effectiveness of the fighting force, and software creation and delivery is no different.

Supply Chain Criticality

In 2007, the Israeli government conducted an airstrike against a suspected nuclear facility in Syria. One of many mysteries surrounding this strike is the sudden failure of Syrian radar systems, providing the Israelis with cover. The failure of these radar systems, which were supposedly state of the art, is now widely believed to be attributable to a hardware kill switch hidden in a commercial chip used by the radar equipment. While never fully verified, stories like this one highlight the importance of secure supply chains, whether it be hardware or software.

In support of a secure software delivery chain, every step of the process should be fully auditable with cryptographic validation occurring at each critical point. Generally speaking, these steps can be broken down into four distinct phases:

  • Source code
  • Build/compilation
  • Distribution
  • Execution

Let’s start with trusting the source code itself.

Figure 7-1. A build pipeline depends on both the security of the engineers creating source and configuring the system, as well as the security of the components of the pipeline

Trusting Source

Source code is the first step in running any piece of software. To put it very simply, it's difficult to trust source code that is written by an untrusted human. Even with careful code auditing, it is still possible for a malicious developer to purposefully encode (and hide!) a vulnerability in plain sight. In fact, there is even a well-known competition dedicated to this dark art. While even well-meaning developers can inadvertently introduce weaknesses into an application, a zero trust network focuses on identifying malicious activity rather than removing trust from those users.

Setting the trusted developer problem aside for a minute, we still face the problem of securely storing and distributing the source code itself. Typically, source code is stored in a centralized code repository, against which many developers interact and commit work. These repositories must also fall under tight control, particularly if they are being used directly by systems that build/compile the code in question.

Securing the Repository

Traditional security approaches remain effective for securing a software repository, and they do not preclude the addition of more advanced security features. This includes basic principles such as the principle of least access, whereby users are given only as much access to the repository as is required to complete the task at hand. In practice, this usually manifests as heavily restricted write access.

While this approach is still valid and recommended, the story has changed a little bit with the introduction of distributed source control. With the code repository living in multiple places, it is not always possible to secure a single, centralized entity. In this circumstance, however, there remains an analog for this centralized repository—the system storing the code from which the build system reads.

In this case, it is still highly desirable to protect this system through traditional means; however, the problem becomes more difficult since code can enter the distributed repository in any number of ways. The logical extension, then, is that securing the build source repository alone is not enough.

Authentic Code and the Audit Trail

Many version control systems (VCS), particularly those which are distributed, store source history using cryptographic techniques. This approach, called content addressable storage, uses the cryptographic hash of the content being stored as the identifier of that object in a database, rather than its location or coordinates. It’s possible to see how a source file could be hashed and stored in such a database, thereby ensuring that any change in the source file results in a new hash. This property means that files are stored immutably: it’s impossible to change the contents of the files once stored.

Some VCS systems take this storage mechanism a step further by storing the history itself as an object in the content addressable database. Git, a popular distributed VCS project, stores the history of commits to the repository as a directed acyclic graph (DAG). The commits are objects in the database, storing details like the commit time, author, and identifiers of ancestor commits. By storing the cryptographic hashes of ancestor commits on each commit itself, we form a Merkle tree, which allows one to cryptographically validate that the chain of commits is unmodified (Figure 7-2).

If a commit in the DAG were modified, the change would cascade to all of its descendant commits in the graph, altering each commit's content and, by extension, its identifier. With the source history distributed to many contributors, the system gains another beneficial property: it's impossible to change the history without other contributors noticing.

Figure 7-2. Git’s database makes unwanted changes difficult, since objects are referenced using a hash of their contents
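To make the idea concrete, here is a minimal Python sketch of content-addressable storage and hash chaining. It is illustrative only and does not mirror Git's actual object format; the store and commit helpers are assumptions made for the example.

import hashlib

def store(db, content: bytes) -> str:
    # Content addressing: the object's identifier is the hash of its own bytes.
    key = hashlib.sha256(content).hexdigest()
    db[key] = content
    return key

def commit(db, tree_id: str, parent_id: str, author: str, message: str) -> str:
    # A commit references its tree and its parent by hash, forming a Merkle chain.
    body = f"tree {tree_id}\nparent {parent_id}\nauthor {author}\n\n{message}"
    return store(db, body.encode())

db = {}
blob = store(db, b"print('hello')\n")
c1 = commit(db, blob, "", "alice", "initial commit")
c2 = commit(db, blob, c1, "bob", "add feature")

# Any change to the first commit produces a different identifier, so c2 (which
# embeds c1's hash) no longer points at the altered object; the history cannot
# be rewritten without every descendant, and every contributor's copy, changing.
tampered = commit(db, blob, "", "mallory", "initial commit")
assert tampered != c1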

Storing the DAG in this manner gives us tamper-proof history: it's impossible to change the history subversively. However, this storage does nothing to ensure that new commits in the history are authorized and authentic. Imagine for a moment that a trusted developer is persuaded to pull a malicious commit into their local repository before pushing it to the official repository. That commit now enters the repository on the strength of the trusted developer's push access. Even more concerning, the authorship metadata is just plain text: a malicious committer can put whatever details they want in that field (a fact that was used amusingly to make commits appear to be authored by Linus Torvalds on GitHub).

To guard against this attack vector, Git has the ability for commits and tags to be signed using the GPG key of a trusted developer. Tags, which point to the head commit in a particular history, can be signed using a GPG key to ensure the authenticity of a release. Signed commits allow one to go a step further and authenticate the entire Git history, making it impossible for an attacker to impersonate another committer without first stealing that committer’s GPG key.

Signed source code clearly provides significant benefit and should be used wherever possible. It provides robust code authentication not just to humans, but to machines as well. This is especially important if CI/CD systems build and deploy the code automatically. A fully signed history allows build systems to cryptographically authenticate the code as trusted before compiling it for deployment.
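As a sketch of what that authentication might look like, a build job could shell out to Git's own verification commands before compiling anything. The tag name in the usage comments is hypothetical; the point is simply that verification happens before the build, not after.

import subprocess

def assert_signed_commit(ref: str = "HEAD") -> None:
    # git verify-commit exits non-zero if the signature is missing or does not
    # verify against a key trusted on the build host.
    subprocess.run(["git", "verify-commit", ref], check=True)

def assert_signed_tag(tag: str) -> None:
    # The same check for an annotated, signed release tag.
    subprocess.run(["git", "verify-tag", tag], check=True)

# A CI job might gate its build step on these checks, e.g.:
#   assert_signed_commit("HEAD")
#   assert_signed_tag("v1.4.2")   # hypothetical release tag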

In the Beginning, There Was Nothing

Many repositories begin with unsigned commits, transitioning to signed commits later on. In this brownfield case, the first commit to be signed is essentially endorsing all commits that came before it. This is important to understand, as you may wish to perform an audit at this time. Having said that, the overhead or difficulty of performing such an audit should not dissuade or delay the transition to signed code; the audit, if you choose to do one, can be performed in due time.

Code Reviews

As we learned in Chapter 6, it can be dangerous to concentrate powerful capabilities onto a single user. This is no different when considering source code contributions. Signed contributions enable us to authenticate the developer committing the code, but do not ensure that the code being committed is correct or safe. Of course, we do place a nontrivial amount of trust in the developer, though this does not mean that said developer should unilaterally commit code to sensitive projects.

To mitigate this risk, most mature organizations implement a code review process. Under code review, all contributions must be approved by one or more additional developers. This simple process not only drastically improves the quality of the software, but also reduces the rate at which vulnerabilities are introduced, whether they be intentional or accidental.

Trusting Builds

Build servers are frequently targeted by persistent threats, and for good reason. They have elevated access, and produce code that is executed directly in production. Detecting artifacts that have been compromised during the build stage can be very difficult, so it is important to apply strong protections to these services.

The Risk

In trusting a build system, there are generally three things that we want to assert:

  • The source code it built is the code we intended to build.
  • The build process/configuration is that which we intended.
  • The build itself was performed faithfully, without manipulation.

Build systems can ingest signed code and produce a signed output, but the function(s) applied in between (i.e., the build itself) is generally not protected cryptographically—this is where the most significant attack vector lies.

This particular vector is a powerful one, as shown in Figure 7-3. Without the right processes and validation, subversion of this kind can be difficult or impossible to detect. For instance, imagine a compromised CI/CD system that ingests signed C code, and compiles it into a signed binary, which is then distributed and run in production. Production systems can validate that the binary is signed, but would have no way of knowing if additional malicious code has been compiled in during the build process. In this way, a seemingly secure system can successfully run malicious code in production without detection. Perhaps even worse, the consumers are fooled into thinking the output is safe.

Figure 7-3. The build configuration and its execution is not protected cryptographically, in contrast to the source code and the generated artifact. This break in the chain poses great threat, and is a powerful attack vector.

Due to the sensitive nature of the build process, outsourcing the responsibility should be carefully evaluated. Things like reproducible builds can help identify compromises in this area (more on that in a bit), but can’t always prevent their distribution. Is this really something you want a third-party provider to do for you? How much do you trust them? Their security posture should be weighed against your own chance of being a high value target.

Host Security Is Still Important

This section focuses on securing various steps of the software build process, but it is important to note that the security of the build servers themselves is still important. We can secure the input, output, and configuration of the build, but if the build server is compromised then it can no longer be trusted to faithfully perform its duties. Reproducible builds, immutable hosts, and the zero trust model itself can help in this regard.

Trusted Input, Trusted Output

If we think of the build system as a trusted operation, it’s clear that we need to trust the input of that operation in order to produce trusted output.

Let’s start with trusting the input to the build system. We discussed mechanisms for trusting the source control systems earlier. The build system, as a consumer of the version control system, is responsible for validating the trustworthiness of the source. The version control system should be accessed over an authenticated channel, commonly TLS. Additionally, for extra security guarantees, tags and/or commits should be signed and the build system should validate those signatures—or chain of signatures—before starting a build.

The build configuration is another important input to the build system. Attacking the build configuration could allow an attacker to direct the build system to link against a malicious library. Even seemingly safe optimization flags can be malicious in security critical code, where timing attack mitigation code can be accidentally optimized away. Putting this configuration under source control, where it can be versioned and attested to via signed commits, helps to ensure that the build configuration is also a trusted input.

With the input sufficiently secured, we can turn our attention to the output of the build process. The build system needs to sign the generated artifacts so downstream systems can validate their authenticity. Build systems typically also generate cryptographic hashes of the build artifacts to guard against corruption or malicious attempts to replace the binaries once produced. Securing the build artifacts and hashes, and then distributing them to downstream consumers, completes the trusted output of the build system.
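A minimal sketch of the hashing half of that output might look like the following; signing the resulting manifest (for example, with a release key) is deliberately left out, and the file layout is an assumption.

import hashlib
from pathlib import Path

def sha256_file(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(artifact_dir: Path, manifest: Path) -> None:
    # Record a checksum for every artifact; the manifest itself would then be
    # signed before being handed to the distribution system.
    lines = [f"{sha256_file(p)}  {p.name}"
             for p in sorted(artifact_dir.glob("*")) if p.is_file()]
    manifest.write_text("\n".join(lines) + "\n")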

Reproducible Builds

Reproducible builds are the best tool we have in guarding against subversion of the build pipeline. In short, software supporting reproducible builds is compiled in a deterministic way, ensuring that for a given set of source code, the resulting binary is exactly the same no matter who built it. This is a very powerful property, as it allows multiple parties to examine the source code and produce identical builds, thus gaining confidence that the build process used to generate a particular binary was not tampered with.

This can be done in a number of ways, but it generally involves a codified build process, and enables developers to set up their own build environment to produce binaries that match the distributed versions bit-for-bit. With reproducible builds, one can “watch” the output of a CI/CD system, and compare its output to results compiled locally. In this way, malicious interference or code injection during the build process can be easily detected. When combined with signed source code, we arrive at a fairly robust process that is able to authenticate both the source code and the binary produced by it.
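The verification side can be as simple as a hash comparison between an independently reproduced binary and the one the pipeline published; the file paths here are assumptions for illustration.

import hashlib
from pathlib import Path

def digest(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

# With a reproducible build, a locally compiled binary should be bit-for-bit
# identical to the artifact the CI/CD system distributed.
def matches_official(local_build: Path, official_artifact: Path) -> bool:
    return digest(local_build) == digest(official_artifact)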

Virtualized Build Environments Enable Reproducible Builds

Having reproducible builds sounds easy on paper, but reproducing a built binary so it’s byte for byte identical is a very hard problem. Distributions have historically built packages inside a virtual filesystem (a chroot jail) to ensure that all dependencies of the build are captured in the build configuration. Virtual machines or containers can be useful tools to ensure that the build environment is fully insulated from the host running the build.

Decoupling Release and Artifact Versions

Immutable builds are critical in ensuring the security of a build and release system. Without immutability, a known good version can be replaced, opening the door to attacks that target the underlying build artifact. This would enable an attacker to masquerade a "bad" version as a "good" version. For this reason, artifacts generated by build systems should have Write Once Read Many semantics.

Given the immutable artifact requirement, a natural tension arises with the versioning of those artifacts. Many projects prefer to use meaningful version numbers (e.g., semantic versioning) in their releases to communicate the potential impact to downstream consumers with an upgrade of their software. This desire to attach meaning to the version number can be difficult to incorporate into a build system that needs to ensure that every version is immutable.

For example, when working toward a major release, a project might have a misconfigured build that causes the build system to produce incorrect output. The maintainers now face a choice: republish the release with a patch-level version bump, or bend the rules and republish the same version number with a new build artifact. Many projects choose the latter, preferring a cleaner marketing story over the more correct patch-level bump. This is a bad habit to get into when considering the masquerade just described.

It’s clear from this example that in either case, two separate build artifacts were produced, and the version number associated with the build artifact is a separate choice for the project. Therefore, when creating a build system, it’s better to have the build system produce immutable versions independent of the publicly communicated version. A later system (the distribution system) can manage the mapping of release versions to build artifact versions. This approach enables us to maintain immutable build artifacts without sacrificing usability or introducing bad security practices.
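A rough sketch of this decoupling follows. The in-memory dictionaries, build IDs, and version strings are all hypothetical, and a real system would use write-once storage, but the shape of the relationship is the same: artifacts are immutable, and a release is just a pointer to one of them.

artifacts = {}   # build_id -> artifact digest; write-once, never rewritten
releases = {}    # public version -> build_id; a release points at an artifact

def record_build(build_id: int, artifact_digest: str) -> None:
    if build_id in artifacts:
        raise ValueError("build IDs are write-once; produce a new build instead")
    artifacts[build_id] = artifact_digest

def promote(version: str, build_id: int) -> None:
    # Promotion assigns a public version to an existing, unmodified artifact.
    if version in releases:
        raise ValueError(f"{version} already released; bump the version instead")
    releases[version] = build_id

record_build(1041, "sha256 digest of build 1041")   # hypothetical values
record_build(1042, "sha256 digest of build 1042")
promote("2.1.0", 1042)   # build 1041 was flawed; it is never mutated, just not promoted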

Trusting Distribution

The process of choosing which build artifacts to deliver to downstream consumers is called distribution. The build system produces many artifacts, some of which are meant for downstream consumption. Therefore, we need to ensure that the distribution system maintains control over which artifacts are ultimately delivered.

Promoting an Artifact

Based on our earlier discussion of immutable build artifacts, promotion is the act of designating a build artifact as the authoritative version without changing the contents of that artifact. This act itself should be immutable: once a version is assigned and released, it cannot be changed. Instead, a new artifact must be produced and released under an incrementally higher version number.

This constraint presents a chicken-and-egg scenario. Software typically includes a way to report its version number to the user, but if the version number isn’t assigned until later in the build process, how does one add that version information without changing the build artifact?

A naive approach would be to subtly change the artifact during the promotion process, for example, by writing the version number into an easily modified location in the build artifact. This approach, however, is not preferred. Instead, release engineers should make a clear separation between the publicly released version number and the build number, which is an extra component of the release information. With this model, many build artifacts are produced using the same public release version, but each build is additionally tagged with a unique build number (Figure 7-4). The act of releasing that version is therefore choosing the build artifact that will be signed and distributed. Once such a version is released, all new builds should be configured to use the next target version number.

Figure 7-4. This Firefox public release version is 51.0.1, but the package name retains a build ID

Of course, this promotion must be communicated to the consumer in a way that they can validate they are in possession of the promoted build, and not some intermediary and potentially flawed build. There are a number of ways to do this, and it is largely a solved problem. One solution is to sign the promoted artifacts with a release-only key, thus communicating to the consumers that they have a promoted build. Another way to do this is to publish a signed manifest, outlining the released versions and their cryptographic hashes. Many popular package distribution systems, such as APT, use this method to validate builds obtained from their distribution systems.

Distribution Security

Software distribution is similar to electricity distribution, where electricity is generated by a centralized source and carried over a distribution network in order to be delivered to a wide consumer base. Unlike electricity, however, the integrity of the produced software must be protected while it transits the distribution system, and the consumer must be able to independently validate that integrity. There are a number of widely adopted package distribution and management systems, practically all of which have implemented protections around the distribution process and allow consumers to validate the authenticity of packages received through them. Throughout this section, we will use the popular package management software Advanced Packaging Tool (APT) as an example of how certain concepts are implemented in real life, though it is important to keep in mind that there are many options available to you; APT is merely one.

Integrity and Authenticity

There are two primary mechanisms used to assert integrity and authenticity in software distribution systems: hashing and signing. Hashing a software release involves computing and distributing a cryptographic hash representing the binary released, which the consumer can validate to ensure that the binary has not been changed since it left the hands of the developer. Signing a release involves the author encrypting the hash of the release with their private key, allowing consumers to validate that the software was released by an authorized party. Both methods are effective, and are not necessarily mutually exclusive. In order to better understand how these methods can be applied in a distribution system, it is useful to look at the structure and security of an APT repository.

An APT repository contains three types of files: a Release file, a Packages file, and the packages themselves. The Packages file acts as an index for all of the packages in the repository. It stores a bit of metadata on every package the repository contains, such as filenames, descriptions, and checksums. The checksum from this index is used to validate the integrity of a downloaded package before it is installed. This provides integrity, assuring us that the contents have not changed in flight. It is, however, mostly only effective against corruption, since an attacker can simply modify the index hashes if the goal is to deliver modified software. This is where the Release file comes in.

The Release file contains metadata about the repo itself (as opposed to the Packages file, which stores metadata about the packages contained within it). This includes things like the name and version of the OS distribution the repo is meant for. It also includes a checksum of the Packages file, allowing the consumer to validate the integrity of the index, which in turn can validate the integrity of the packages we download. That's great, except that an attacker can still simply modify the Release file with the updated hash of the Packages file and be on their way.

So, we introduce cryptographic signatures (Figure 7-5). A signature provides not only integrity for the contents of the signed file (since a hash is included in the signature), but also authenticity, since successful verification of the signature proves that it was generated by a party in possession of the private key.

Using this principle, the maintainer of the software repo signs the Release file with a private key, to which there is a well-known and well-distributed public key. Any time the repo is updated, package file hashes are updated in the index, and the index’s final hash is updated in the Release file, which is then signed. This chain of hashes, the root of which is signed, provides the consumer with the ability to authenticate the software they are about to install.
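A simplified sketch of walking that chain is shown below. It assumes a detached signature for the Release file (as with APT's Release/Release.gpg pair) and takes the expected hashes as arguments; a real client would parse them out of the Release and Packages files themselves.

import hashlib
import subprocess
from pathlib import Path

def sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify_chain(release: Path, release_sig: Path, packages: Path, package: Path,
                 packages_hash_from_release: str, package_hash_from_index: str) -> None:
    # 1. Authenticate the Release file against the maintainer's public key.
    subprocess.run(["gpg", "--verify", str(release_sig), str(release)], check=True)
    # 2. The signed Release file vouches for the Packages index...
    if sha256(packages) != packages_hash_from_release:
        raise ValueError("Packages index does not match the signed Release file")
    # 3. ...and the Packages index vouches for the package itself.
    if sha256(package) != package_hash_from_index:
        raise ValueError("package does not match its entry in the Packages index")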

In the event that you’re unable to sign a software release in some way, it is essential to fall back to standard security practices. You will need to ensure that all communication is mutually authenticated—this means traffic to, from, and in between any distribution repository. Additionally, you’ll need to ensure that the storage the repository leverages is adequately secured, be it AWS S3 or otherwise.

Figure 7-5. The maintainer signs the Release file, which contains a hash of the Packages index, which contains hashes of the packages themselves

Trusting a Distribution Network

When distributing software with a large or geographically disparate consumer base, it is common to copy the software to multiple locations or repositories in order to meet scaling, availability, or performance challenges. These copies are often referred to as mirrors. In some cases, particularly when dealing with publicly consumed software, the servers hosting the mirrors are not under the control of the organization producing the software. This is obviously a concern, and underscores the requirement of a software repo to be authenticated against the author (and not the repo owner).

Referring back to APT’s hashing and signing scheme, it can be seen that we can, in fact, authenticate the Release file against the author using its signature. This means that for every mirror we access, we can check the Release signature to validate that the mirror is in fact a faithful copy of the original release.

One might think that by signing the Release file, software can be distributed through untrusted mirrors safely. Additionally, repositories are often hosted without TLS under the assertion that the signing of the release is sufficient for protecting the distribution network. Unfortunately, both of these assertions are incorrect.

There are several classes of attacks that open up when connecting to an untrusted mirror, despite the fact that the artifact you’re obtaining is ultimately signed. For instance, a downgrade to an older (signed) version can be forced, as the artifact served will still be legitimate. Other attack vectors can include targeting the package management client itself. In the interest of protecting your clients, always make sure they are connecting to a trusted distribution mirror.

The dearth of TLS-protected repositories presents another vulnerability to the distribution of software. Attackers that are in a position to modify the unprotected response could perform the same attacks that an untrusted mirror could. Therefore, the best solution to this problem is moving package distribution to TLS-protected mechanisms. By adding TLS, clients can validate that they are in fact connecting to a trusted repository and that no tampering of the communication can occur.

Humans in the Loop

With a secure pipeline crafted, we can make considered decisions about where humans are involved in that pipeline. By limiting human involvement to a few key points, we keep the release pipeline secure while ensuring that attackers are not able to leverage the pipeline's automation to deliver malicious software.

The ability to commit code to the version control system is a clear spot where humans are involved. Depending on the sensitivity of the project, requiring that humans check in only signed commits provides strong confidence that each commit is authentic.

Once committed, humans needn’t be involved in the building of software artifacts. Those artifacts should ideally be produced automatically in a secured system. Humans should, however, be involved in the process of choosing which artifact is ultimately distributed. This involvement could be implemented using various mechanisms: copying an artifact from the build database to the release database or tagging a particular commit in source control, for example. The mechanism by which humans certify a releasable binary doesn’t much matter, as long as that mechanism is secured.

It's tempting when building secure systems to apply extreme measures to mitigate any conceivable threat, but the burden placed on humans should be balanced against the potential risk. In the case of software that is widely distributed, the private signing key should be well guarded, since the effort of rotating a compromised key would be extreme. Organizations that release software like this will commonly use "code signing ceremonies," where the signing key is stored on a hardware security module (HSM) and unlocked using authorization from multiple parties, as a mitigation against the theft of this highly sensitive key. For internal-use-only software, the effort required to rotate a key is likely lower, so more relaxed security practices are reasonable. An organization might still prefer a code signing ceremony for particularly sensitive internal applications, such as a system that stores credit card details.

Humans and Code Signing Keys

Bit9 is a software security firm that develops application whitelisting software. They had many high-profile clients, from government agencies to Fortune 100 companies. In 2013, an attack against their corporate network recovered one of Bit9's private code signing keys, which was then used to sign malware that was installed at a handful of its customers. It is widely believed that this was done in order to bypass the strong security provided by Bit9's own software, and it underscores the importance of securing code signing keys. If you carry high risk, as Bit9 did, it might be a good idea to employ a code signing ceremony.

Trusting an Instance

Understanding what is running in your infrastructure is important when designing a zero trust network. After all, how can you know what to expect on your network if you don’t know what to expect on your hosts? A solid understanding of the software (and versions) running in your datacenter will go a long way in both breach detection and vulnerability mitigation.

Upgrade-Only Policy

Software version numbers are important constructs for determining exactly which code you are running and how old it is. Perhaps most importantly, they are used heavily to determine which known vulnerabilities a given deployment might be exposed to.

Vulnerability announcements/discoveries are typically associated with a version number (online service vulnerabilities being the exception), and generally include the version numbers in which the vulnerability was fixed. With this in mind, we can see that it might be desirable for an attacker to induce a version downgrade in order to expose a known vulnerability. This is an effective attack vector, as the software being coerced to run is frequently authorized and trusted, since it is a perfectly valid release, albeit an older one.

If the software is built for internal distribution, the distribution system can serve only the latest copy. Doing this prevents a compromised or misconfigured system from pulling down an old version that may contain a known vulnerability. It is also possible to enforce this roll-forward mentality in hardware: Apple iOS famously uses a hardware security chip to validate software updates and to ensure that only signed software built after the currently installed software can be loaded.
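A sketch of such an upgrade-only check, using a simple dotted-version comparison (an assumption; real version schemes are often more involved), might look like this:

def parse(version: str) -> tuple:
    return tuple(int(part) for part in version.split("."))

def allow_install(current: str, candidate: str) -> bool:
    # Upgrade-only policy: never accept anything older than what is running.
    return parse(candidate) >= parse(current)

assert allow_install("2.4.1", "2.5.0")
assert not allow_install("2.4.1", "2.3.9")   # downgrade to a vulnerable release is refused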

Authorized Instances

The importance of knowing what's running is more nuanced than simply knowing the latest version to have been deployed. Many edge cases arise, such as a host that has fallen out of the deployment system: previously authorized, but now "rogue" because it no longer receives updates. In order to guard against cases like this, it's critical that running instances be individually authorized.

It is possible to use techniques described in Chapter 4 to build dynamic network policy in an effort to authorize application instances, but network policy is often host/device oriented rather than application oriented. Instead, we can leverage something much more application-centric in the pursuit of authorizing a running instance: secrets.

Most running applications require some sort of secret in order to do their job. This secret can manifest itself in many ways: an API key, an X509 certificate, or even credentials to a message queue are common examples. Applications must obtain the secret(s) in order to run, and furthermore, the secret must be valid. The validity of a secret (as obvious as it sounds) is the key to authorizing a running application, as with validation comes invalidation.

Attaching a lifetime to a secret is extremely effective in limiting its abuse. By creating a new secret for every deployed instance and attaching a lifetime to the secret, we can assert that we know precisely what is running, since we know precisely how many secrets we have generated, whom we gave them to, and their lifetimes. Allowing secrets to expire mitigates the impact of “rogue” instances by ensuring they will not operate indefinitely.
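The following sketch shows the idea of per-instance, expiring secrets. The in-memory dictionary stands in for a proper secret store, and the names and TTL are assumptions.

import secrets
import time
from typing import Optional

issued = {}   # token -> (instance_id, expiry); a stand-in for a real secret store

def issue(instance_id: str, ttl_seconds: int = 3600) -> str:
    # Mint a unique secret for one deployed instance, valid for a bounded lifetime.
    token = secrets.token_urlsafe(32)
    issued[token] = (instance_id, time.time() + ttl_seconds)
    return token

def validate(token: str) -> Optional[str]:
    # Return the instance ID if the secret is known and unexpired, else None.
    record = issued.get(token)
    if record is None:
        return None
    instance_id, expiry = record
    if time.time() > expiry:
        del issued[token]   # expired secrets stop authorizing "rogue" instances
        return None
    return instance_id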

Of course, someone must be responsible for generating and injecting these secrets at runtime, and this is no small responsibility. The system carrying this responsibility is ultimately the system that is authorizing the instance to run. As such, it makes sense for this responsibility to fall in the hands of the deployment system, since it already carries similar responsibility.

It doesn’t take much thought to realize the power of a system which can create and (potentially) retrieve secrets. With great power comes great responsibility. If allowing an autonomous system to generate and distribute secrets comes with too much risk for your organization, you might consider including a human at this step. Ideally, this would manifest as a human-approved deployment in which a TOTP or other authenticating code is provided. This code will, in turn, be used to authorize the creation/retrieval of the secrets by the deployment system.

Runtime Security

Trusting that an application instance is authorized/sanctioned is only one half of the concern. There is also the need to validate that it can run safely and securely through its lifecycle. We know how to deploy an application securely, and validate that its deployment is authorized, but will it remain an authorized and trustworthy deployment for the entirety of its life?

There are many vectors which can compromise perfectly authorized application instances, and it might be no surprise to learn that these are the most commonly used vectors. For instance, it is typically easier to corrupt an existing government agent than it is to masquerade as one or attempt to become one. For this reason, individuals with outstanding debt are commonly denied security clearance. They might be fully trusted at the time they are granted clearance, but how susceptible are they to bribery if they are in debt? Can they be trusted in this case?

Secure Coding Practices

Most (all?) application-level vulnerabilities start with a latent bug, which an attacker can leverage to coerce the trusted application to perform an undesirable action. Fixing each bug in isolation will result in a game of whack-a-mole, where developers fix one security-impacting bug only to find two more. Truly mitigating this exposure requires a shift in mindset of the application developers to secure coding practices.

Injection attacks, where user-supplied data is crafted to exploit a weakness in an application or related system, commonly occur when user data is not properly validated before being processed. This type of attack is mitigated by introducing several layers of defenses. Application libraries will carefully construct APIs that avoid trusting user-supplied data. Database querying libraries, for example, will provide APIs to allow the programmer to separate the static query from variables that are provided by the user. By instituting a clear separation between logic and data, the potential for injection attacks is greatly reduced.
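A small sketch using Python's built-in sqlite3 module shows the difference; the table and the hostile input are contrived for illustration.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin'), ('bob', 'user')")

user_input = "alice' OR '1'='1"   # hostile input attempting SQL injection

# Unsafe: logic and data are mixed, so the attacker's quotes become part of
# the query and every row comes back.
unsafe = conn.execute(
    "SELECT role FROM users WHERE name = '" + user_input + "'").fetchall()

# Safe: the parameterized API keeps the query static and passes the user's
# data separately, so the hostile string matches nothing.
safe = conn.execute(
    "SELECT role FROM users WHERE name = ?", (user_input,)).fetchall()

print(unsafe)   # [('admin',), ('user',)]
print(safe)     # []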

Having clear APIs can also support automated scans of application software. Security-aware organizations are increasingly running automated analysis tools against their source code to detect and warn application developers of insecure coding practices. These systems warn about using insecure APIs, for example, by highlighting database queries that are constructed using string concatenation instead of the API discussed earlier. Beyond warning about insecure APIs, application logic can be traced to identify missing checks. For example, these tools might confirm that every system transaction includes some authorization check, which mitigates vulnerabilities that allow attackers to reference data that they should not be allowed to access. These examples represent only a handful of the capabilities possessed by code analysis tools.

Proactively identifying known vulnerabilities is useful, but some vulnerabilities are too subtle to deterministically detect. As a result, another mitigation technique in use is fuzzing. This practice sends random data to running applications to detect unexpected errors. These errors, when exposed, are often the sort of weaknesses that attackers use to gain a foothold in the system. Fuzzing can be executed as part of a functional testing suite early in the build pipeline, or even continuously against production infrastructure.

There are entire books written on secure coding practices, some of which are dependent on the type of application being created. Programmers should familiarize themselves with the appropriate practices to improve the security of their applications. Many organizations choose to have security consultants inspect their applications and development practices to identify problems.

Isolation

Isolating deployed applications by constraining the set of resources they can access is important in a zero trust network. Applications have traditionally been executed inside a shared environment, where a user’s applications are running in an execution environment with very few constraints on how those applications can interact. This shared environment creates a large amount of risk should an application be compromised, and presents challenges similar to the perimeter model.

Application isolation seeks to constrain the damage of a potentially compromised application by clearly defining the resources that are available to the application. Isolation will constrain capabilities and resources that the operating system provides:

  • CPU time
  • Memory access
  • Network access
  • Filesystem access
  • System calls

When implemented at its best, every application is given the least amount of access necessary to complete its work. An attacker who compromises a well-constrained application will quickly find that no additional leverage over the larger system is gained. As a result, by isolating applications, the potential damage from a compromised application is greatly reduced. In a multiprocess environment (e.g., a server running several services), other still-safe services are protected from attempts to move laterally on that system.

Application isolation can be accomplished using a number of different technologies:

  • SELinux, AppArmor
  • BSD jails
  • Virtualization/containerization
  • Apple’s App Sandbox
  • Windows’ Isolated Applications

Isolation is generally seen as breaking down into two types: virtualization and shared kernel environments. Virtualization is often considered more secure, since the application is contained inside a virtual hardware environment, which is serviced by a hypervisor outside the VM's execution environment. The clear boundary between the hypervisor and the virtual machine gives virtualization the smaller attack surface of the two.

Shared kernel environments, like those used in containerized or application policy systems, provide some isolation guarantees, but not to the same degree as a fully virtualized system. A shared kernel execution environment uses fewer resources to run the same set of applications, and is therefore gaining favor in cost-conscious organizations. As virtualization tries to address the resource-efficiency problem by providing more direct access to the underlying hardware, its security benefits begin to look more like those of the shared kernel environment. Depending on your threat model, you may choose not to share hardware at all.

Active Monitoring

As with any production system, careful monitoring and logging is of the utmost importance, and is particularly critical in the context of security. Traditional security models focus their attention on external attack vectors. Zero trust networks encourage the same level of rigor for internal activity. Early detection of an attack could be the difference between complete compromise and prevention altogether.

Apart from the general logging of security events throughout the infrastructure such as failed or successful logins, which is considered passive monitoring, there exists an entire class of active monitoring as well. For instance, the fuzzing scans we previously discussed can take time to turn up new vulnerabilities—perhaps more time than you’re willing to spend early on in the release pipeline. An active monitoring strategy advocates that the scans also be run against production, continuously.

Don’t Do That in Production!

Occasionally, the desire to take certain actions in production can be met with resistance for fear of impacting availability or stability of the overall system. Security scans frequently fall into this bucket. In reality, if a security scan can destabilize your system, then there is a greater underlying problem, which might even be a vulnerability in and of itself. Rather than avoiding potentially dangerous scans in production, ask why they might be risky, and work to ensure that they can be run safely by resolving any system deficiencies contributing to the concern.

Of course, fuzzing is just one example. Automated scanning can be a useful tool for ensuring consistent behavior in a system. For example, a database of anticipated listening services could be compared against an automated scan of actual listening services so deviations can be addressed. Not all scanning will result in such clear action, however. Scanning of installed software, for example, will typically be used to drive prioritization of upgrades based on the threats a network is exposed to or expects to see.
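A sketch of the listening-service comparison might be as simple as a set difference between an inventory of expected services and the results of a scan; the hostnames and ports are hypothetical.

# Hypothetical inventory of what each host is expected to expose.
expected = {
    "web-01": {80, 443},
    "db-01": {5432},
}

def review_scan(host: str, observed_ports: set) -> set:
    # Return listening ports that nothing in the inventory authorizes.
    return observed_ports - expected.get(host, set())

print(review_scan("db-01", {5432, 2222}))   # {2222} -- a deviation worth investigating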

Effective system scanning requires multiple types of scanner, each of which inspects the system in a slightly different manner:

  • Fuzzing (e.g., afl-fuzz)
  • Injection scanning (e.g., sqlmap)
  • Network port scanning (e.g., nmap)
  • Common vulnerability scanning (e.g., nessus)

So, what to do when all this monitoring actually discovers something? The answer typically depends on the strength of the signal. Traditionally, suspicious (but not critical) events get dumped into reports and periodically reviewed. This practice is by far the least effective, as it can lead to report fatigue, with reports going unnoticed for weeks at a time. Alternatively, important events can page a human for active investigation. These events have a strong enough signal to warrant waking someone up. In most cases, this is the strongest line of defense.

Applications Monitoring Applications

One novel idea in the context of application security monitoring is the idea that applications participating in a single cluster or service can actively monitor the health of their peers, and gain consensus with others on their sanity. This might manifest itself as TPM quotes, behavioral analysis, and everything in between. By allowing applications to monitor each other, you gain a high signal-to-noise ratio while at the same time distributing the responsibility throughout the infrastructure. This approach most effectively guards against side-channel attacks, or attacks enabled through multi-tenancy, since these vectors are less likely to be shared across the entire cluster.

In highly automated environments, however, a third option opens up: active response. Strong signals that “something is wrong” can trigger automated actions in the infrastructure. This could mean revoking keys belonging to the suspicious instance, booting it out of cluster membership, or even signaling to datacenter management software that the instance should be moved offline and isolated for forensics.

Of course, as with any high-level automation, one can do a lot of damage very quickly when utilizing active responses. It is possible to introduce denial-of-service attacks with such mechanisms, or perhaps more likely, shut down a service as a result of operator error. When designing active response systems, it is important to put a number of fail-safes in place. For instance, an active response that ejects a host from a cluster should not fire if the cluster size is dangerously low. Being thoughtful about building active response limitations such as this goes a long way in ensuring the sanity of the active response process itself.
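A sketch of such a fail-safe is shown below. The minimum cluster size, and the revoke_key and remove_member callables, are assumptions standing in for whatever credential and orchestration systems are actually in place.

MIN_CLUSTER_SIZE = 3   # assumed floor; below this, losing a member risks an outage

def maybe_eject(instance_id: str, cluster_members: list,
                revoke_key, remove_member) -> bool:
    # Only act automatically if the cluster can safely lose a member;
    # otherwise leave the decision to a human.
    if len(cluster_members) <= MIN_CLUSTER_SIZE:
        return False
    revoke_key(instance_id)      # invalidate the suspicious instance's credentials
    remove_member(instance_id)   # remove it from cluster membership
    return True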

Summary

This chapter dove into how applications in a zero trust network are secured. It might seem counterintuitive that a zero trust network needs to be concerned with application security. After all, the network is untrusted, so untrustworthy applications on the network should be expected. However, while the network works to detect and identify malicious application activity, that goal is made impossible if deployed applications are not properly vetted before being authorized to run. As a result, most of this chapter focused on how to securely develop, build, and deploy applications in a zero trust network, and then monitor the running instances to ensure that they stay trustworthy.

The chapter introduced the concept of a trusted application pipeline, which is the mechanism by which software written by trusted developers is transformed into built applications that are then deployed into infrastructure. This pipeline is a highly valuable target for would-be attackers, and so it deserves special attention. We dug into secure source code hosting practices, sound practices for turning source code into trusted artifacts, and securely selecting and distributing those artifacts to downstream consumers. The application pipeline can be visualized as a series of immutable transformations on input from earlier in the pipeline, so we explored how to meet the goals of that pipeline without introducing too much friction in the process.

Human attention is a scarce but important resource in a secure system. With the rate of software releases ever increasing, it's important to mindfully consider when humans are best introduced into the process. We discussed where to put humans in the loop to ensure that the pipeline remains secure.

Once applications are built, the process of securing their continued execution in a production environment shifts a bit. Old trusted applications may in the future become untrusted as vulnerabilities are discovered, so we discussed the importance of an upgrade-only policy when running applications. Secrets management is often a difficult task for security engineers, where changing credentials is often very burdensome. With a smooth credential provisioning process, however, a new opportunity emerges to frequently rotate credentials, using the credentialing process itself as a mechanism for ensuring only authorized applications continue to run in a production environment.

We ended the chapter with a section discussing good application security hygiene. Learning secure coding practices, deploying applications in isolated environments, and then monitoring them aggressively is the final leg in a trustworthy production environment.

With all the components of a zero trust network explored, the next chapter focuses on how network communication itself is secured.
