Chapter 9. Buildpacks and Docker

As we explored in Chapter 8, apps[1] that have been deployed to Cloud Foundry run as containerized processes. Cloud Foundry supports running OCI-compatible container images, such as Docker images, as first-class citizens. It also supports running a standalone app artifact (e.g., a .jar file or Ruby app) deployed “as is,” containerizing apps on the user’s behalf.

Users can deploy a containerized image by using the cf push command. This command, along with additional arguments, is used for deploying both standalone apps and OCI-compatible container images.
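As a minimal sketch of the two deployment styles, the following shell functions wrap cf push for a standalone artifact and for a Docker image. The function names, app name, artifact path, and image reference are placeholders chosen for illustration; only the cf push flags themselves come from the CLI.

```shell
# Deploy a standalone artifact; the platform stages it with a buildpack.
# The app name and artifact path are supplied by the caller.
push_artifact() {
  local app_name="$1" artifact_path="$2"
  cf push "$app_name" -p "$artifact_path"
}

# Deploy a prebuilt Docker image; no buildpack staging is involved.
# The image reference (for example, org/image:tag) is supplied by the caller.
push_docker_image() {
  local app_name="$1" image="$2"
  cf push "$app_name" --docker-image "$image"
}
```

For example, push_artifact my-app target/my-app.jar would push a JAR file, whereas push_docker_image my-app example/my-image:1.0 would push an image from a registry.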

When deploying just a standalone app artifact, Cloud Foundry stages your pushed app by composing a binary artifact known as a droplet. A droplet is an encapsulated version of your app along with all of the required runtime components and app dependencies. Cloud Foundry composes this droplet via its buildpack mechanism. The resulting droplet, combined with a stack (filesystem), is equivalent to a container image. Cloud Foundry runs this droplet in the same way as any other OCI-compatible container image: as an isolated containerized process.

When you push an app, Cloud Foundry automatically detects how to run that app and then invokes the appropriate Application Life-Cycle Binaries (ALBs). The correct set of ALBs is installed on the Cell (the Diego machine) where the app needs to run. For example, if a Docker image is pushed, the Docker ALBs are used. If your app requires further compilation via a buildpack, Cloud Foundry uses the buildpack ALBs and additionally detects which buildpack is required to stage and run the app.

The ability to deploy both individual apps and Docker images allows for flexibility. Companies already extensively using Docker images can deploy their existing Docker images to Cloud Foundry. Companies that want to keep their focus on just their apps can use Cloud Foundry to containerize their app artifacts for them. Although Cloud Foundry supports both approaches, there are trade-offs and benefits to each one. This chapter explores those trade-offs and benefits. It then further explains the nature of buildpacks, including a review of their composition, how they work, and how you can modify them.

Why Buildpacks?

There is an explicit ethos to the buildpack approach. The buildpack model promotes a clean separation of roles between the Platform Operator, who provides the buildpack, and the developer, who produces an app, which then consumes the buildpack in order to run. For example, the Platform Operator defines the language support and the dependency resolution (e.g., which versions of Tomcat or OpenJDK are supported). The developer is then responsible only for providing the app artifacts. This separation of concerns is advantageous to both the developer and the Platform Operator.

Generally speaking, developers should not need to be concerned with building containers or tweaking middleware internals; their focus should be on the business logic of their app. By adopting the use of buildpacks, the developers’ focus shifts from building containers and tinkering with middleware to just providing their app artifact. This shift aims to remove the undifferentiated heavy lifting of container construction in order to promote velocity of the app’s business code.

Containerizing apps on the developer’s behalf also offers additional productivity, security, and operational benefits because you can always build the resulting container image from the same known, vetted, and trusted components. Explicitly, as you move your app artifact between different Cloud Foundry environments (e.g., the app is pushed to different spaces as it progresses through different stages of a CI pipeline), the resulting compiled droplet can and should be staged with the same app, dependencies, and stack. The buildpack configuration facilitates this repeatability. This leaves only the app source code to require additional vulnerability and security scanning on every commit.

An additional feature of buildpacks is increased velocity when patching common vulnerabilities and exposures (CVEs) that affect the app. When runtime CVEs emerge, you can upload updated buildpacks to Cloud Foundry. Rather than updating and then redeploying each container image in turn, the Platform Operator simply updates the buildpack or runtime dependency once. Cloud Foundry can then restage and redeploy each app with the latest updates and patches.

Finally, there is a clear benefit to separating the build and run stages. A droplet is built during the compile phase of the buildpack. Building (staging) the droplet is done on a new container that is then destroyed after the droplet is created and uploaded to a blobstore. To run the app, the compiled droplet is then downloaded into a freshly created container. This separation between staging and running an app means that build tools do not end up alongside the executed droplet, further reducing the attack surface of the running app.

Why Docker?

The Docker image format (along with the standard OCI image format) has gained a huge amount of traction in many companies. There are benefits to encapsulating your app and all of its dependencies into a single executable image that can be moved between different environments. An often-cited benefit is that what you run in development is guaranteed to be the same in your production environment because you are not repeatedly rebuilding the container image per environment.

With Cloud Foundry, every cf push results in a new droplet because you are restaging the application with a buildpack. This means that if your CI pipeline does a cf push to a staging environment and then a cf push to a production environment, you will have created two different droplets albeit with exactly the same components. A single image removes any doubt that development and production apps could have subtle differences. For example, with the buildpack approach, if your buildpack dependency configuration is too broad, there is a small risk that you can update a dependency during the execution of your pipeline resulting in a subtly different droplet. It is easy to mitigate this by locking down the buildpack dependency scope through configuration. Nonetheless, this is an important consideration to be aware of at this point. In the future, Cloud Foundry will allow droplets to be portable across different environments through download/upload mechanisms.

Another benefit of using Docker is that Docker images work well on desktop computers, allowing for a fast “getting started” experience. For comparison, a full Cloud Foundry installation on a local box (bosh-lite) requires BOSH skills and therefore some upfront investment. However, you can obtain a lightweight, easy-to-use, single-tenant version of Cloud Foundry by using PCFDev.

The trade-off of using Docker images is that more responsibility remains with the developer who constructs the image. Additionally, there are more components requiring a package scan on every new code commit. For example, the opaque nature of Docker images makes detecting and patching CVEs significantly more difficult than it is for containers constructed and managed by the platform.

Whatever approach you decide on, by design, Cloud Foundry natively supports it. This leaves the choice down to what best suits your operational requirements.

App and Dependency Security Scanning

Many companies have already invested heavily in app security scanning. This includes the app and all of its external build and runtime dependencies. Therefore, there are some compelling benefits of using buildpacks to create a droplet for you. From a security and operational standpoint, it is likely to be more secure to keep just the app as the unit of deployment and allow the vetted components of the platform to handle all remaining dependencies.

Buildpacks Explained

Buildpacks are an essential component when deploying app artifacts to Cloud Foundry. Essentially, a buildpack is a directory filesystem that provides the following:

  • Detection of an app framework and runtime support

  • App compilation (known as staging), including all the required app dependencies

  • Application execution

You can locate buildpacks remotely; for example, on GitHub, accessed via any Git URL, as in the case of the Java Buildpack. Buildpacks can also reside natively on Cloud Foundry through a process of being packaged and uploaded for offline use.

You can specify additional buildpack metadata, such as the app name, RAM, service-binding information, and environment variables, on the command line or in an app manifest file.

Using an app manifest provides an easy way to handle change control because you can check manifests into source control. You can see a simple example of an app manifest in the spring-music repository:

applications:
- name: spring-music
  memory: 512M
  instances: 1
  random-route: true
  path: build/libs/spring-music.war

Buildpacks typically examine the user-provided artifacts (applications along with their manifest and any CF CLI arguments) to determine the following:

  • Which dependencies should be downloaded

  • How apps and runtime dependencies should be configured

Unlike pushing a Docker image, buildpack-built containerized processes undergo a process known as staging.

Staging

Buildpacks only make one requirement on the filesystem: it must contain a bin directory containing three scripts:

  1. Detect

  2. Compile

  3. Release

Detect, compile, and release are the three life-cycle stages of the buildpack. The three stages are completely independent and it is not possible to pass variables between these scripts.

Collectively, the detect, compile, and release stages are known in Cloud Foundry parlance as staging. Staging happens on a new clean container. The output of staging is a droplet that is uploaded to the Cloud Controller blobstore for later use.
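The following sketch shows how a staging process might drive the three scripts in sequence. It is not Cloud Foundry's actual stager; the function name stage_app is hypothetical, and the argument conventions assumed here (detect and release receive the build directory, compile receives the build directory and a cache directory) follow the classic buildpack script contract.

```shell
# Run the three buildpack scripts in order against an app directory.
# Prints the release script's output (the start-up metadata) on success.
stage_app() {
  local buildpack_dir="$1" build_dir="$2" cache_dir="$3"

  # detect: a zero exit code means "this buildpack can run the app"
  "$buildpack_dir/bin/detect" "$build_dir" >/dev/null || {
    echo "buildpack cannot stage this app" >&2
    return 1
  }

  # compile: modify the build directory until the app is runnable;
  # its progress messages go to stderr so stdout carries only metadata
  "$buildpack_dir/bin/compile" "$build_dir" "$cache_dir" >&2 || return 1

  # release: emit the metadata that says how to start the app
  "$buildpack_dir/bin/release" "$build_dir"
}
```

Because each script is a separate process, nothing set in one script is visible to the next; any state must be written into the build or cache directory.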

You can write buildpacks in any language. For Ruby-based buildpacks, each buildpack script begins with the following shebang line so that the script is run by the Ruby interpreter available in the environment:

#!/usr/bin/env ruby

This line instructs the operating system to run the script with the Ruby interpreter found on the PATH. Alternatively, if you had a buildpack written in Node.js, the script would provide the path to Node.js. For buildpacks written in Bash, you simply invoke the detect, compile, and release scripts in the buildpack’s bin directory, as shown here:

$ bin/detect <build-dir>

The Java Buildpack

Technically, you can make no assumptions about the environment in which the three buildpack scripts will run, and so historically buildpacks were written using Bash. Cloud Foundry ensures that Ruby will be present, thus the Java buildpack (JBP) has deviated from the standard buildpacks to be written in Ruby. There is a difference in philosophy between the JBP and other buildpacks. With other buildpacks such as the Ruby buildpack, things like enterprise proxy servers are handled by setting up a lot of the environment by hand. The JBP is different. It attempts to handle as much of this environment configuration for you via the buildpack components directly.

For the rest of this chapter, we will use the JBP as a great example of how buildpacks work.

Detect

Detect is called only if Cloud Foundry does not know which buildpack to run. Assuming that the user did not specify which buildpack to use at the outset, detect will be the first script to be run. It is invoked when you push an app to Cloud Foundry. Cloud Foundry will iterate through all known buildpacks, based on buildpack ordering, until it finds the first available buildpack that can run the app.

The detect script will not be run if a particular buildpack was specified by the user; for example:

$ cf push <my_app> -b my-buildpack

Detect is required to return very little. Strictly speaking, detect only needs to return an exit code (either 0 or some other nonzero integer). The JBP, however, returns a list of key–value pairs that describe what it is going to do; for example, use Java Version = Open Jdk JRE 1.8.0_111 and Tomcat = 8.0.38, etc.
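To make the contract concrete, here is a minimal sketch of detect logic for a hypothetical static-site buildpack (not the JBP). The index.html check and the printed description are assumptions made for illustration; the only part the platform relies on is the exit code.

```shell
# Sketch of detect logic for a hypothetical static-site buildpack:
# succeed (exit code 0) and describe the framework if index.html exists.
detect() {
  local build_dir="$1"
  if [ -f "$build_dir/index.html" ]; then
    echo "static-site"   # optional human-readable description
    return 0
  fi
  return 1               # any nonzero code means "not my app"
}
# In a real bin/detect script, the last line would be: detect "$1"
```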

System Buildpack Ordering

There are some important considerations when using the detect script. Because the detect script uses the first available buildpack that can run the app, it is important to define the correct ordering of system buildpacks. For example, both the JBP and a TomEE buildpack could run a WAR file. When you use the cf push command without explicitly defining the buildpack, the first buildpack in the buildpack list whose detect script succeeds is used.

For this reason, it is best practice to always explicitly define your desired buildpack and to do so using a manifest that you can check into a source-control repository. With that said, because some users might still rely on the detect script, both the Cloud Foundry operator and user should always pay strict attention to buildpack ordering.
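The effect of ordering can be sketched as a simple loop. This is an illustration of the first-match rule rather than the platform's implementation; select_buildpack is a hypothetical name, and each buildpack is assumed to be a local directory containing an executable bin/detect.

```shell
# Return the first buildpack (in the given order) whose detect succeeds.
# Usage: select_buildpack <build-dir> <buildpack-dir>...
select_buildpack() {
  local build_dir="$1"; shift
  local buildpack_dir
  for buildpack_dir in "$@"; do
    if "$buildpack_dir/bin/detect" "$build_dir" >/dev/null 2>&1; then
      echo "$buildpack_dir"
      return 0
    fi
  done
  return 1   # no buildpack claimed the app
}
```

If two buildpacks both accept a WAR file, swapping their positions in the argument list changes which one is returned, which is exactly why an explicit buildpack choice is the safer practice.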

Compile

Compile is responsible for all modification tasks that are required prior to execution. Compile takes the pushed app and turns it into a state in which it is ready to run. The bin/compile script can move files around, change file contents, delete artifacts, or do anything else required to get the app into a runnable state. For example, in the case of Java, it downloads Java (OpenJDK), Tomcat, JDBC drivers, and so on, and places all of these dependencies in their required location. If compile needs to reconfigure anything, it can reach into your app and, in the example of the JBP, rewrite the Spring configuration to ensure that everything is properly set up. The compile phase is when you would make any other modifications such as additional app-specific load balancer configurations.
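A heavily simplified sketch of compile logic follows. It is not how the JBP is implemented; the dependency name runtime.tar.gz, the .runtime target directory, and the REPO_DIR variable (a local directory standing in for a remote dependency repository) are all assumptions made so that the example is self-contained.

```shell
# Sketch of compile logic: place a runtime dependency inside the app
# directory, using the cache directory to avoid repeated fetches.
# REPO_DIR stands in for a remote dependency repository (assumption).
compile() {
  local build_dir="$1" cache_dir="$2"
  local dep="runtime.tar.gz"

  if [ ! -f "$cache_dir/$dep" ]; then
    echo "-----> Fetching $dep"
    cp "$REPO_DIR/$dep" "$cache_dir/$dep" || return 1
  fi

  echo "-----> Installing runtime into app directory"
  mkdir -p "$build_dir/.runtime"
  tar -xzf "$cache_dir/$dep" -C "$build_dir/.runtime"
}
```

Everything compile leaves in the build directory ends up in the droplet, so the runtime travels with the app rather than being fetched again at start-up.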

The JBP Compile Phase

The workflow for the JBP is subtly different from other buildpacks. Most other buildpacks will accept app source code. The JBP requires a precompiled app at the code level; for example, a JAR or WAR file.

As discussed in “A Marketplace of On-Demand Services”, an app can have various services bound to it; for example, database or app monitoring. The compile script downloads any service agents and puts them in the correct directory. Compile also reconfigures the app to ensure that the correct database is configured. If cf push specifies that a specific service should be used, but the service is not available, the deployment will fail and the app will not be staged.

The JBP has some additional capabilities; for example, the Tomcat configuration contains some Cloud Foundry–specific values to enable sending logging output to the Loggregator’s Doppler component.

Release

The release stage provides the execution command to run the droplet. The release script is part of the droplet and will run as part of staging.

Some buildpacks (such as Java) need to determine how much memory to allocate. The JBP achieves this by using a program that calculates all required settings, such as how big the heap should be. This program can run at the start of the application as opposed to only during the staging process. This flexibility allows for scaling because the program will be invoked every time a new instance is instantiated. cf scale can scale the number of instances and the amount of memory. If you change the amount of RAM, there is no restaging, but the app will be stopped and restarted with the required amount of memory specified, based on the specific memory weightings used.

The rest of the release script sets up variables and the runtime environment. For example, with Java, environment variables such as JAVA_HOME and JAVA_OPTS (see Java Options Framework) are set, and then finally, the release script will invoke any app server scripts such as catalina.sh to start Tomcat.
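As an illustration, release logic can be as small as printing a YAML document that names the start command. The sketch below assumes the classic buildpack convention of a default_process_types entry with a web command; the paths and JVM option shown are placeholders, not the values the JBP actually generates.

```shell
# Sketch of release logic: print start-up metadata as YAML on stdout.
# The start command and paths below are placeholders, not the JBP's own.
release() {
  cat <<'EOF'
---
default_process_types:
  web: JAVA_HOME=$PWD/.runtime JAVA_OPTS="-Xmx384M" $PWD/.tomcat/bin/catalina.sh run
EOF
}
```

The heredoc is quoted so that $PWD is written literally and expanded only when the command runs inside the app container.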

Buildpack Structure

The three aforementioned life-cycle stages (detect, compile, release) are echoed by the code. The code sits in the <buildpack>/lib directory, all the tests sit in <buildpack>/spec, and, in the case of Ruby-based buildpacks, rake is the task executor.

The <buildpack>/config directory contains all the configurations. The components.yml file is the entry point containing a list of all the configurable components.

For example, when looking at the JREs section, we see the following:

jres:
  - "JavaBuildpack::Jre::OpenJdkJRE"
# - "JavaBuildpack::Jre::OracleJRE"
# - "JavaBuildpack::Jre::ZuluJRE"

Note

The Oracle JRE is disabled because you need a license from Oracle to use it.

Here’s the OpenJDK configuration YAML:

jre:
  version: 1.8.0_+
  repository_root: "{default.repository.root}/openjdk/{platform}/{architecture}"
memory_calculator:
  version: 1.+
  repository_root: "{default.repository.root}/memory-calculator/{platform}/{architecture}"
  memory_sizes:
    metaspace: 64m..
    permgen: 64m..
  memory_heuristics:
    heap: 75
    metaspace: 10
    permgen: 10
    stack: 5
    native: 10

This specifies the use of the latest available update of Java 8 (the “+” acts as a wildcard for the update number) and the repository root where the JRE is located. In addition, it contains the required configuration for the memory calculator app, including the memory weightings.[2]
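To give a feel for what the weightings mean, the sketch below splits a container memory limit proportionally using the heap and metaspace values from the configuration above. This is a deliberately simplified model and an assumption on our part: the real memory calculator applies additional rules (for example, for stack and native memory) that are not reproduced here, and reading “64m..” as a lower bound of 64 MB is likewise an interpretation.

```shell
# Split a container memory limit (in MB) by percentage weightings.
# Weightings mirror the sample configuration; the real memory
# calculator applies further rules that this sketch omits.
memory_settings() {
  local total_mb="$1"
  local heap_weight=75 metaspace_weight=10 metaspace_min_mb=64
  local heap_mb=$(( total_mb * heap_weight / 100 ))
  local metaspace_mb=$(( total_mb * metaspace_weight / 100 ))

  # Treat "64m.." as a lower bound on the metaspace size (assumption).
  if [ "$metaspace_mb" -lt "$metaspace_min_mb" ]; then
    metaspace_mb="$metaspace_min_mb"
  fi

  echo "-Xmx${heap_mb}M -XX:MaxMetaspaceSize=${metaspace_mb}M"
}
```

With a 512 MB limit, the heap receives 384 MB, and the metaspace share (51 MB) is raised to the 64 MB minimum.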

Modifying Buildpacks

Cloud Foundry ships with a set of default built-in system buildpacks. To view the current list of built-in system buildpacks, run the cf buildpacks command via the Cloud Foundry CLI.

If some of the buildpacks require adjustment, in some cases you can override specific configuration settings.

If your app uses a language or framework that the Cloud Foundry system buildpacks do not support, you can write your own buildpack or further customize an existing buildpack. This is a valuable extension point. Operators can, however, choose to disable custom buildpacks in an entire Cloud Foundry deployment if there is a desire for uniformity of supported languages and runtime configuration.

After you have created or customized your new buildpack, you can consume the new buildpack by doing either of the following:

  • Specifying the URL of the new repository when pushing Cloud Foundry apps

  • Packaging and uploading the new buildpack to Cloud Foundry, making it available alongside the existing system buildpacks

For more information on adding a buildpack to Cloud Foundry, go to the Cloud Foundry documentation page.
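The two consumption options can be sketched as thin wrappers around the CLI. The function names are hypothetical, and the buildpack name, Git URL, zip path, and position in the usage note are placeholders.

```shell
# Option 1: reference a buildpack by URL at push time.
push_with_remote_buildpack() {
  local app_name="$1" buildpack_url="$2"
  cf push "$app_name" -b "$buildpack_url"
}

# Option 2: upload a packaged buildpack so that it sits alongside the
# system buildpacks at the given position in the detection order.
install_buildpack() {
  local buildpack_name="$1" zip_path="$2" position="$3"
  cf create-buildpack "$buildpack_name" "$zip_path" "$position"
}
```

For example, install_buildpack my-buildpack ./my-buildpack.zip 1 would place the new buildpack first in the detection order. Uploading a buildpack typically requires administrator privileges.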

Overriding Buildpacks

If you only need to change configuration values such as, in the case of Java, the default version of Java or the Java memory weightings, you can override the buildpack configuration by using environment variables. The name of the overriding environment variable must match the name of the configuration file that you want to override (uppercased and without the .yml extension), prefixed with JBP_CONFIG_. The value of the environment variable should be valid inline YAML.

As an example, to change the default version of Java to 7 and adjust the memory heuristics, you can apply the following environment variable to the app:

$ cf set-env my-application JBP_CONFIG_OPEN_JDK_JRE \
    '[jre: {version: 1.7.0_+}, memory_calculator: {memory_heuristics: {heap: 85, stack: 10}}]'
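The naming rule can be expressed as a small helper. The function name jbp_config_var is hypothetical; the mapping it performs (drop the .yml extension, uppercase the rest, add the JBP_CONFIG_ prefix) matches the variable name used in the example above for the open_jdk_jre.yml configuration file.

```shell
# Derive the override variable name from a JBP config file name,
# e.g. open_jdk_jre.yml -> JBP_CONFIG_OPEN_JDK_JRE.
jbp_config_var() {
  local file_name="$1"
  local base="${file_name%.yml}"   # drop the extension
  echo "JBP_CONFIG_$(printf '%s' "$base" | tr '[:lower:]' '[:upper:]')"
}
```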

If the key or value contains a special character such as “:”, you will need to escape it by using double quotes. Here is an example showing how to change the default repository path for the buildpack:

$ cf set-env my-application JBP_CONFIG_REPOSITORY \
    '[ default_repository_root: "http://repo.example.io" ]'

You cannot apply a new configuration using this process: you can only override an existing configuration. All new configurations require buildpack modifications, as discussed in “Modifying Buildpacks”.

The ability to override any configuration in a buildpack configuration (.yml) file has made simple configuration changes to the JBP very straightforward. You can specify environment variables either on the command line or in an app manifest file. You can find more detailed advice on extending the JBP at https://github.com/cloudfoundry/java-buildpack/blob/master/docs/extending.md.
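As a sketch of the manifest route, the following function writes a manifest that carries a configuration override in its env block, so the override can be checked into source control with the app. The function name, app name, and artifact path are illustrative; the override value reuses the Java version example from this section.

```shell
# Write a manifest that carries the override alongside the app settings.
# App name, path, and values are illustrative only.
write_manifest() {
  local manifest_path="$1"
  cat > "$manifest_path" <<'EOF'
applications:
- name: my-application
  memory: 512M
  path: build/libs/my-application.war
  env:
    JBP_CONFIG_OPEN_JDK_JRE: '[jre: {version: 1.7.0_+}]'
EOF
}
```

Running write_manifest manifest.yml followed by cf push from the same directory would apply the override at staging time.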

Using Custom or Community Buildpacks

It is worth noting that the Cloud Foundry community provides additional external community buildpacks for use with Cloud Foundry.

Tip

You can find a complete list of community buildpacks at https://github.com/cloudfoundry-community/cf-docs-contrib/wiki/Buildpacks.

Forking Buildpacks

You might have a requirement to extend or modify the buildpack; for example, maybe you need to add an additional custom monitoring agent. The buildpack feature supports modification and extension through the use of the Git repository forking functionality to create a copy of the buildpack repository. This involves making any required changes in your copy of the repository. When forking a buildpack, it is recommended you synchronize subsequent commits from upstream.

Best practice is that if the modifications are generally applicable to the Cloud Foundry community, you should submit the changes back to Cloud Foundry via a pull request.

Restaging

After the first cf push, both app files and compiled droplets are retained in the Cloud Controller’s blobstore. When you use cf scale to scale your app, Cloud Foundry uses the existing droplet.

From time to time, you might want to restage your app in order to recompile your droplet; for example, you might want to pick up a new environment variable or a new app dependency.

The droplet that results from a restage will be completely new. Restage reruns the buildpack against the existing app files (source, JAR, WAR, etc.) stored in the Cloud Controller blobstore. The restage process picks up all new buildpack updates and any new runtime dependencies that the buildpack can accept. For example, if you, the Platform Operator, have specified the use of Java 8 or above (jre version: 1.8.0_+), the latest available version of Java 8 will be selected. If a specific version of Java was specified (such as jre version: 1.8.0_111), only that version will be selected.
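The difference between a wildcard and a pinned version can be sketched as follows. This is an illustration of the selection rule, not the JBP's resolver; resolve_version is a hypothetical name, the candidate versions are passed in by the caller, and the sketch relies on sort -V (version sort), which is available in GNU coreutils.

```shell
# Resolve a version pattern against the versions a repository offers.
# A trailing "+" matches any suffix and the highest match wins;
# a pattern without "+" must match exactly.
resolve_version() {
  local pattern="$1"; shift
  local prefix="${pattern%+}"
  local v best

  if [ "$prefix" = "$pattern" ]; then
    # No wildcard: only an exact match is acceptable.
    for v in "$@"; do
      if [ "$v" = "$pattern" ]; then
        echo "$v"
        return 0
      fi
    done
    return 1
  fi

  # Wildcard: keep candidates that start with the prefix, pick the newest.
  best=$(for v in "$@"; do
           case "$v" in "$prefix"*) echo "$v" ;; esac
         done | sort -V | tail -n 1)
  [ -n "$best" ] && echo "$best"
}
```

Given the candidates 1.7.0_80, 1.8.0_91, and 1.8.0_111, the pattern 1.8.0_+ selects 1.8.0_111, whereas the pinned pattern 1.8.0_91 selects only that version, even after newer updates are published.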

As with cf push, cf restage runs whatever buildpack is associated with the app. By default, you do not need to specify a buildpack; the platform will run each buildpack’s detect script in the order of the system buildpacks (as listed by the cf buildpacks command). Alternatively, you can explicitly specify a buildpack name or URL. Diego Cells support both .git and .zip buildpack URLs.

Packaging and Dependencies

There are different approaches that you can take when accessing a buildpack and its dependencies. The approach you choose is determined by two concerns:

  • How you access the buildpack

  • How you access the buildpack dependencies

You can access the buildpack either remotely, via a Git URL, or by packaging and uploading it to Cloud Foundry. The buildpack can then either fetch its dependencies remotely (often referred to as online or remote dependencies) or use dependencies that were packaged along with it (referred to as offline dependencies).

Given these considerations, there are three standard approaches to consuming buildpacks:

Online

You access this via a Git URL with both buildpack and dependencies being pulled from a remote repository.

Minimal-package

This is a packaged version of the buildpack that is as minimal as possible. The buildpack is uploaded to Cloud Foundry’s blobstore, but it is configured to connect to the network for all dependencies. This package is about 50 KB in size.

Offline-package

This version of the buildpack is designed to run without network access. It packages the latest version of each dependency (as configured in the config directory) and disables remote_downloads. This package is about 180 MB in size.

Cloud Foundry deployments residing within an enterprise often have limited access to dependencies due to corporate regulations. Therefore, the second or third options are generally established within an enterprise setting.

With all three approaches, it is recommended that you maintain a local mirror of the buildpack dependencies hosted by Cloud Foundry on Amazon S3. To clone this repository, follow the instructions at the Cloud Foundry GitHub repo. With technologies such as Artifactory, you can set up a pull-through model that watches the source blobstore and pulls down any updates onto your local mirror for internal use only. This approach allows the security team to package-scan all dependencies and provides the Platform Operators with a level of governance over what dependencies can be consumed. It also makes it possible for you to run Cloud Foundry without requiring internet access.

Thus, the decision criterion for these options is one of flexibility versus governance.

The offline-package approach allows for complete control over the buildpack and dependencies. Packaging provides an explicit guarantee of known, vetted, and trusted dependencies used to deploy the app. The downside is that you will need an additional CI pipeline (see the section that follows) to build and upload any buildpack and dependency changes.

The advantage of the online approach is that it provides the flexibility to make changes to both the buildpack and the consumption of dependencies without the need to run the changes through a pipeline. You can mitigate concerns surrounding control by strict governance of the mirrored dependency repository. For example, if you want to disable the use of Java 7, you can simply remove it from the repository and update the buildpack accordingly. If a developer then reconfigures a custom buildpack to use Java 7, the deployment will fail. Online buildpacks provide the most flexibility, but without the proper governance, this approach can lead to buildpack sprawl. This governance is managed in a production environment through the use of a pipeline to deploy apps. Buildpack sprawl during development is not a bad thing, provided developers keep in mind the available buildpack options and availability of app dependencies in the production environment.

Buildpack and Dependency Pipelines

When using the JBP, the approach of using a local mirror for dependencies involves forking the buildpack and updating the dependency repository. Keeping this mirror and forked buildpack up-to-date and synchronized with the online buildpack and latest dependencies is vital to ensure that you are guarding against the latest CVEs. To this end, it is prudent to set up a pipeline that keeps the buildpack current.

You can set up the dependency pipeline flow as follows:

  • Trigger weekly updates from an RSS feed of CVEs that pertain to java_buildpack dependencies (or trigger an update as soon as a patch is added to the Amazon S3 repository, via a pull-down mechanism)

  • Use scripts to pull down pertinent items from Cloud Foundry’s Amazon S3 buckets (see the Cloud Foundry GitHub repo)

  • Push all new dependencies to the local mirror repository, removing any outdated or compromised dependencies
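The final step of the dependency flow might look like the following sketch. It models both the upstream source and the local mirror as plain directories and treats anything no longer present upstream as outdated; both simplifications are assumptions, and a real pipeline would talk to the blobstore and to a repository manager such as Artifactory instead.

```shell
# One-way sync of dependency files from an upstream directory into a
# local mirror: copy anything new, delete anything upstream has dropped.
# Both arguments are plain directories in this sketch.
sync_mirror() {
  local upstream="$1" mirror="$2" f name
  mkdir -p "$mirror"

  for f in "$upstream"/*; do
    [ -f "$f" ] || continue
    name=$(basename "$f")
    if [ ! -f "$mirror/$name" ]; then
      cp "$f" "$mirror/$name" && echo "added $name"
    fi
  done

  for f in "$mirror"/*; do
    [ -f "$f" ] || continue
    name=$(basename "$f")
    if [ ! -f "$upstream/$name" ]; then
      rm "$f" && echo "removed $name"
    fi
  done
}
```

The printed "added" and "removed" lines give the security team a simple audit trail of what changed in the mirror on each run.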

You can set up the buildpack pipeline flow for packaging the java_buildpack as follows:

  • Git clone and pull down the latest version of the java_buildpack repository

  • Override the dependency repository to point to your local buildpack repository (e.g., an Artifactory repository)

  • Push the online buildpack to Cloud Foundry

  • Build offline buildpack and push offline buildpack to Cloud Foundry

  • Restage all affected apps
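The last two steps of the buildpack flow can be sketched as one function. It assumes that an earlier pipeline step has already produced the packaged buildpack zip, that the system buildpack is named java_buildpack, and that the CLI is targeted at the org and space containing the affected apps; the function name and the app names in the usage note are placeholders.

```shell
# Upload a freshly packaged buildpack, then restage the affected apps
# so that their droplets are rebuilt with the updated dependencies.
# Usage: publish_buildpack <buildpack-zip> <app-name>...
publish_buildpack() {
  local zip_path="$1"; shift
  cf update-buildpack java_buildpack -p "$zip_path" || return 1

  local app
  for app in "$@"; do
    cf restage "$app" || return 1
  done
}
```

For example, publish_buildpack build/java-buildpack-offline.zip app-a app-b updates the buildpack once and then restages the two apps in turn, stopping at the first failure.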

Note that even if you are using the online-buildpack approach, it is still valuable to have the offline buildpack available in case there is any downtime of your local repository.

Summary

Cloud Foundry supports pushing Docker images and standalone apps and tasks. Upon pushing a standalone app or task, Cloud Foundry uses a buildpack to containerize and run your app artifact. Buildpacks are a vital component in the deployment chain because they are responsible for transforming deployed code into a droplet, which can then be combined with a stack and executed on a Diego Cell in a container.

Buildpacks enable Cloud Foundry to be truly polyglot and provide an essential extension point to Cloud Foundry users. They also facilitate a separation of concerns between operators who provide the language support and runtime dependencies, and developers who are then free to keep their focus on just their app code.

Whatever approach you choose, be it deploying a standalone app or a Docker image, Cloud Foundry supports both as first-class citizens.

[1] Apps in this context refers both to Long-Running Processes and tasks. Conceptually, a task is simply a short-lived application with finite characteristics and is guaranteed to run at most once.

[2] The use of memory weightings for the memory calculator is in the process of being simplified.
