Chapter 12. BOSH Deployments

This chapter provides a brief overview of BOSH deployments, including Cloud Foundry’s deployment manifest. BOSH deploys software to infrastructure (often an IaaS layer) using a deployment manifest, one or more stemcells, and one or more releases.

As described in Chapter 11, releases exist to package all of your required artifacts. Releases are then deployed to one or more machines (often VMs) known as instance groups, via a deployment manifest. A deployment manifest describes how to install releases. It defines how the various processes (release jobs) that have been packaged by releases should be distributed and deployed to the various instance groups.

There is no direct correlation between release contents and the processes that run on an instance group. This correlation is explicitly defined in the deployment manifest. Releases are always IaaS-agnostic. A release coupled with a deployment manifest can also be IaaS-agnostic if written correctly. As of this writing, only stemcells remain explicitly IaaS format–specific. IaaS format–specific means that certain stemcells can be reused across infrastructures sharing the same IaaS format; for example, OpenStack stemcells can be reused for RackHD, and vSphere stemcells can be reused for vCloud Director and VirtualBox. However, where the IaaS format differs, such as between GCP and AWS, separate stemcells that are specific to each IaaS format are required.

YAML Files

Both BOSH and Cloud Foundry make extensive use of YAML files. This includes the following:

  • Manifests for creating a BOSH Director environment

  • Release files that describe release jobs and related blobs

  • Deployment manifests

  • App deployment manifests

  • Cloud configuration IaaS information

To read manifests, you need to understand basic YAML syntax. YAML is a human-readable data serialization standard for programming languages. It is generally considered easier for humans to read and write than other common data formats such as JSON or XML.

You can edit YAML with a regular text editor. If you want to dig deeper into YAML, you can find further information at http://yaml.org.

Understanding YAML Syntax

BOSH deployment manifests are currently by far the most complex aspect of Cloud Foundry. The recent work on BOSH 2.0 has made great strides toward simplifying the deployment manifest. However, for cloud operators, understanding how manifests are constructed to relay correct information to BOSH is vital. This section uses a snippet from the Cloud Foundry deployment manifest to walk you through the salient points of the YAML structure widely used in deployment manifests.

Many YAML files make extensive use of lists. Lists can contain individual items or hashes. A hash is represented in a simple key: value form (the colon must be followed by a space). A simple hash could contain a key of “name,” with the value “java_buildpack.” Most lists in Cloud Foundry contain hashes. All members of a list are lines beginning at the same indentation level starting with a “- ” (a - and a space):

# A list of buildpacks

    install_buildpacks:
    - name: java_buildpack
      package: buildpack_java
    - name: ruby_buildpack
      package: buildpack_ruby
    - name: nodejs_buildpack
      package: buildpack_nodejs
    - name: go_buildpack
      package: buildpack_go
    - name: python_buildpack
      package: buildpack_python
    - name: php_buildpack
      package: buildpack_php

Hash values can contain further hashes or even lists of hashes. The install_buildpacks element is itself a hash key whose value contains a list of nested hashes. In this case, each item in the list is a hash containing two key-value pairs (name and package). Here’s another, more complex example:

# A list of resource pool hashes: each item is a hash representing a single
# resource pool, uniquely identified by the "name" key

resource_pools:
  - name: small_z1
    cloud_properties:
      instance_type: c3.large
      ephemeral_disk:
        size: 10_240
        type: gp2
      availability_zone: (( meta.zones.z1 ))

  - name: small_z2
    cloud_properties:
      instance_type: c3.large
      ephemeral_disk:
        size: 10_240
        type: gp2
      availability_zone: (( meta.zones.z2 ))

The resource_pools key has a value containing a list of two hashes: small_z1 and small_z2. Let’s explore the first hash element in the list:

  - name: small_z1
    cloud_properties:
      instance_type: c3.large
      ephemeral_disk:
        size: 10_240
        type: gp2
      availability_zone: (( meta.zones.z1 ))

This hash has two keys. The first key, name, maps to a unique value that identifies this resource pool as small_z1. The second key, cloud_properties, maps to a nested hash that contains metadata about the specific resource pool. Aside from readability, the ordering of the name and cloud_properties keys is not important to BOSH. Let’s explore the cloud_properties value:

      instance_type: c3.large
      ephemeral_disk:
        size: 10_240
        type: gp2
      availability_zone: (( meta.zones.z1 ))

This nested hash has three keys: instance_type, ephemeral_disk, and availability_zone. The ephemeral_disk key has a value containing two further keys: size and type.

The remainder of this chapter explores the specifics of deployment manifests.

Deployment Manifests

As discussed, deployments are configured and created based on a YAML file called a deployment manifest (often referred to as simply a manifest). Deployment manifests provide a way to state, as code, an explicit combination of stemcells, release jobs, and operator-specified properties. BOSH deployments are deployed to a specific IaaS.

Cloud Configuration

IaaS-specific BOSH concerns are defined in the Director’s cloud configuration, enabling BOSH deployments to be completely IaaS-agnostic. IaaS-specific concerns such as network information or resource pools (including VM type) are configured once per Director. Cloud configuration was discussed in “Cloud Configuration”.
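To make this concrete, here is a minimal sketch of what a Director’s cloud configuration might look like on AWS. The names (z1, small, default) and the AWS values (availability zone, instance type, subnet range) are illustrative placeholders, not values from a real environment:

azs:
- name: z1
  cloud_properties:
    availability_zone: us-east-1a
vm_types:
- name: small
  cloud_properties:
    instance_type: t2.small
disk_types:
- name: default
  disk_size: 10_240
networks:
- name: default
  type: manual
  subnets:
  - range: 10.0.0.0/24
    gateway: 10.0.0.1
    az: z1
compilation:
  workers: 3
  az: z1
  vm_type: small
  network: default

Instance groups in the deployment manifest then refer to these AZs, VM types, disk types, and networks purely by name, which is what keeps the manifest itself IaaS-agnostic.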

Because all the IaaS configuration is now encapsulated in a separate cloud configuration YAML file, deployment manifests are significantly reduced, defining only configurations specific to the actual deployment. These configurations include the following:

  1. Deployment name

  2. Director universally unique identifier (UUID) (not required with BOSH 2.0)

  3. Stemcells

  4. Releases

  5. Instance groups (VMs)

  6. Properties

  7. Updates
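Putting these sections together, a minimal BOSH 2.0–style deployment manifest might be sketched as follows. All names here are placeholders; the vm_type, network, and AZ names must match entries in the Director’s cloud configuration:

name: example-deployment
releases:
- name: example-release
  version: "1.0"
stemcells:
- alias: default
  os: ubuntu-trusty
  version: "3074"
instance_groups:
- name: example-group
  instances: 1
  azs: [z1]
  jobs:
  - name: example-job
    release: example-release
    properties: {}
  vm_type: small
  stemcell: default
  networks:
  - name: default
update:
  canaries: 1
  max_in_flight: 1
  canary_watch_time: 30000-1200000
  update_watch_time: 5000-1200000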

Director UUID and Deployment Name

The first name in the manifest is the deployment name. This name does not matter. The BOSH Director UUID was required to ensure that Platform Operators install a deployment on the correct Director. For example, imagine that you are targeted on your BOSH production environment, but you want to deploy a development release to your BOSH test environment. If you forget to retarget BOSH to the test environment, the fact that the deployment manifest specifies the BOSH test environment UUID will cause BOSH to prevent the release from being deployed to the production environment. The requirement to specify the UUID in the manifest has been removed in the BOSH CLI v2, which deals with the aforementioned issue by requiring users to explicitly specify the BOSH environment and deployment when deploying; for example:

$ bosh -e prod -d dep deploy dep-manifest.yml

Release Names

With respect to release names and versions (in the deployment manifest’s releases section), naming does matter. If you type bosh releases, you will see the releases uploaded to the Director. The name you specify as release.name must match the name of an uploaded release. Any version number defined here dictates the specific release version to deploy. You can use the release version for tracking deployed software.
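For example, a releases section might look like the following sketch (the release name and version are illustrative):

releases:
- name: example-release
  version: 215.8.0

Specifying an explicit version, rather than latest, keeps deployments repeatable: the same manifest always deploys the same software.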

Stemcell

The stemcell is either picked automatically based on the OS name and version provided:

stemcells:
- alias: default
  os: ubuntu-trusty
  version: 3074

Or, you can specify an explicit stemcell name for an exact stemcell match:

stemcells:
- alias: default
  name: bosh-aws-xen-hvm-ubuntu-trusty-go_agent
  version: 3074

The preferred approach is to select the stemcell based on OS name and version, because this allows the manifest to remain IaaS-agnostic. An error is raised during a deploy if none of the stemcells uploaded to the Director matches the version defined in the deployment manifest.

Instance Groups

The deployment manifest can have one or more instance groups. Instance groups collectively provide the distributed system as a whole. An instance group is a logical unit containing the desired components (release jobs) from releases (such as a database service or a web UI app). Here are the aspects of an instance group:

  • They are defined in the deployment manifest by the Platform Operator.

  • Each is encapsulated on a separate VM.

  • They can have N number of instances.

  • They can comprise one or more release jobs from one or more releases.

  • Each represents a discrete functionality (such as that of the Cloud Controller or the GoRouter).

A deployed instance group exists on N number of machines. Therefore, an instance group can have multiple deployed instances. Instance groups are based on a defined stemcell, and they execute either a long-running service or a short-running BOSH task known as an errand.1 Errands are discussed further in “Errand”.

The operator who writes the deployment manifest decides which instance groups are required for a specific deployment. For example, by default a RabbitMQ deployment has an HAProxy instance group, broker instance group, and server instance group. For a Cloud Foundry deployment, there are numerous different instance groups.

By convention, instance groups are given only a descriptive name. They can then be striped across different AZs through the AZ configuration described in “AZs”. The network, vm_type, and persistent_disk_type are defined in the cloud configuration manifest. You must also specify any additional instance group properties.
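For example, an instance group definition might reference the cloud configuration purely by name, as in this sketch (all names are illustrative and assume matching entries in the cloud configuration):

instance_groups:
- name: router
  instances: 2
  azs: [z1, z2]
  vm_type: small
  networks:
  - name: default

Here the two instances are striped across the z1 and z2 AZs, and small and default are resolved by the Director against its cloud configuration.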

The jobs key defines the list of release jobs to run on an instance group. Each instance group must be backed by the software from one or more BOSH releases. When defining an instance group, the deployment manifest author must specify which releases make up the component. For example, the cf-deployment manifest defines, among other things, a UAA instance group. The UAA instance group currently specifies five release jobs (consul_agent, uaa, route_registrar, metron_agent, and statsd-injector) obtained from four BOSH releases (consul, uaa, routing, and loggregator). Those four BOSH releases each contain packaged software relevant to the UAA:

- name: uaa
  ...
  jobs:
  - name: consul_agent
    release: consul
    properties:
    ...
  - name: uaa
    release: uaa
    properties:
    ...
  - name: route_registrar
    release: routing
    properties:
    ...
  - name: metron_agent
    release: loggregator
    properties:
    ...
  - name: statsd-injector
    release: loggregator
    properties:
    ...

When you BOSH deploy a release (or releases), BOSH creates one or more instance group machines. The BOSH Director communicates with the agent on each instance group machine, and the agent executes the commands defined in a control script. The control script for Linux is currently driven by Monit, an open source process supervision tool (see “Control scripts”).
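As a sketch of what such control configuration looks like, a release job’s Monit file typically declares the process, its pidfile, and its start and stop commands. The job name here is an invented example:

check process example_job
  with pidfile /var/vcap/sys/run/example_job/example_job.pid
  start program "/var/vcap/jobs/example_job/bin/example_job_ctl start"
  stop program "/var/vcap/jobs/example_job/bin/example_job_ctl stop"
  group vcap

Monit then supervises the process, restarting it if it exits unexpectedly.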

The deployed instance group machine contains a directory tree starting at /var/vcap/ and BOSH places release resources (including compiled code), as defined by the deployment manifest, under that directory. BOSH also creates four subdirectories: jobs, packages, src, and blobs. These display on instance group machines as /var/vcap/jobs, /var/vcap/packages, /var/vcap/src, and /var/vcap/blobs, respectively. These directories directly map to the release job directories that were discussed in “Anatomy of a BOSH Release”.

Instance groups define the machine’s network, vm_type, and any required static IPs along with a persistent disk if required. Additionally, release jobs contain property placeholders, discussed in the section that follows. The deployment manifest author must provide any required property placeholder values.

Properties

During a BOSH deployment, the BOSH Director creates a new machine for an instance group, and places the release resources on that machine. During this process, BOSH begins by reviewing the release job’s specification file to see what properties are required. The specification file contains default values that are set for the job’s required properties. The properties section defines the properties that the release job requires, as defined by the release job in its specification file. Regardless of what properties are set in the manifest, they will be injected into the job’s machine only if they are defined in the specification file.
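For illustration, the properties section of a release job’s specification file might look like the following sketch; the job, template, package, and property names are invented:

name: example_job
templates:
  example_job_ctl.erb: bin/example_job_ctl
packages:
- example_package
properties:
  example_job.port:
    description: "Port on which the example job listens"
    default: 8080
  example_job.password:
    description: "Password for the example job's admin endpoint"

A property with a default (such as example_job.port) need not appear in the manifest; a property without one (such as example_job.password) must be supplied by the manifest author.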

Properties can currently be set at the job level, instance group level, and global level, as demonstrated here:

instance_groups:
  - name: postgres
    jobs:
      - properties:
        #job level (top priority)

    properties:
      #instance-group level - overrides global properties; overridden by
      #job-level properties

properties:
  #global level

When BOSH evaluates the release job, job-level properties take ultimate precedence. If no job-level property exists, instance-group-level and global-level properties are used, with instance-group-level properties taking precedence. Instance-group-level and global-level properties are being phased out because they can pollute the manifest with complexity. For example, a global-level property can be accidentally picked up if it is specified only at the global level and not at the job level. Ideally, the manifest should specify job property information as close as possible to where it is needed, namely at the job level. Because nothing gets merged into job-level properties, the properties definition schema remains clean.

BOSH links (discussed in Chapter 10) come into play when you define a property on one job, such as a password, and another job then requires that property. Instead of duplicating the property across the two jobs (and risking configuration errors through repetition), you can use BOSH links to provide this information, as shown in the following example:

instance_groups:
  - name: postgres
    jobs:
      - properties:
        #job level (top priority)
        password: ...

  - name: app
    consumes:
      conn: {from: postgres}

This removes the need to repeat properties in multiple places. BOSH links do not replace properties; they simply aid the consumption of properties between jobs.
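In full BOSH links syntax, the providing job can name the link it exposes and the consuming job names its source. A minimal sketch (job, release, and link names are illustrative) might look like this:

instance_groups:
- name: postgres
  jobs:
  - name: postgres
    release: postgres
    provides:
      conn: {as: db}
- name: app
  jobs:
  - name: web
    release: app
    consumes:
      conn: {from: db}

The link itself (which properties it carries) is declared in each job’s specification file; the manifest only wires the provider and consumer together.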

Update

The update section contains canaries and is important for production environments. Canary instances are instances that are updated before other instances. We use them as a fail-fast mechanism, because any update error in a canary instance will terminate the deployment.

Because only canaries are affected before the update stops, problem packages or release jobs are prevented from taking over all instance groups and derailing the entire deployment. The update section defines the number of canaries to use during an update, including the maximum number of canaries to be running at any period of time:

update:
  canaries: 1
  canary_watch_time: 30000-1200000
  update_watch_time: 5000-1200000
  max_in_flight: 5
  serial: false

In this example, serial: false means that BOSH will deploy all instance groups at the same time. serial: true means that BOSH will deploy a single instance group at a time (the order is based on the order of the instance groups in the deployment manifest) and will not continue to the next instance group until the current one has successfully finished updating. serial: true is safer (and is the default) because dependencies sometimes exist between instance groups (e.g., the Cloud Controller cannot start if the Cloud Controller database has not been started).

Theoretically, a “well-constructed” release should support deploying jobs in parallel. For example, if the Cloud Controller cannot connect to the Cloud Controller database, the Cloud Controller process will initially fail. However, Monit will try to restart the Cloud Controller process again, some seconds later, until eventually it succeeds in connecting to the database.

You can define canary properties both globally and locally. If you specify canary properties in the global update hash, those properties are applied globally to all instance groups. You can further override or add canary properties in an update hash that is local to a specific instance group, as demonstrated here:

update:
  canaries: 1
  canary_watch_time: 30000-1200000
  max_in_flight: 5
  serial: false
  update_watch_time: 5000-1200000
instance_groups:
- name: consul
...
  update:
    max_in_flight: 1
    serial: true

Credentials

Credentials are commonplace in distributed systems. For example, username and password credentials are typically used to protect specific endpoints, such as access to a postgres database or a blobstore, and SSH keys serve as a means of identifying yourself to an SSH server.

BOSH is responsible for generating required credentials. Credential generation is backed by a generic API known as config-server, and is currently implemented by a component known as CredHub. The deployment manifest specifies the places where credentials are required by using a double-parentheses syntax, ((cred-name)). BOSH will then autogenerate and populate those required credentials. Currently, BOSH supports credential generation for the following:

  • RSA key (used for the UAA signing key)

  • SSH keys

  • Certificate

  • Password

BOSH achieves credential generation by using its config-server API. This API makes it possible for you to extend BOSH to generate different types of credentials without the need to modify the BOSH Director. With the introduction of the config-server API, it is now possible to implement automatic credential rotation for all software deployed by the Director. This strengthens Cloud Foundry’s security posture significantly.

To use variables generated by BOSH, the manifest must specify where the variables should be interpolated:

properties:
  password: ((postgres_password))

BOSH generates actual values for these credentials and then inserts the real value in the postgres_password placeholder, just before configuration is sent to individual instances.
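With the BOSH CLI v2, the manifest can also declare the variables to be generated in a top-level variables section; for example (the variable names and certificate options here are illustrative):

variables:
- name: postgres_password
  type: password
- name: default_ca
  type: certificate
  options:
    is_ca: true
    common_name: internalCA

Each variable’s type tells the config-server implementation what kind of credential to generate when the corresponding ((…)) placeholder is encountered.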

BOSH create-env

When you run $ bosh create-env ... to create the Director, the CLI has limited ability to generate initial credentials because the Director is not available. The syntax for specifying credentials in the manifest and through the CLI is the same.

In the case of the Director, because it is built from a stemcell, it requires a password for the BOSH agent and SSH keys so that you can connect to the Director VM over SSH. The Director manifest also requires a Director SSL certificate, a blobstore password, and so on, all generated by the BOSH CLI. In the BOSH Director’s BOSH release repository, you can inspect the bosh.yml deployment manifest and see placeholders throughout the manifest for all the required passwords.
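In this mode, the BOSH CLI v2 can generate the declared variables itself and store them locally. A sketch of the command (the file names are illustrative) might look like this:

$ bosh create-env bosh.yml --state state.json --vars-store creds.yml

The --vars-store flag tells the CLI to generate any declared variables and persist them in creds.yml, while --state tracks the created VM so that subsequent runs update rather than re-create it.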

As of this writing, work is being done to allow BOSH to generate and manage as well as rotate the required credentials for all the releases that BOSH is responsible for deploying.

Summary

BOSH deploys software to the defined IaaS layer by using a deployment manifest, one or more stemcells, and one or more releases:

  • A BOSH deployment consists of a collection of one or more specialized machines known as instance groups.

  • An instance group is an individual component such as the GoRouter or Cell that resides on a dedicated machine.

  • Instance groups are built from undifferentiated stemcells by layering and configuring specified release jobs that are obtained from one or more BOSH releases.

  • The various instance groups and their specified release jobs are defined by a deployment manifest and deployed by the BOSH Director.

  • As implied by the name, each instance group can run several instances.

The Cloud Foundry distributed system as a whole comprises the sum of the deployed instance groups.

1 Note that BOSH errands are different from Cloud Foundry Tasks. Errands are finite tasks run by BOSH on a machine; Cloud Foundry Tasks are run by Diego in a container on a Cell machine.
