Chapter 10. BOSH Concepts

BOSH is a release-engineering tool chain that provides an easy mechanism to version, package, and deploy software. It makes it possible for you to create software deployments that are both reproducible and scalable. This chapter covers BOSH concepts and primitives.

BOSH is an open source project originally developed to deploy Cloud Foundry. It is often overlooked as just another component of Cloud Foundry, but the truth is that BOSH is the bedrock of Cloud Foundry and an amazing piece of the ecosystem.

BOSH is a recursive acronym that stands for BOSH outer shell. The outer shell refers to BOSH being a release tool chain that unifies release-engineering, deployment, and life-cycle management of cloud-based software. To put it simply, the BOSH outer shell runs Cloud Foundry so that Cloud Foundry can run your apps.

Note

For readability, I refer to the targets BOSH deploys to simply as machines. These are typically VMs; however, BOSH can also deploy containers, and in some cases you can use it to configure physical servers.

BOSH can provision and deploy software packages either onto a single machine or at scale over hundreds of machines, with minimal configuration changes. It also performs monitoring, failure recovery, and software updates with zero-to-minimal downtime.

Release Engineering

IT operations are tasked with achieving operational stability. Historically, operational stability was achieved by reducing risk through limiting change. Limiting change is in direct conflict with frequently shipping features. To manage risks involved in frequent software releases, we use release-engineering tool chains. Release engineering involves members of the operations team, who are typically concerned with turning source code into finished software components or products through the following steps:

  • Compilation

  • Versioning

  • Assembly/packaging

  • Deploying

Automating release-engineering concerns through a tool chain reduces deployment risk, allowing for faster deployments with little to no human interaction.

Release engineering is typically concerned with the compilation, assembly, and delivery of source code into finished software components or products. Periodically, these software components require updating and repackaging in order to fix defects and provide additional features. After they are updated, the components might require redeployment over a distributed cluster of servers or repackaging for deployment to third-party servers.

Release-engineering tool chains are essential because they provide consistent repeatability. Source code, third-party components, data, and deployment environments of a software system are integrated and deployed in a repeatable and consistent fashion. Release-engineering tool chains also provide a historical view to track all changes made to the deployed system. This provides the ability to audit and identify all components that comprise a particular release. Security teams can easily track the contents of a particular release and re-create it at will if the need arises. In summary, consistent repeatability de-risks software releases.

Why BOSH?

Teams operating within a DevOps culture typically deploy their updated software to their own production environment after all of their tests have passed. The CI pipeline will often involve a number of different staging and integration environments that are configured similarly to the production environment. This ensures that updates run as expected when they reach production. These staging environments are often complex and time consuming to construct and administer, and there is an ongoing challenge in managing configuration drift to maintain consistency between environments.

Tools such as Chef, Puppet, and SaltStack have become valuable assets to DevOps teams for provisioning new environments. Containerization technologies have further enabled developers to port their entire stack as a single reusable image as it moves through the different CI environments. Other tools have also come into the mix. The goal of versioning, packaging, and deploying software in a reproducible fashion often results in a bespoke integration of a variety of tools and techniques, each providing a solution to an individual part of the stated goal.

BOSH has been designed to be a single tool covering the entire end-to-end set of requirements of release engineering. It has been purposefully constructed to address the four principles of modern release engineering in the following ways:

Identifiability

This is the ability to identify all of the source, tools, environment, and other components that make up a particular release. BOSH achieves identifiability through the concept of a software release. A software release packages up all related artifacts, including source code, binary assets, scripts, and configuration. This enables users to easily track the contents of a particular release. In addition to software releases, BOSH provides a way to capture all dependencies as one image, known as a stemcell.

Reproducibility

This is the ability to integrate source, third-party components, data, and deployment externals of a software system in order to guarantee operational stability. The BOSH tool chain achieves reproducibility through a centralized server, known as the BOSH Director. The BOSH Director manages software releases, OS images, persistent data, and system configuration. It provides a clean and reproducible way of interacting with deployed systems.

Consistency

This encompasses the mission to provide a stable framework for development, deployment, audit, and accountability for software components. BOSH achieves consistency through defined workflows that are used throughout both the development and deployment of the software system. The BOSH Director gives users the ability to view and track all changes made to the deployed system.

Agility

This entails ongoing research into the repercussions of modern software engineering practices on productivity in the software cycle; in other words, CI. The BOSH tool chain achieves agility by providing the ability to easily create automated releases of complex systems. Furthermore, it allows for subsequent updates through simple commands. BOSH integrates well with established current trends of software engineering such as CD and CI technologies.

The Cloud Provider Interface

BOSH supports deploying to multiple IaaS providers including VMware’s vSphere, AWS, GCP, and OpenStack. BOSH achieves its infrastructure-agnostic capabilities by implementing a cloud provider interface (CPI) for each supported IaaS. The CPI contains implementations of the necessary verbs that drive the underlying IaaS layer. These include instructions such as create VM and create disk. You can extend BOSH support to additional infrastructure providers such as RackHD and Apache CloudStack by implementing a CPI for the required infrastructure, as demonstrated in Figure 10-1.
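To give a flavor of the contract, each CPI is a small executable that the Director drives with a JSON request naming the verb and its arguments. The following is a simplified, illustrative sketch of such a request; the authoritative method names and argument lists are defined by the CPI API documentation on bosh.io:

{
  "method": "create_vm",
  "arguments": [
    "agent-id",
    "stemcell-cid",
    { "instance_type": "m3.medium" },
    { "my-network": { "type": "manual", "ip": "10.0.16.10" } }
  ],
  "context": { "director_uuid": "some-director-uuid" }
}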

Figure 10-1. The Cloud Foundry infrastructure stack

Infrastructure as Code

BOSH focuses on defining your infrastructure as a piece of code. The traditional approach of other infrastructure-as-code provisioning tools involves having a preestablished set of servers or VMs that are then used for deploying and updating software. More recently, the traditional infrastructure-as-code tools have been extended with capabilities to preprovision VMs prior to laying down the OS and software components.

The difference between these approaches and BOSH is that BOSH, by design, tries to abstract away the differences between all of the infrastructure platforms (IaaS or physical servers) into a generalized cross-platform description of your deployment. Differences between infrastructures are, where possible, handled by the CPI layer. The BOSH user works with a manifest, and that manifest will, by and large, be the same across the different infrastructures to which the BOSH release will be deployed.

Other powerful provisioning tools such as Terraform or AWS CloudFormation, which also provide configuration to instantiate infrastructure, have not abstracted away infrastructure-specific configuration to the same degree, allowing infrastructure concerns to bubble up for the end user to handle.

With tools like Terraform, you can set up an entire data center from scratch if, for example, you are working on AWS, because Terraform knows you will be using AWS components such as Route 53 and ELB. BOSH comes in at a place where all infrastructure platforms provide certain levels of abstraction for compute, disk storage, and some networking. Given those bare-minimum configurations, BOSH has a common ground for actually creating VMs, attaching disks, and putting them in the correct networks. The BOSH user does not need to worry about concerns such as, “If I am on AWS, which AMI should I choose?”

BOSH does not solve every conceivable use case. For example, BOSH does not try to handle IaaS networking configuration, because this is unique to each IaaS layer. There is no easy way to abstract away these concerns to a point where you are not losing required configuration benefits. Abstraction focuses on the common ground across infrastructure platforms, potentially cutting out features that are unique to a specific platform. Therefore, picking the right level of abstraction—to be feature-rich while portable across infrastructure—is the challenge that BOSH aims to address.

BOSH on the Server

Even though BOSH is primarily designed for deployment to IaaS, there are efforts to allow a hardware provisioning experience through projects such as Open Crowbar and RackHD.

The other key feature of BOSH is its built-in resiliency. Traditional infrastructure-as-code provisioning tools do not check whether services are up and running. BOSH has strong opinions on how to create your release, requiring a monitor definition for every process. If a process dies, Monit (the process monitor used on Linux) will restart it. In addition, the BOSH Resurrector has the ability to re-create failed or unresponsive VMs. Upon re-creation, BOSH can deal with remounting persistent data.
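To illustrate, the Monit control file for a release job typically follows the shape below; my_job and its ctl script are hypothetical names, while the /var/vcap paths reflect BOSH's standard filesystem layout:

# Hypothetical Monit control file for a BOSH release job
check process my_job
  with pidfile /var/vcap/sys/run/my_job/my_job.pid
  start program "/var/vcap/jobs/my_job/bin/ctl start"
  stop program "/var/vcap/jobs/my_job/bin/ctl stop"
  group vcap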

The final key difference is that most other infrastructure-as-code provisioning tools run a series of package-management commands to build and release their software. Usually those commands download packages from the internet. This approach might lead to a nonreproducible environment (the packages have changed in the upstream repository or are no longer available). BOSH again takes an opinionated view to always ensure that every provisioned release is identical. A BOSH release has all dependencies packaged into the release. Therefore, you can deploy an old release again and again, and BOSH should produce the same results every time.

The value of BOSH goes beyond configuration management. It is focused on a seamless experience of delivering software and then ensuring that it remains highly available and resilient. Simply put, BOSH translates intent into action and then maintains that state.

Creating a BOSH Environment

A single BOSH environment consists of the Director VM and any deployments it orchestrates. Before deploying software releases or, more specifically, BOSH releases, we first need to deploy the Director. The Director VM includes all necessary BOSH components that will be used to manage the different IaaS resources such as networks, compute, and disks. You can bootstrap a BOSH Director via the bosh create-env command (the successor to bosh-init). You can find steps for bootstrapping a BOSH environment via bosh create-env on bosh.io.
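As a sketch, bootstrapping a Director on AWS with the bosh-deployment repository looks something like the following; the variable values are illustrative, additional IaaS credential variables (such as access_key_id and subnet_id) are required, and bosh.io remains the authoritative reference:

bosh create-env bosh-deployment/bosh.yml \
  --state=state.json \
  --vars-store=creds.yml \
  -o bosh-deployment/aws/cpi.yml \
  -v director_name=my-bosh \
  -v internal_cidr=10.0.0.0/24 \
  -v internal_gw=10.0.0.1 \
  -v internal_ip=10.0.0.6 \
  -v region=us-west-1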

BOSH Versions

It is recommended that you update your BOSH environment frequently to stay on the latest version. I have seen spurious deployment issues fixed simply by upgrading to a newer version of BOSH, the latest stemcell, and the latest BOSH CLI.

Both BOSH and Cloud Foundry make extensive use of manifests. This includes manifests for creating a BOSH environment, release manifests, deployment manifests, and app manifests for app deployment. Manifests are discussed further in “Understanding YAML Syntax”. It is important to note that the manifest used to create the BOSH Director currently has some differences from manifests used for BOSH deployments. You should refer to bosh.io for a description of those differences.

Single-Node versus Distributed BOSH

Single-node BOSH is a complete BOSH environment that runs on a single machine. Although you can deploy BOSH in a distributed fashion, in reality most deployments (including Cloud Foundry) work fine using a single-node BOSH machine. Even though a single machine is a single point of failure, a single-node BOSH avoids network segmentation faults, and the remaining risk can be mitigated by the IaaS layer. BOSH downtime does not cause downtime of Cloud Foundry or the apps running on it. The only real advantage of a multinode (distributed) BOSH is that it is easier to update, because a single-node BOSH can then be used to upgrade the distributed BOSH.

BOSH Lite

Not everyone requires a full-blown deployment for their BOSH release. For example, if you just want to play around with Cloud Foundry, you might not want to incur the costs of running anywhere from 10 to 30 VMs on AWS. The solution for running an easy-to-configure, lightweight version of Cloud Foundry, or any other release, is BOSH Lite.

BOSH Lite is a prebuilt Vagrant box that includes the BOSH Director. However, instead of using a traditional IaaS CPI such as AWS, BOSH Lite utilizes a CPI for Garden. This means that instead of the CPI creating new VMs, the Garden CPI uses containers to emulate VMs. This makes BOSH Lite an excellent choice for scenarios such as these:

  • General BOSH exploration without investing time and resources to configure an IaaS

  • Getting started with BOSH and Cloud Foundry without incurring significant IaaS costs

  • Development of BOSH releases

  • Testing releases locally or as part of a CI/CD pipeline

BOSH Lite is a great environment for trying out Cloud Foundry, but be mindful that because everything is running on a single VM, it is suitable only for experimentation and not production workloads.
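For illustration, spinning up BOSH Lite has typically looked like the following sketch, assuming the cloudfoundry/bosh-lite Vagrant box and its default Director address (check the project README for current instructions):

git clone https://github.com/cloudfoundry/bosh-lite
cd bosh-lite
vagrant up --provider=virtualbox
bosh target 192.168.50.4 lite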

BOSH Top-Level Primitives

This section covers the BOSH primitives, including the BOSH concepts and components, and how they interact. Before bootstrapping a new BOSH environment, it is important to be aware of the top-level BOSH primitives that underpin a provisioned Cloud Foundry environment.

Primitives

In computing, primitives are the simplest elements available in a programming language.

Following are the top-level BOSH primitives:

  1. Stemcells

  2. Releases

  3. Deployments

Figure 10-2 presents an overview of the top-level BOSH primitives.

Figure 10-2. The three BOSH artifacts (deployment manifest, stemcell, and release) that the BOSH Director uses to create a new deployment (for example, a new Cloud Foundry instance)

Stemcells

In biology, a stem cell is an undifferentiated (or basic) cell that can differentiate into several types of specialized cells throughout the body. A BOSH stemcell follows this same principle; it is a hardened and versioned base OS image wrapped with minimal IaaS-specific packaging. A common example of stemcell hardening is that SSHD, the OpenSSH daemon, has been reconfigured to support only appropriate ciphers, allow only specific users to log in, and enforce time-outs on client connections. Another example is that all compilers have been removed from the stemcell to reduce the attack surface. A typical stemcell contains the following:

  • A bare-minimum OS skeleton with essential preinstalled common utilities

  • Configuration files to securely configure the OS

  • A BOSH agent for communication back to the Director

  • IaaS-specific packaging and versioning

IaaS Specifics

Certain infrastructures have different packaging schemes (e.g., raw, qcow, AMI), requiring stemcells to be packaged in a specific way. For this reason, stemcells are infrastructure-specific. The base OS image, however, is common across all infrastructure types: all installed packages and libraries should be identical across all stemcells of the same version.
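For example, the official Ubuntu Trusty stemcells for different infrastructures share the same base OS and differ only in their IaaS-specific packaging, as their names suggest (version numbers are illustrative):

bosh-aws-xen-hvm-ubuntu-trusty-go_agent       3421.11
bosh-vsphere-esxi-ubuntu-trusty-go_agent      3421.11
bosh-openstack-kvm-ubuntu-trusty-go_agent     3421.11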

All machines created by BOSH are created from stemcells, including the Director machine itself and any machine subsequently created by the Director.

Stemcells provide a clear separation between the base OS and any later-installed software used to specialize the machine. Stemcells are agnostic to what will be installed on them. They do not contain specific or sensitive information relating to the downstream software used to specialize them, so they can be shared at will with other BOSH users. Stemcells are ultimately transformed into specialized VMs in the cluster through BOSH instantiating them and installing additional software on them during a BOSH deployment. This specialized downstream software is referred to as a BOSH release.

This separation of the base OS (stemcell) and specialized downstream software (BOSH release) is a powerful concept. A stemcell captures a specific base OS image; for example, Ubuntu Trusty. Because stemcells of the same version are identical, you can use them across different software deployments and different VM types. This property of generic stemcells makes it possible for BOSH users to effortlessly switch among different infrastructures without being concerned about potential differences between OS images; it is similar in spirit to the JVM’s aim of “write once, run anywhere” for compiled Java code. And because the same base OS is reused everywhere, it becomes easy to apply versioning changes, such as OS security fixes, systematically to every deployed OS image. This benefit should not be underestimated; the ability to rapidly and confidently patch all machines within a distributed system is an extremely powerful capability.1

Stemcells are distributed as tarballs, and the Cloud Foundry BOSH team is responsible for producing and maintaining an official set of stemcells. You can view and download the most recent and currently supported stemcells at bosh.io.

Releases

BOSH deploys software that has been packaged into a self-contained BOSH release. A BOSH release is your software, including all configuration and dependencies required to build and run your software in a reproducible way. A single BOSH release contains one or more pieces of software that are designed to work in unison. Therefore, BOSH becomes an excellent choice for deploying both an individual component and an entire distributed system.

A release is the software layer placed on top of a stemcell to transform it into a specialized component. Conceptually, a release is the packaging of your software (code) that you want to deploy. Releases are IaaS-agnostic, which allows for increased portability of your software across different IaaS environments. The key point here is that your BOSH releases and the subsequent deployment of those releases are not locked into any IaaS layer. This is because IaaS differences are abstracted away by the CPI layer and the Director’s Cloud Configuration (discussed further in “Cloud Configuration”). This decoupling is powerful for your infrastructure strategy. With the rise of viable alternatives to vSphere and AWS, there is a genuine desire for portability across different IaaS offerings. For example, using only a small team, Pivotal moved its hosted version of Cloud Foundry, known as Pivotal Web Services, from AWS to GCP in a matter of weeks with zero app downtime. This move was possible only because the Cloud Foundry software release is not tied into the underlying IaaS layer.

Releases are made up of one or more release jobs, and you can co-locate different releases during deployment. What you choose to actually deploy is not dictated by releases; you have the freedom to select which release jobs you would like to deploy and on which machines those jobs should reside. The anatomy of a release is discussed in “Anatomy of a BOSH Release”. For now, it is important to know that the release jobs you deploy to a VM form a specific component known as an instance group. Examples of instance group components are a Diego Cell and Cloud Foundry’s Cloud Controller. As implied by the name, an instance group can run several instances of the same component; for example, you typically run several Cells within a Cloud Foundry deployment.

As discussed in Chapter 5, Cloud Foundry is encapsulated by the cf-deployment BOSH release, located in the cf-deployment GitHub repository.

Cloud Foundry Release Structure

The Cloud Foundry BOSH release is structured by the parent cf-deployment release, which links to a combination of smaller, independent releases (e.g., etcd-release, UAA-release, and diego-release). The cf-deployment release captures the exact versions of the specific releases necessary to deploy a full Cloud Foundry environment.

BOSH is designed for deploying distributed systems such as Cloud Foundry, but you can use it equally well to deploy smaller individual components such as etcd or redis. Because it can deploy almost any software that can run on the supported infrastructure, it has been used extensively within the Cloud Foundry ecosystem to deploy a wide variety of services such as databases, message brokers, and caching technologies. It is worth noting that many popular services such as RabbitMQ, MySQL, PostgreSQL, and Redis already have existing BOSH releases. Community service brokers are located in the cloudfoundry-community GitHub organization.

This book does not delve into the detail of creating a new BOSH release; however, it is valuable to become familiar with release primitives and how releases are composed in order to understand how cf-deployment is structured and deployed. Understanding BOSH releases will also come in handy if you want to create additional BOSH releases for services that back your apps running on Cloud Foundry. Chapter 11 covers BOSH releases in detail.

Deployments

BOSH deploys software to the IaaS layer using a deployment manifest, one or more stemcells, and one or more releases. This is known in BOSH terminology as a BOSH deployment, which consists of a collection of one or more machines (VMs), as depicted in Figure 10-3. Machines are built from stemcells and then layered and configured with specified components from one or more BOSH releases. A BOSH deployment requires the following:

  • The appropriate stemcell(s) for the IaaS of choice

  • The releases (software) to deploy

  • An IaaS environment (or infrastructure) to which to deploy the release

  • A deployment manifest describing the deployment

Figure 10-3. A BOSH deployment

BOSH creates the deployment using ephemeral resources and persistent disks. The deployment is stable because BOSH can keep your software running by re-creating machines that fail, or restarting failed processes. State stored on the persistent disk (e.g., database data files) can survive when BOSH re-creates a VM because persistent disks can be reattached. As just discussed, deployments are portable across different kinds of cloud infrastructure with minimal changes to the deployment manifest.
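To tie the three primitives together, the following is a minimal sketch of a BOSH 2.0-style deployment manifest. The redis release and version numbers are illustrative, and the vm_type, network, and persistent_disk_type names refer to entries defined in the Director’s cloud configuration (covered in the next section):

name: redis-deployment

releases:
- name: redis
  version: 12

stemcells:
- alias: default
  os: ubuntu-trusty
  version: 3421.11

instance_groups:
- name: redis
  instances: 1
  azs: [z1]
  vm_type: m3.medium
  stemcell: default
  persistent_disk_type: 5GB
  networks:
  - name: my-network
  jobs:
  - name: redis-server
    release: redis

update:
  canaries: 1
  max_in_flight: 1
  canary_watch_time: 30000-180000
  update_watch_time: 30000-180000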

The anatomy of a deployment is discussed in Chapter 12.

BOSH 2.0

There are some great features of BOSH 2.0 that significantly reduce the complexity of deployment manifests. You can use the additional set of BOSH 2.0 features in conjunction with the original BOSH constructs.

Cloud Configuration

Previously, deployment manifests included all of the IaaS-specific resource configurations required for that deployment. To keep BOSH deployments IaaS-agnostic, you can now define the IaaS resource configuration (networks, resource pools, disk pools, compilation) in one location at the Director level. Each BOSH deployment can then reference the predefined IaaS resources by name. You can define the IaaS configuration in an additional manifest such as iaas.yml, as follows:

compilation:
  workers: 6
  network: my-network
  az: z1
  reuse_compilation_vms: true
  vm_type: m3.medium
  vm_extensions:
  - 100GB_ephemeral_disk

azs:
- name: z1
  cloud_properties:
    availability_zone: us-west-1a

networks:
- name: my-network
  type: manual
  subnets:
  - az: z1
    gateway: 10.0.16.1
    range: 10.0.16.0/20
    reserved:
    - 10.0.16.2-10.0.16.3
    - 10.0.31.255
    static:
    - 10.0.31.190-10.0.31.254
    cloud_properties:
      subnet: subnet-XXXXXXXX
      security_groups:
      - sg-XXXXXXXX

vm_types:
- name: m3.medium
  cloud_properties:
    instance_type: m3.medium
    ephemeral_disk:
      size: 1024
      type: gp2

vm_extensions:
- name: 100GB_ephemeral_disk
  cloud_properties:
    ephemeral_disk:
      size: 102400
      type: gp2
- name: router-lb
  cloud_properties:
    elbs:
    - stack-bbl-CFRouter-XXXXXXXX
    security_groups:
    - sg-XXXXXXXA
    - sg-XXXXXXXB
- name: ssh-proxy-lb
  cloud_properties:
    elbs:
    - stack-bbl-CFSSHPro-XXXXXXXX
    security_groups:
    - sg-XXXXXXXA
    - sg-XXXXXXXB

disk_types:
- name: 5GB
  disk_size: 5120
  cloud_properties:
    type: gp2
    encrypted: true

The cloud configuration construct provides the flexibility to stipulate IaaS-specific information once; for example, small, medium, and large vm_types, which can then be used by the various BOSH deployments. The preceding manifest example specifies information about the following:

  • Compilation VM

  • AZs

  • Networks

  • vm_types

  • vm_extensions

  • disk_types

The manifest specifies that the compilation VM resides on an AZ named z1 and a network named my-network. The properties of z1 and my-network are specified in the respective azs and networks hashes. The compilation vm_type is specified as m3.medium with a 100GB_ephemeral_disk vm_extension. Again, the details of m3.medium and 100GB_ephemeral_disk are specified in the respective vm_types and vm_extensions hashes. The vm_extensions hash contains additional hashes that describe the required load balancers, named router-lb and ssh-proxy-lb. Finally, you can specify any other IaaS component in this manifest, such as disk_types. If that description feels complex, don’t worry; we will walk through the specifics of each component in a moment.

Because the entire IaaS configuration is now encapsulated in a separate cloud configuration manifest, the deployment manifest is not only simpler, it’s significantly smaller. In the subsections that follow, we examine each section of the cloud configuration.
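For illustration, assuming the preceding configuration is saved as iaas.yml, you can apply it to the Director and inspect the active version with the v2 CLI (my-env is a placeholder environment alias):

bosh -e my-env update-cloud-config iaas.yml
bosh -e my-env cloud-config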

Networks

A BOSH network is an IaaS-agnostic representation of the networking layer. BOSH expresses networks as a logical view; a BOSH network is the aggregation of all assigned IaaS subnets. In the manifest, the networks section provides an array of networks (IaaS networks and subnets) to be used by deployed jobs.

BOSH Networking and IaaS Networking

The BOSH cloud configuration manifest has only a construct of networks, whereas IaaS networking deals with both networks and subnets. This has two consequences. First, if you run out of IPs, you can simply add additional IaaS subnets to a BOSH network’s subnets array. Second, as discussed in “AZs”, BOSH abstracts away the AZ separation. Some IaaS offerings, such as AWS, do not allow subnets to span AZs. BOSH deals with this by allowing the operator to specify a single BOSH network and then add multiple IaaS subnets from different AZs to that single BOSH network.

Within the cloud configuration manifest, all networking is specified in one global networking section and then shared by multiple deployments. This provides the advantage that created VMs are assigned the same IPs, and it makes it easier to structure subnets so that they do not overlap. Specifying network information on a per-deployment basis was significantly more difficult and error prone, because you had to be mindful of other deployments that might have been deployed to the same underlying IaaS network. Now, BOSH guarantees that IP allocations will not overlap because it keeps all unique IPs in one table. You can therefore even give multiple BOSH networks the same underlying IaaS subnet, and BOSH will still ensure that each new deployment uses only unique IPs.

It is the Director’s responsibility (with the help of the BOSH agent and the IaaS) to configure each instance group’s networks. Networking configuration is usually assigned when the machine is started. You can also apply it when the network configuration changes in a manifest for already-running instance groups. Here is an example of configuring the network:

networks:
- name: my-network
  type: manual
  subnets:
  - az: z1
    gateway: 10.0.16.1
    range: 10.0.16.0/20
    reserved:
    - 10.0.16.2-10.0.16.3
    - 10.0.31.255
    static:
    - 10.0.31.190-10.0.31.254
    cloud_properties:
      subnet: subnet-XXXXXXXX
      security_groups:
      - sg-XXXXXXXX

The reserved range is used for IPs that will be explicitly used by the network (such as the gateway VM or DNS). As such, BOSH will not use any IP in a reserved range for provisioning VMs.

BOSH networking supports automatic and static IP reservation types for manual networks:

Static

The IP is explicitly requested by the Platform Operator in the deployment manifest.

Automatic

The IP is selected automatically based on the network type.

You can use the static IP range for VMs that require a static IP. Components or VMs that you need to know about up front, such as HAProxy or the GoRouter, require a static IP so that you can point your DNS server to the load balancer and then point the load balancer at the GoRouter cluster.
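In the deployment manifest, requesting a static IP is then a matter of picking addresses from the network’s static range. A minimal sketch (the haproxy instance group is illustrative):

instance_groups:
- name: haproxy
  instances: 1
  azs: [z1]
  networks:
  - name: my-network
    static_ips:
    - 10.0.31.190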

If using BOSH links (discussed in “BOSH Links”), static IPs are not required for internal-facing BOSH-deployed components. BOSH links will maintain the IPs for you, and they do not need to be placed in the manifest. External-facing components such as a load balancer will still require a static IP.

There are three different BOSH network types:

  • Dynamic

  • Manual

  • VIP

BOSH abstracts the networking away from the Platform Operator. If you are using a VIP network, you can assign elastic or floating IPs. Manual and dynamic networks allow you to assign private IPs. For a more detailed explanation about how to configure these network types, refer to bosh.io for the latest guidance, as implementation details can change over time.
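As a sketch, the dynamic and VIP types are declared in the cloud configuration alongside the manual network shown earlier; a dynamic network delegates IP assignment to the IaaS, whereas a VIP network holds individually assigned elastic or floating IPs (the cloud_properties values are illustrative):

networks:
- name: dynamic-net
  type: dynamic
  subnets:
  - az: z1
    cloud_properties:
      subnet: subnet-XXXXXXXX
- name: elastic-ips
  type: vip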

VM types

The vm_types section is a collection of VM definitions with specific properties. VMs are created from the stemcell defined in the deployment manifest, and every VM created from the same VM type and stemcell will have the same configuration. The Platform Operator can define an array of VM types, each with different settings, for example, varying CPU and RAM, to offer small, medium, and large VMs to deployments. Individual deployments no longer need to be concerned with what a small VM means, because it is defined once per Director.

Each instance_group instance defined in the deployment manifest will run on a dedicated VM, and so the instance_group must reference the required stemcell and vm_type. This means that each instance group belongs to exactly one vm_type and stemcell combination.
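For example, in a deployment manifest, each instance group references both by name (a sketch; uaa is illustrative):

instance_groups:
- name: uaa
  instances: 2
  vm_type: m3.medium  # refers to a vm_types entry in the cloud configuration
  stemcell: default   # refers to a stemcells alias in the deployment manifest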

In addition, there is a cloud_properties definition with which the Platform Operator can specify other VM characteristics and IaaS-specific settings that might be required when creating VMs; for example, vm_extension and disk_type.

A manifest for bosh-init might use the precursor to vm_types known as resource_pools. Resource pools contain IaaS-specific information about the VMs.

Configuring the disk type

Disks are used for machine storage. Disk types make it possible for you to specify persistent disks for use with the instance group VMs and compilation VMs (see “Compilation VMs”). It is possible to define two different types of storage solutions: ephemeral and persistent disks. This provides you with the flexibility to use a more cost-effective storage solution for the ephemeral storage and high-grade storage for persistent data.
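In the deployment manifest, an instance group opts in to persistent storage either by referencing a disk_types entry by name or by requesting a raw size in megabytes; a sketch:

instance_groups:
- name: etcd
  instances: 1
  persistent_disk_type: 5GB  # references the disk_types entry named 5GB
  # alternatively: persistent_disk: 1024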

Consider three different Cloud Foundry instance groups:

  • API

  • UAA

  • etcd

Each instance group has a VM with two mounted ephemeral disk filesystems:

/dev/sda1 with / mounted
/dev/sdb2 with /var/vcap/data mounted

Instance groups that specify a persistent disk (such as persistent_disk: 1024) will have a third filesystem:

/dev/sdc1        1011416    8136    934688   1% /var/vcap/store

Looking at UAA, which has no persistent disk, you will observe two ephemeral disks with no /var/vcap/store:

df -k
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda1        2886304 1293408   1426564  48% /
none                   4       0         4   0% /sys/fs/cgroup
udev              498216       4    498212   1% /dev
tmpfs             101760     648    101112   1% /run
none                5120       0      5120   0% /run/lock
none              508784       0    508784   0% /run/shm
none              102400       0    102400   0% /run/user
/dev/sdb2        9182512  478144   8214872   6% /var/vcap/data
tmpfs               1024      16      1008   2% /var/vcap/data/sys/run
/dev/loop0        122835    1583    117321   2% /tmp

Looking at etcd, which does have a persistent disk, we see the addition of /var/vcap/store:

df -k
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda1        2886304 1293640   1426332  48% /
none                   4       0         4   0% /sys/fs/cgroup
udev              498216       4    498212   1% /dev
tmpfs             101760     656    101104   1% /run
none                5120       0      5120   0% /run/lock
none              508784       0    508784   0% /run/shm
none              102400       0    102400   0% /run/user
/dev/sdb2        9182512   57524   8635492   1% /var/vcap/data
tmpfs               1024      12      1012   2% /var/vcap/data/sys/run
/dev/loop0        122835    1550    117354   2% /tmp
/dev/sdc1        1011416    8136    934688   1% /var/vcap/store

In addition, one of the CF instance groups might use a blobstore such as a debian_nfs_server in vSphere or an Amazon S3 bucket in AWS.

The API instance group has mounted the filesystem /var/vcap/nfs. Data written to this location actually resides on an NFS blobstore under /var/vcap/store:

bosh_3x78y23hg@25e1aec0-36aa-424b-a70e-eff65b7b5490:~$ df -k
Filesystem                   1K-blocks    Used Available Use% Mounted on
/dev/sda1                      2886304 1296128   1423844  48% /
none                                 4       0         4   0% /sys/fs/cgroup
udev                            498216       4    498212   1% /dev
tmpfs                           101760     648    101112   1% /run
none                              5120       0      5120   0% /run/lock
none                            508784       0    508784   0% /run/shm
none                            102400       0    102400   0% /run/user
/dev/sdb2                      3057452  360844   2521584  13% /var/vcap/data
tmpfs                             1024      12      1012   2% /var/vcap/data/sys/run
/dev/loop0                      122835    1550    117354   2% /tmp
10.129.48.12:/var/vcap/store 103079936 2251776  95568896  3% /var/vcap/nfs
Tip

cf-deployment’s manifest-generation capability will automatically configure the storage for your instance groups. You can reconfigure these after you generate the deployment manifest.

Compilation VMs

Compilation defines the machine settings for the VMs used to compile any individual packages from the release jobs into self-executable binaries, as follows:

compilation:
  workers: 6
  network: my-network
  az: z1
  reuse_compilation_vms: true
  vm_type: m3.medium
  vm_extensions:
  - 100GB_ephemeral_disk

It is generally considered fine to reuse compilation VMs as opposed to using a clean VM for every new compilation. Fewer workers can potentially result in longer deployments because less work might be done in parallel. However, it is worth noting that in the case of AWS, the IaaS might be slow to set up VMs compared to the time it takes to compile packages; thus it might be most performant to have fewer reusable workers.

AZs

AZs provide resiliency. By striping instances from a single instance group across multiple AZs, you ensure availability in the event of a total failure of a single AZ.

To simplify configuration of AZs, they have been pulled out of the deployment manifest into their own section. Previously, to stripe a single instance group across multiple AZs, the Platform Operator had to create multiple resource pools with slightly different cloud properties and multiple instance groups with slightly different names (e.g., web_az1, web_az2). This approach introduced extra complexity and repetition in the deployment manifest.

Since BOSH 2.0, each defined AZ specifies a name and the cloud properties to include any IaaS-specific properties required for placement information. For example, on AWS, AZs are encapsulated by an AWS availability_zone and on vSphere AZs might be a vSphere cluster and resource-pool combination.

The subnets of each network must define the AZ (or AZs) to which they belong. Each instance group can be on one or more networks, and because each instance group can span multiple AZs, there must be a way to describe a network that spans multiple AZs. Most IaaS platforms require a separate subnet per AZ, so BOSH abstracts that away by defining a single network composed of multiple subnets from different AZs.
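With BOSH 2.0, striping an instance group is therefore reduced to listing its AZs (a sketch):

instance_groups:
- name: web
  instances: 4
  azs: [z1, z2]  # BOSH spreads the four instances across both AZs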

For deployments that still use resource pools, the cloud properties should include only VM sizing information such as instance types, RAM, CPUs, and so on. When replacing resource_pools with the newer vm_types, each instance group must also specify the AZs in which it should reside.

Orphaned Disks

Orphaned disks provide an additional safety feature, because losing persistent data is never good. If you require a larger disk size, you can modify this in the cloud configuration manifest (or select a larger disk type from those listed in the cloud configuration); BOSH will detach the old disk and attach the new disk, migrating data across to the new disk in the process. The challenge here is that if you rename your instance group (from nameA to nameB), BOSH sees this as a new instance group, as opposed to an upgraded instance group, and will delete the old instance group without migrating the data on its old disk. To address this valid but often unintended behavior, when deleting a disk, BOSH keeps the old (orphaned) disk around for five days before garbage-collecting it. Therefore, if you accidentally delete a deployment, you can recover its persistent disk after the fact.
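With the v2 CLI you can list orphaned disks and reattach one to a replacement instance; a sketch in which the environment alias, deployment name, and disk CID are placeholders:

bosh -e my-env disks --orphaned
bosh -e my-env -d my-deployment attach-disk nameB/0 disk-cid-xxxx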

Addons

Addons provide the ability for you to add additional releases to BOSH deployments. This feature is extremely valuable for adding new capabilities to an existing or future deployment. Addons are especially powerful because they provide the ability for the operator to define Director-wide policies. For example, additional releases that you might want to “add on” could include a corporate security module or an IPSec BOSH release to encrypt communication between all deployed VMs. As a positive side effect, addons also remove some of the clutter from deployment manifests.

Your BOSH environment might consist of multiple deployments (e.g., Cloud Foundry and additional service brokers), and there might be a requirement to add an additional release job to all deployed VMs. Instead of modifying each deployment manifest to add the release template and then redeploying, BOSH employs the concept of a runtime config in the form of BOSH addons. An addon is a release job that is co-located with existing instance groups. When a new bosh deploy is invoked, BOSH deploys the addons to the defined deployment.
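A runtime config is itself a small YAML manifest applied at the Director level with bosh update-runtime-config. A sketch, in which the security-agent release and job names are hypothetical:

releases:
- name: security-agent
  version: 1.2.0

addons:
- name: security-agent
  jobs:
  - name: security-agent-job
    release: security-agent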

Summary

This chapter covered the concepts of BOSH, the outer shell responsible for provisioning and running Cloud Foundry.

BOSH is a release-engineering tool chain that provides consistent, reproducible, and identifiable deployments. Consistent repeatability de-risks software releases.

BOSH is an essential component of the Cloud Foundry ecosystem, and it is important that you understand the role BOSH plays in both deploying and verifying the health of your Cloud Foundry environment and related services. Specifically, understanding the BOSH top-level primitives of stemcells, releases, and deployments helps you to understand how Cloud Foundry is deployed.

BOSH is complex, and there is a lot to it. For operators who want to gain a deeper understanding of BOSH, the next three chapters examine the anatomy of a BOSH release and deployments, and then we dive deeper into the individual BOSH components and basic BOSH commands.

1 The word “patch” is used indicatively in this context; in reality, machines are not patched—they are re-created from a new patched stemcell. This approach further strengthens BOSH’s intentional security posture.
