Chapter 9. Continuous Delivery and Deployment

Continuous deployment (CD) is a methodology for updating production systems as often as necessary and generally in very small increments on a continuous basis. It would be difficult to understand continuous deployment without discussing continuous integration and delivery. The terminology for CD has been confusing at best, with many thought leaders using the terms continuous delivery and continuous deployment interchangeably. We discussed continuous integration in Chapter 8. Continuous delivery focuses on ensuring that the code baseline is always in a state of readiness to be deployed at any time. With continuous delivery, we may choose to perform a “technical deployment” of code without actually exposing it to the end user, using a technique that has become known as a feature toggle. Continuous deployment differs from continuous delivery in that its focus is on immediate promotion to a production environment, which may be disruptive and a poor choice from a business perspective. Throughout this book we will focus on deploying code as often as desired. The key to successfully implementing CD is focusing on specific goals.
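To make the idea of a feature toggle concrete, here is a minimal sketch in Python. The toggle name `new_checkout`, the `FEATURE_` environment-variable prefix, and the `checkout` function are all hypothetical names chosen for illustration; real teams often keep toggles in a configuration service instead.

```python
import os

# Hypothetical default toggle states; the new code ships "dark" by default.
DEFAULT_TOGGLES = {"new_checkout": False}


def is_enabled(name, toggles=None, environ=None):
    """Return True if the named feature should be exposed to end users."""
    toggles = DEFAULT_TOGGLES if toggles is None else toggles
    environ = os.environ if environ is None else environ
    # Assumed convention: FEATURE_NEW_CHECKOUT=1 overrides the default.
    override = environ.get("FEATURE_" + name.upper())
    if override is not None:
        return override.strip().lower() in ("1", "true", "on", "yes")
    return toggles.get(name, False)


def checkout(cart, environ=None):
    """Both code paths are deployed; the toggle decides which one users see."""
    if is_enabled("new_checkout", environ=environ):
        return "new checkout: %d items" % len(cart)
    return "classic checkout: %d items" % len(cart)
```

With this arrangement the new code path is technically deployed to production but stays hidden until the business decides to switch the toggle on.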

9.1 Goals of Continuous Deployment

The goal of continuous deployment is to reduce human error by deploying code in small increments, thereby reducing risk and ensuring that the team has the capability to deploy code immediately—even in the middle of the workday. This may not be a desirable goal and may even be disruptive to the business. End users often do not want to see changes constantly, even if they are deployed flawlessly and deliver desirable features, because sometimes there is simply too much change, which can be disruptive. Continuous delivery takes a more pragmatic approach by ensuring that the baseline could potentially be deployed at any time and providing a means to deploy features that are not yet exposed to the user. We will discuss these techniques in more detail throughout the rest of this chapter. It is worth noting that continuous delivery and deployment are each focused on deploying to production environments and not just to development test environments, which is more the concern of continuous integration. The DevOps focus, as always, is to deploy using the same procedures to every environment and as early as possible to production-like environments. We also have a goal of ensuring that teams understand and use related terminology consistently.

Next we will discuss why continuous deployment is important.

9.2 Why Is Continuous Deployment Important?

Continuous deployment (and continuous delivery) is important because it provides the framework for significantly improving reliability of application deployments by updating production systems more often and in smaller increments. Business often needs features or bugfixes delivered as soon as possible to stay competitive in today’s demanding environment. By performing deployments more often, we get better at doing them and we make fewer mistakes with each successive effort. Having the capability to update a production system at any time is a huge competitive advantage. Automating the entire process and having a robust framework for deployments also helps improve reliability while ensuring that key business functions can be updated as often as necessary. Continuous deployment is important because it allows us to support the business more effectively by improving reliability while delivering essential features and bugfixes in a timely manner.

9.3 Where Do I Start?

Getting started with continuous deployment (and delivery) requires that you first understand your existing practices. We strongly recommend that you perform a comprehensive assessment of what works well and what could be improved. It is never a good idea to fix things that are not broken, although we have seen some well-meaning folks try to do exactly that when blindly following process improvement guidelines. When we do these assessments, we compare existing practices to the guidance provided in industry-relevant standards and frameworks, which we will discuss further in Chapter 16. The next step is to create a list of things that could be improved, in order of priority. That said, we don’t always start with the highest-priority items. It is helpful to pick a few easy items to address first, which shows your team how to make change happen and especially helps develop a culture that values process improvement and believes that it can happen!

Sadly, many teams are mired in the status quo, adopting a defeatist attitude that sabotages change.

Continuous deployment (and delivery) has an ultimate goal of establishing a fully automated deployment pipeline.

9.4 Establishing the Deployment Pipeline

Manual deployments all too often result in increased risk of human error. Continuous deployment places a strong focus on creating a fully automated deployment pipeline. Teams that cannot master this approach introduce significant risk in the deployment process, including the risk of human error. Further, the impact of delayed features and bugfixes can significantly disrupt the business agility of the firm. Imagine a team providing tax software that cannot roll out a feature on or before April 14 in response to a late-breaking IRS ruling!

Establishing the deployment pipeline is no easy task, and we prefer to take an agile iterative approach when creating the fully automated deployment pipeline. The first step is always to understand existing practices. We like to create checklists and start by scripting each and every step no matter how small or simple. At least half of our code is testing the actual deployment steps themselves.

Even copying a file should be handled with a script verifying and validating that the copy was made successfully.
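As an illustration of scripting even a simple copy with verification, the following Python sketch copies a file and then compares cryptographic digests of the source and the destination. The function names are our own, and SHA-256 is an assumption made for the example rather than a requirement.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_copy(source, destination):
    """Copy source to destination and raise if the copy does not match."""
    source, destination = Path(source), Path(destination)
    expected = sha256_of(source)
    shutil.copy2(source, destination)  # copy2 also preserves timestamps
    actual = sha256_of(destination)
    if actual != expected:
        raise RuntimeError("copy of %s failed verification" % source)
    return actual
```

Returning the digest lets the calling script record it in the deployment log, which supports the traceability discussed next.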

Traceability to document, verify, and communicate that all tasks have been completed is essential.

One very successful approach is rapid incremental deployments.

9.5 Rapid Incremental Deployment

Rapid incremental deployment is an approach where you quickly script and automate each step while performing your deployments. I usually start the first time by creating a checklist and then instead of trying to create a monolithic script that does everything, I create many smaller scripts that break the deployment into smaller chunks. For me, the train is always moving forward, and I rarely get much time to automate my deployments. I usually have to work with a team already in motion, meeting deadlines and ensuring that each release is deployed on time and without error. I take an iterative approach to writing my own deployment scripts, and I try to deliver very quickly. In the beginning, I am often pausing, reading the screen, and hitting Enter over and over again. With each subsequent release, the scripts get more mature and robust as we iteratively create the deployment pipeline.

Embracing change is normal when implementing rapid incremental deployment.

Taking an incremental approach not only helps avoid risk, but it also helps avoid the defeatist attitude where folks believe that they cannot successfully embrace change, such as automating application deployments of systems that are still under construction. Two key concepts should always be remembered. The first is that we want our deployment process to be the same across all environments, from development test to production. Too many teams create highly automated deployments to development test environments, but then jeopardize production by doing something completely different there. The second consideration is that it is extremely important to be deploying to production-like environments from the very beginning of the process. This is where container-based deployments have considerable potential, although some challenges remain to be addressed.

Make sure that you have a repeatable process for creating your containers, whether that be through a build of a declarative file or through an automated script. Using containers should never be an excuse for engaging in practices that lack traceability. Container-based deployments are an evolving capability and will mature and become a key practice within continuous deployment and delivery.

Much of this discussion has been about minimizing risk, which is a key consideration in continuous deployment.

9.6 Minimize Risk

Risk, often unavoidable, is not necessarily negative, especially if it is properly identified and accounted for. Continuous deployment reduces risk, but too often we see colleagues who fail to realize the importance of identifying and focusing on risk management. When we create automated deployment pipelines, we consider which steps have a high risk of human error. By following the manual process through the first time, we often notice that some steps are more error prone than others. Although picking some easy steps to automate is a great place to start, we also want to consider which steps have identifiable technical risk. These steps often involve environmental dependencies and interfaces to other parts of the system. Much of our code is intended to test, verify, and validate that any potential issues are quickly addressed. Technical risk should also be identified and reviewed during the change control process, as discussed in Chapter 10. When managing risk, we suggest that you initially script with the assumption that an operator will be running and reviewing each step. We call this approach attended automation.
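One way to picture attended automation is a small step runner that pauses for the operator before each step. This is only a sketch: the `run_attended` function and its `confirm` and `log` parameters are hypothetical names, and the steps themselves would be your own deployment actions.

```python
def run_attended(steps, confirm=input, log=print):
    """Run (name, action) steps in order, pausing for the operator each time.

    confirm receives a prompt and returns the operator's answer. Pressing
    Enter or answering 'y' continues; any other answer stops the run.
    """
    completed = []
    for name, action in steps:
        answer = confirm("Run step '%s'? [Y/n] " % name).strip().lower()
        if answer not in ("", "y", "yes"):
            log("Stopped by operator before step: %s" % name)
            break
        result = action()
        log("Step '%s' finished with result: %r" % (name, result))
        completed.append(name)
    return completed
```

As the scripts mature, the same steps can be run with a `confirm` function that always answers yes, which turns the attended run into an unattended one without rewriting the steps.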

The most important consideration in addressing risk is moving away from the monolithic deployment to a series of small steps. Many small deployments are inherently safer than a big-bang approach to application deployment.

9.7 Many Small Deployments Better than a Big Bang

The first thing that we do when addressing a troubled deployment process is to convince the team to break the deployment into smaller pieces that can each be deployed separately. Most often, we move from deploying every other month—usually through the weekend—to deploying twice a week in a much shorter deployment window. Obviously this won’t always work, especially if the deployment involves a large infrastructure change, but, in practice, breaking monolithic deployments into many smaller deploys eliminates many sources of error and significantly reduces risk. Even if things do go wrong, it is much easier to identify the problem and either fix the issue or back out the change altogether. In general, many small deployments are much better than a big-bang approach.

Aside from being easier to accomplish, the team just gets better at deploying code and quickly adopts a new and much healthier attitude. Communication styles improve, and everyone gets more confident that they can be successful without making costly mistakes. The change in culture is often readily apparent as the team becomes a cohesive unit. These new capabilities can also help them handle larger deploys when they are unavoidable.

Practicing the deployment procedures in a nonproduction environment is also essential for success.

9.8 Practice the Deploy

Developers frequently choose to handle their own application deployments in the lower environments, and this continues until the application is ready to be promoted to the upper environments. We will discuss a technique that is becoming known as “left-shift” in Chapter 12, whereby operations “shifts left” by getting involved with deployments to development and integration test environments in order to practice deploying the code in production-like environments earlier in the lifecycle. We also find that it is essential to practice the deployment procedures in order to get comfortable with all of the required steps and automation. Often technology professionals fail to realize the importance of practicing the deployment in order to train effectively and gain the confidence necessary to be successful. Imagine a pilot who never practiced on a flight simulator before taking you on your next six-hour flight! Similarly, we find that many groups only have one deployment engineer and then they are susceptible to keyman risk. We saw one team make huge progress with their application deployment capabilities, only to suffer a major outage when one member of the team went on vacation. It is essential to ensure that you have sufficient resources who are each very familiar and comfortable with the tools and procedures necessary for deployment. This is why ensuring that deployments are repeatable and traceable can be absolutely essential. One key consideration is whether or not your team has a culture of learning and teaching.

Teaching and learning are fundamental skills required to create deployment processes that are repeatable and traceable.

9.9 Repeatable and Traceable

Too often, we see individuals who can perform tough technical tasks, including complex deployments, as one-off activities, but struggle when it comes to creating a repeatable process. We would prefer to have a slightly less fancy approach that is repeatable and fully traceable. We find that developers can do a great job of showing us the way through the deployment process the first time through, but it takes an operations view in order to make it a repeatable process. This is also true for creating the traceability that is often a requirement for compliance, as discussed in Chapter 16. Traceability comes into play during many aspects of the deployment process. Scripts should always be logging their results, including snapshotting environment settings. More importantly, processes will only be reliable and repeatable if they are implemented using a workflow automation tool.
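A minimal sketch of snapshotting environment settings from a deployment script might look like the following. The function names and the choice of fields are assumptions made for illustration; what belongs in a snapshot depends on your platform and compliance requirements.

```python
import json
import os
import platform
import sys
from datetime import datetime, timezone


def snapshot_environment(variable_names, environ=None):
    """Collect a small, JSON-serializable record of the runtime environment."""
    environ = os.environ if environ is None else environ
    return {
        "taken_at": datetime.now(timezone.utc).isoformat(),
        "hostname": platform.node(),
        "platform": platform.platform(),
        "python": sys.version.split()[0],
        # Record only named variables to avoid logging secrets by accident.
        "variables": {name: environ.get(name) for name in variable_names},
    }


def write_snapshot(path, snapshot):
    """Write the snapshot next to the deployment log for later audit."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(snapshot, handle, indent=2, sort_keys=True)
```

Taking one snapshot before the deployment and one after gives you a record that can be compared when something behaves differently than expected.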

9.10 Workflow Automation

Workflow automation tools help the entire team understand their roles and responsibilities, along with their own required tasks and dependencies, while being aware of the outstanding and completed tasks from others that may need to be completed throughout the ALM. We discussed automating the agile ALM in Chapter 7. Workflow automation is essential in the continuous deployment process. First, we need to track the steps of the deployment itself, but then we also need to establish feedback loops to report both successful steps and especially when exceptions occur. Processes themselves may sometimes necessarily be fluid to meet the dynamic requirements of the business and the team.

One key aspect of workflow automation is facilitating the orderly assignment of work, a function best accomplished using principles from Kanban.

9.10.1 Kanban—Push versus Pull

Managing the flow of work can be challenging. We acknowledge a tendency in our practice to welcome any assignment, no matter how large or small. Bob’s grandmother washed down railroad trains by hand, so he inherited a work ethic to never turn down a reasonable task. Unfortunately, hardworking folks with this view often find themselves with more tasks than they can possibly complete in a timely manner. One key to success in these situations is correctly identifying the priorities and communicating status, especially if a task will not be completed on time.

The truth is that there is a cost to having too many balls in the air at any point in time. Kanban enthusiasts call this work-in-progress (WIP). Kanban teaches us to limit WIP and then to pull tasks when we are ready to start them. Unassigned tasks aging in the queue also help us by identifying that we might need to hire more resources.
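The pull model with a WIP limit can be sketched in a few lines of Python. The `KanbanBoard` class below is a hypothetical illustration and not the interface of any particular workflow tool.

```python
from collections import deque


class KanbanBoard:
    """A pull-based queue with a work-in-progress (WIP) limit."""

    def __init__(self, wip_limit):
        self.wip_limit = wip_limit
        self.backlog = deque()
        self.in_progress = []

    def add(self, task):
        """New requests wait in the backlog; nobody is pushed work."""
        self.backlog.append(task)

    def pull(self):
        """Start the oldest task, or return None if at the WIP limit."""
        if len(self.in_progress) >= self.wip_limit or not self.backlog:
            return None
        task = self.backlog.popleft()
        self.in_progress.append(task)
        return task

    def finish(self, task):
        """Completing a task frees capacity for the next pull."""
        self.in_progress.remove(task)
```

A backlog that keeps growing while `pull` keeps returning `None` is the signal, mentioned above, that the team may need more capacity.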

Many workflow automation tools make it easy to manage work using Kanban concepts. Managing incoming requests is essential, as is understanding the ergonomics of deployments.

9.11 Ergonomics of Deployments

Ergonomics is the study of how people work efficiently in their environments. Closely related is the study of human factors, especially identifying the sources of human error. The cockpit of a plane is a great example of ergonomics, as pilot controls are typically designed to reduce the chances of human error. The next time you board a plane, see if you can take a quick peek at the cockpit controls, which provide a considerable amount of information to the pilot and copilot, while their design is intended to ensure that the complex controls are read accurately without the risk of human error. Deployment automation can learn a lot from pilots and other aviation experts. The deployment pipeline should always be designed to avoid any chance of human error. This means that scripts and dashboards provide accurate and timely information in a way that is likely to be interpreted correctly, avoiding mistakes that can potentially lead to failed deployments.

There are many circumstances in which mistakes are simply not acceptable. Mission-critical systems from missile defense to life support systems are a good example of applications that must be updated without any chance of human error.

Sometimes it can be difficult to test all aspects of complex deployments because there is no human interface readily available to facilitate testing. This is where verification and validation (V&V) can sometimes help fill the requirement that quality standards are met.

9.12 Verification and Validation of the Deployment

We see many situations where the usual testing approach can be impractical or even impossible to accomplish. This is where verification and validation can be a practical alternative, sometimes used together with whatever testing approaches are available. For example, we may find that infrastructure configuration changes are difficult to test in production environments. In these circumstances, it may be helpful to use system tools, with ping being a simplistic example, to verify and validate that the machine is at least up and running. This approach usually requires the ability to think creatively about what aspects of the system can be checked using the available tools. Some technologies are built with a RESTful application programming interface (API) that can also be used for this work. Most often this approach is closely related to environment monitoring, which is often overlooked in many organizations. Monitoring the environment is discussed further in Chapter 11.
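In the spirit of the ping example, the sketch below verifies that a service port accepts connections. Note the substitution: it uses a plain TCP connection rather than ICMP ping, because ping typically requires elevated privileges from a script. The function names are hypothetical.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def verify_endpoints(endpoints, timeout=2.0):
    """Check each (host, port) pair and return the ones that failed."""
    return [(h, p) for h, p in endpoints if not port_is_open(h, p, timeout)]
```

An open port shows only that something is listening, so a check like this complements, rather than replaces, whatever functional testing is available.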

It is fundamentally essential for the application deployment to be fully verifiable via a methodology that we call the secure trusted application base.

9.13 Deployment and the Trusted Base

Deployments must be fully verifiable. Very often teams are haphazard with how they handle application deployment. We have seen many teams act as if they are unable to ensure that the deployment was completed successfully and also that any unauthorized changes can be quickly identified and remediated.

Deployments should actually be completely deterministic, with code written to ensure that we are completely certain that the right code has been deployed and that there have been no unauthorized changes due to human error or malicious intent. This is the goal of having a fully trusted application baseline. We approach this effort by starting with code that is correctly baselined in the version control system (identified with a tag, label, and often an immutable changeset). We strongly advocate that the code be independently built by a build-and-release engineer, based upon a locked version label as described in Chapter 6. The deployment pipeline then automates the application deployment and verifies version IDs built into the configuration items (CIs).¹ We also monitor the cryptographic hashes (e.g., MAC, SHA-1, or MD5) to detect unauthorized changes.

1. Configuration items refer to any source code, binary, configuration file, or other artifact that is part of the system being built. Unfortunately, continuous integration adopted the same acronym of CI, which has caused some confusion, especially in organizations that need to use industry standards (e.g., ISO, IEEE) and frameworks (ITILv3) where the acronym CI most often refers to configuration item.

Monitoring baselines can be impractical because so many files, such as your message logs, will change constantly. Our approach is to start early identifying the files that should be monitored on a regular basis. This is a key part of our DevOps-focused approach to security that we call continuous security, which we will discuss further in Chapters 11 and 12. Securing the application baseline requires a great deal of technical knowledge, which must be gathered from the very beginning of the application development process. This is easier said than done, and we find that we have to employ some tactics to identify this technical information, which is often the basis of environment monitoring.
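The idea of monitoring a chosen set of files against a trusted baseline can be sketched as a manifest of digests. The function names are hypothetical, and the example uses SHA-256 as an assumption; it is not meant to describe any specific tool.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_manifest(root, relative_paths):
    """Record digests for the files chosen for monitoring (not logs)."""
    root = Path(root)
    return {rel: file_digest(root / rel) for rel in relative_paths}


def find_drift(root, manifest):
    """Return {path: reason} for monitored files that are missing or changed."""
    root = Path(root)
    drift = {}
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            drift[rel] = "missing"
        elif file_digest(target) != expected:
            drift[rel] = "modified"
    return drift
```

The manifest would be built from the trusted build output and stored where the deployed system cannot alter it; `find_drift` can then run on a schedule as part of environment monitoring.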

Identifying dependencies can be very challenging, especially when development and QA testing environments may not match the target production environment. We sometimes hear that a feature worked fine on the developer’s laptop or in the QA test environment, only to crash in production. Failure to identify potential points that could be problematic obviously introduces considerable risk. This is where deploying to environments that mirror production is essential.

9.14 Deploy to Environments that Mirror Production

Production environments often grow to be complex and robust platforms, which may not even closely resemble what are often limited test sandboxes. This not only creates a challenge for testing, but may also introduce considerable risks during the application deployment. We often see developers implement elaborate continuous integration procedures that completely automate the application build, package, and deployment to a test environment. But then the team does something different for production and things go badly wrong. We feel that the deployment procedures must be identical for each environment and that the environments should closely match production as much as possible. This is not always possible, and where discrepancies exist, they should be identified as risks, with appropriate procedures to ensure that service interruptions do not occur as a result.
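Discrepancies between environments are easier to treat as risks once they are listed explicitly. The following sketch compares two environment descriptions held as dictionaries; the function name and the example settings are hypothetical, and gathering the actual values is left to your own tooling.

```python
def environment_discrepancies(production, candidate):
    """Return {setting: (production_value, candidate_value)} for each difference."""
    differences = {}
    for key in sorted(set(production) | set(candidate)):
        prod_value = production.get(key)
        cand_value = candidate.get(key)
        if prod_value != cand_value:
            # A setting missing on one side is reported with None for that side.
            differences[key] = (prod_value, cand_value)
    return differences


# Hypothetical descriptions of a production and a QA environment.
production = {"os": "linux", "db_version": "12.4", "nodes": 8}
qa = {"os": "linux", "db_version": "11.9", "nodes": 2, "debug": True}
risks = environment_discrepancies(production, qa)
```

Each entry in `risks` is a candidate for the risk register, with a note on how a service interruption will be avoided despite the mismatch.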

Risk is not always bad, and continuous deployment has many points in which risk may need to be identified and appropriate plans created to mitigate it.

9.15 Assess and Manage Risk

Throughout the continuous deployment process, we need to always assess and manage risk, which may not always be avoidable. Risk may be present in the form of sources of human error, as we discussed, or there may be other types of risk based upon the architecture of the system and the technologies chosen for implementation. We find that technical risk is a common problem and goes hand in hand with using the newest technologies. Adopting the latest technology frameworks may give you many capabilities in terms of performance and usability, but being on the bleeding edge is also not without its own drawbacks. In continuous deployment, we may need to constantly adjust our procedures to match the new technology and its evolving requirements.

Technical risk comes with any new technology, and managing it starts with taking the time to learn that technology. Next up is the dress rehearsal and walkthroughs.

9.16 Dress Rehearsal and Walkthroughs

We like to conduct deployment rehearsals and walkthroughs so that every stakeholder fully understands what we are doing and also so that we have an opportunity to review and verify all of our procedures before the actual rollout. The dress rehearsal is basically on-the-job training. But the walkthrough more closely resembles a code review or inspection.

During this effort, we reinforce everyone’s role and responsibilities, along with a well-defined communication protocol. In very large deployments, we usually set up a command center and have designated escalation procedures so that all resources are available in case anything happens that is unexpected or more serious problems occur. We have seen very large banking systems implemented this way, often with a couple of hiccups, but with the right procedures in place to manage the overall process, including incident and problem resolution.

Many other fields also use dress rehearsals and walkthroughs.

In the real world, we often cannot avoid risks and challenges during the deployment process. Dress rehearsals and walkthroughs are an effective methodology for addressing these challenges. In fact, handling the real-world scenario of imperfect deployments is exactly where continuous deployment has the most value.

9.17 Imperfect Deployments

Simple deployments are nice work when you can get them. But we more often find ourselves in demand when the deployment process has some unavoidable wrinkles and challenges. The imperfect deployment is exactly where our approach is most expeditiously applied and returns the most value. We commonly find ourselves dealing with scenarios that are quickly changing, making it difficult to fully automate our procedures. Test environments may not match fully, and manual testing procedures can greatly slow down the verification and validation of the system being deployed. The first time you deploy any technology may involve its own set of challenges, including learning the new technology or deployment procedures themselves.

Frequently we find teams with dysfunctional communication patterns, which must be addressed. Many organizations have poorly defined critical incident management procedures and an immature response to complex problem management. The imperfect world is actually where our approach has the most value and, truthfully, the real world is often an imperfect place. The key is always to assess where you are at the beginning of the process and then measure your success as you iteratively improve it.

Our most important lesson learned is always to have a plan B.

9.18 Always Have a Plan B

For every situation that is not perfect, which is basically every situation we find ourselves in, we always identify the risk as best we can. We try to understand what might go wrong and then we do some contingency planning. In continuous deployment, you cannot always control the runtime environment, technology risk, and related changes in the underlying technology platform. The key to your success is to always create a plan for dealing with any contingencies should they arise. This scenario is much more common than you might imagine. Successful teams establish a rhythm and a protocol, and approach each challenge with a can-do attitude. Having a plan—however brief—for dealing with unforeseen circumstances is essential for handling the real-world challenges that always seem to surface in complex production deployments.
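A plan B can be as simple as a scripted rollback that runs whenever the deployment or its verification fails. The sketch below assumes that you supply three callables of your own (`deploy`, `verify`, and `rollback`); all of the names are hypothetical.

```python
def deploy_with_fallback(deploy, verify, rollback, log=print):
    """Run deploy and then verify; on any failure run rollback and report it."""
    try:
        deploy()
        if not verify():
            raise RuntimeError("post-deployment verification failed")
    except Exception as error:
        log("Deployment failed (%s); executing plan B" % error)
        rollback()
        return "rolled_back"
    log("Deployment verified")
    return "deployed"
```

The rollback itself should be rehearsed like any other deployment step, since a plan B that has never been run is only a hope.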

Having a plan B is helpful, but the real key is leading the team to high performance.

One final consideration that deserves attention is the importance of including, as part of continuous deployment, a testing approach to verify that the deploy itself was successful. In a deployment context, we call this smoke testing.

9.19 Smoke Test

We find many deployment teams do not realize that smoke testing is part of the overall process. Deployments are part of the QA and testing process, and smoke testing is the last step of continuous deployment.

We find it valuable to actively participate in the smoke-testing process.

As noted, smoke testing is the last step in the deployment process. This underscores the reality that quality is everyone’s job, and we view ourselves as playing a key role in the testing effort.
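As a rough illustration, a smoke test can be a short list of named checks run immediately after the deployment. In this sketch the `http_ok` helper and the health-check URL in the usage note are assumptions; substitute whatever quick checks make sense for your application.

```python
import urllib.request


def http_ok(url, timeout=5.0):
    """Return True if a GET to url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; ValueError covers bad URLs.
        return False


def smoke_test(checks):
    """Run (name, check) pairs and return the names of the checks that failed."""
    failed = []
    for name, check in checks:
        try:
            passed = bool(check())
        except Exception:
            passed = False
        if not passed:
            failed.append(name)
    return failed
```

A pipeline might call `smoke_test([("home page", lambda: http_ok("http://app.example.test/health"))])`, where the URL is a placeholder, and treat a non-empty result as a reason to invoke the contingency plan.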

9.20 Conclusion

Continuous deployment and delivery are capabilities that are emerging as an industry best practice. In this book, we will push back against the notion of single pushbutton deploys in favor of completely reliable and secure deployments that can be executed as often as business needs dictate. Closely related is continuous delivery, with its focus on ensuring that we always have a deployable baseline and that features can be rolled out, but hidden, until the business decides that it is the right time to expose them.
