Chapter 6. Build Engineering in the ALM

Build engineering is the discipline of efficiently turning source code into binary executables. Build engineering can be as simple as running a Makefile or Ant script and as complicated as writing a full build framework to support the underlying technology architecture. In our book on configuration management best practices, we discuss build engineering in depth. In this chapter, we discuss build engineering within the context of the agile ALM. I love build engineering and have always found it to be among the most challenging and rewarding roles within configuration management.

This chapter helps you understand the build within the context of application lifecycle management. We discuss essential aspects of automating the application build with particular attention to techniques for creating the trusted application base. We also discuss baselining, compile dependencies, and embedding version IDs as required for version identification. We discuss the independent build and creating a fully automated build process. Building quality into the build through automated unit tests, code scans, and instrumenting the code is an important part of this effort. Finally, we will discuss the ever-challenging task of selecting and implementing the right build tools.

6.1 Goals of Build Engineering

The goal of build engineering is to be able to reliably compile and link your source code into a binary executable in the shortest possible time. Build engineering includes identifying the exact compile and runtime dependencies, as well as any other specific technical requirements, including compiler (linker and managed environment) switches and dependencies. Build engineering improves both quality and productivity for the entire team. We believe that the build engineering team should consider themselves to be a service function, with the development team as their primary customers. However, there are times when build engineering must also have the authority to enforce organizational policies. As build engineers, we provide a service to support the development effort, but our primary goal is to help secure the assets of the firm that are built and released through the build engineering function.

6.2 Why Is Build Engineering Important?

Build engineering helps the development team by providing an accurate and repeatable way to compile and link the code in the fastest possible way. Being able to rapidly rebuild a release enhances productivity by facilitating software development. Fast builds are important for any software development methodology, a fact that agile and iterative development have highlighted for some time now. Getting the build right also avoids serious problems that could potentially have catastrophic impacts upon the development team as well as the entire organization.¹ Build engineering is important because it can improve both the quality of the application that you are developing and the productivity of the entire organization involved.

1. Bob describes an incident in which New York Stock Exchange systems crashed, affecting the world economy, in Section 6.8.

6.3 Where Do I Start?

We always start by looking at the existing development build procedures. Sometimes you will find that the development team already has existing build scripts, perhaps using Ant, Maven, or Make. Often, the existing build procedures will only handle deployment to the development test environment. It is pretty common for a build engineer to be required to take an existing build script and modify it to support QA and production environments. Legacy build scripts may also fail frequently and require developer expertise in order to support them. If you are the build engineer, then your job will be to make these scripts more reliable and supportable. We have often found that it is best to begin by understanding the application so that we understand what we are trying to build. Sometimes the architecture will be complicated enough that you may need to partner with the developer in order to write a suitable build system. Make sure that you start by evaluating the existing build tools and processes before you attempt to improve them, implementing build-engineering best practices at every step.

Dangers of IDEs

Most developers rely heavily upon their integrated development environment (IDE), which is a tool that usually includes an “intelligent” code editor, a compiler, a debugger, and a graphical user interface (GUI) builder. IDEs help developers work more productively, but they also can pose a challenge for build engineering because developers typically forget the settings that they configured when they first got started with the development effort. In order to create a repeatable build, we usually have to help the developers figure out where they configured their compiler switches, classpath, and other environment settings. It is common to learn that many developers have different configurations, which are inherently not compatible. The build engineers’ command-line procedures end up becoming the single authoritative configuration.

Done well, build engineering allows you to tackle the complexity of understanding the overall build and create safeguards to avoid common mistakes.

6.4 Understanding the Build

Application builds can be very complicated and often involve many components, each of which has its own dependencies, sometimes on multiple platforms. We have seen builds where the dependencies were so complicated that no one person actually understood the entire build. Sometimes this is because the build spans multiple technologies and even multiple platforms. But sometimes builds are just written in a needlessly obscure and overly complex way. To really understand the build, you need to understand the components from a complete ALM perspective. For me, this often means diving into Visual Studio, C#, and .NET one day and Eclipse, Java, and J2EE the next. A few hours later, we may be focusing on Node.js and MongoDB. The next day, I may be focusing on mainframe JCL, COBOL, and CLISTs. Build engineering in the ALM can get very complicated very quickly, so the last thing that we need is a developer who makes any component of the build process more complicated than it needs to be.

Bob’s Build Nightmares

As a build engineer, Bob has seen many really bad builds. One of the worst was written by a colleague who held a PhD in a field that had little to do with computer science. The approach and logic of his build scripts were so overly complicated that no other member of the development team actually understood the build. When it broke, he was usually on vacation, and then, as Murphy would dictate, he went on to another company, leaving this unwieldy mess for the rest of the group to rewrite.

Fundamentally, the code should only be built once and then configured for each of the environments. Build automation tools such as Ant, Maven, and Make are typically used for the compilation of source code to binary executable, which should only occur once. Subsequent automated procedures should be designed to simply configure the build for each of the environments and then deploy to the target location. Understanding the entire build is essential, but the next step is automating the process so that it is repeatable, verifiable, and traceable.
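The "build once, configure per environment" idea can be illustrated with a minimal Python sketch. The environment names, host names, property keys, and artifact name are all hypothetical and chosen only for illustration; the point is that the artifact stays identical while only the configuration text varies.

```python
from string import Template

# Hypothetical per-environment settings; names and values are illustrative only.
ENVIRONMENTS = {
    "dev": {"db_host": "db.dev.example.com", "log_level": "DEBUG"},
    "qa": {"db_host": "db.qa.example.com", "log_level": "INFO"},
    "prod": {"db_host": "db.prod.example.com", "log_level": "WARNING"},
}

# Hypothetical configuration file layout.
CONFIG_TEMPLATE = Template("db.host=$db_host\nlog.level=$log_level\n")


def render_config(environment: str) -> str:
    """Return the configuration text for one environment.

    The compiled artifact is never rebuilt; only this text varies.
    """
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    return CONFIG_TEMPLATE.substitute(ENVIRONMENTS[environment])


def deployment_plan(artifact: str, environment: str) -> dict:
    """Pair the single built artifact with environment-specific configuration."""
    return {
        "artifact": artifact,
        "environment": environment,
        "config": render_config(environment),
    }
```

Calling `deployment_plan("app-1.4.2.jar", "qa")` and `deployment_plan("app-1.4.2.jar", "prod")` yields the same artifact name with different configuration text, which is the property the paragraph above asks for.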

6.5 Automating the Application Build

We always automate every single step of the application build, package, and deployment across the entire ALM. Doing things manually results in mistakes and lots of time wasted due to rework. Even a one-line command is best done in a script to avoid any possible mistakes. Most of our code tests each step of the build so that any and all problems are detected immediately. We often refer to failing fast to indicate that a build script should identify a problem and stop the process immediately once an issue is detected. In manufacturing, we talk about “stopping the line,” which is a reference to stopping the manufacturing process when a serious problem is discovered. Too often, automated scripts do not test themselves and the script goes ten lines down before the problem is discovered, and then it takes a lot more time to backtrack and figure out exactly what went wrong.
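One way to picture failing fast is a runner in which every step carries its own self-check and the whole process stops at the first check that fails. The following Python sketch is only an illustration under that assumption; the `BuildStepError` class, the `run_steps` function, and the (name, action, verify) step shape are hypothetical, and in a real build each action would wrap a compiler, packager, or deployment command.

```python
class BuildStepError(RuntimeError):
    """Raised when a step does not pass its own self-check."""


def run_steps(steps):
    """Run (name, action, verify) steps in order and stop at the first failure.

    action  -- callable that performs the step
    verify  -- callable returning True only if the step's result is sound
    Returns the names of the steps that completed.
    """
    completed = []
    for name, action, verify in steps:
        action()
        if not verify():
            # Stop the line: later steps never run on top of a bad result.
            raise BuildStepError(
                f"step '{name}' did not pass its self-check; stopping"
            )
        completed.append(name)
    return completed
```

Because the error names the failing step, there is no need to backtrack through later output to find out where the build went wrong.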

Bob on Drugs

I recall visiting my doctor when I was so sick that she considered having me admitted to a hospital. After the exam, my doctor agreed that I could just go home, provided that I went straight to bed and stayed home from work for at least a few days. So I took the medication that the doctor prescribed and went straight to sleep as promised. Not surprisingly, I woke up to my cell phone ringing, only to hear my boss insist that it was urgent that I do a build, package, and deployment. Fortunately, I had automated every step of the build and each step tested itself. The scripts were numbered sequentially from 1 through 10, which was just about the limit of my cognitive ability that day. I did the build successfully and then went back to sleep. I never thought it would happen, but I had built and deployed a pretty complicated financial system while I was effectively under the influence of drugs.

Automating the build is essential, especially if you want to have a secure and verifiable build, which we refer to as the secure trusted application base.

6.6 Creating the Secure Trusted Base

There have been many high-profile cybersecurity breaches in recent memory, each of which landed on the front page of many newspapers. Organizations foolishly rely upon virus scans to identify malware that has been left on their servers by malicious hackers. This approach has very limited success because it relies upon knowing the signature of the malware in advance, which is obviously unrealistic and woefully insufficient. We usually only find out about an attack after a machine has been compromised and then forensically analyzed.

What works much better is to know exactly what was built and have an automated procedure to verify that the code was deployed correctly. In the next section, we will discuss how to baseline your code once it is built, packaged, and deployed. But first we have to ensure that we have a reliable way to verify that every single file has been successfully deployed. During the code build, we can embed immutable version IDs into each component and create cryptographic hashes (e.g., SHA-1 or MD5, or preferably SHA-256, because SHA-1 and MD5 are no longer considered collision-resistant) to use in verifying that the code has been successfully deployed. With every build of a package, you should also create a manifest that contains the complete list of configuration items included, the embedded version ID, and the cryptographic hash to be used for verification. This approach enables you to confirm that the correct configuration items (CIs) were successfully deployed. The next step is to automate the detection of any unauthorized changes, whether through malicious intent or human error. This ability requires that you establish baselines of the runtime environment itself.
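A manifest of this kind can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed format: the function names, the manifest layout, and the example version ID are assumptions, and SHA-256 is used here in place of the older algorithms named in the text.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(package_dir: Path, version_id: str) -> dict:
    """List every file in the package with its hash, plus the build's version ID."""
    files = {}
    for path in sorted(package_dir.rglob("*")):
        if path.is_file():
            files[path.relative_to(package_dir).as_posix()] = sha256_of(path)
    return {"version_id": version_id, "files": files}


def verify_deployment(deployed_dir: Path, manifest: dict) -> list:
    """Return a list of problems; an empty list means every listed file matches."""
    problems = []
    for relative_name, expected in manifest["files"].items():
        target = deployed_dir / relative_name
        if not target.is_file():
            problems.append(f"missing: {relative_name}")
        elif sha256_of(target) != expected:
            problems.append(f"hash mismatch: {relative_name}")
    return problems
```

The manifest would be created once at build time (for example, `build_manifest(Path("dist"), "1.4.2-build.57")`, with both values hypothetical) and then checked against each target environment after deployment with `verify_deployment`.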

6.7 Baselining

Once code has been successfully deployed, you need to be able to detect unauthorized changes, whether they be through malicious intent or simply human error. My approach is to create cryptographic hashes on the essential configuration items; these should be monitored and verified regularly, usually on a nightly basis.

Surprises in Production

Bob worked at a large bank that had repeated outages due to unauthorized changes, which were only detected after an outage occurred. Once the runtime environment was baselined and monitored, we began to learn that the middleware administrators were making changes that they thought would help the applications run more efficiently. Unfortunately, these admins were not communicating with other members of the teams and the changes that they were making not only violated change control policy, but also were causing production outages. Baselining provided us with an early detection system that allowed us to proactively identify the changes and work with the other members of the team to revert the changes before there was any adverse impact.

The challenge is that there are often too many files changing for legitimate reasons, resulting in what we will call false positives, so we need to ascertain which CIs should be monitored and which ones can be safely ignored. In the real world, this is not easy to tell, and baselining often fails for exactly this reason. The best approach is to start monitoring from the very beginning of the software or systems lifecycle by establishing baseline monitoring in the development test environment. Once the project is completed, many developers move on to new projects and may not be available for consultation and troubleshooting. We need to engage with the developers early in the lifecycle while the code is being written and the deep technical knowledge is available to assist with implementing these procedures. DevOps provides us with the principles and practices to ensure that developers and operations folks can collaborate and communicate effectively, so that we know exactly which configuration items should be monitored and the best approach for doing so. Another important capability is to embed version IDs into these key configuration items, in a process that is known as version identification.
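Baseline monitoring with an explicit ignore list can be sketched as follows. This Python example is a hedged illustration only: the ignore patterns, function names, and report layout are hypothetical, SHA-256 is an assumed choice of hash, and a real deployment would agree on the list of monitored CIs with the developers, as described above.

```python
import fnmatch
import hashlib
from pathlib import Path

# Hypothetical patterns for files that change for legitimate reasons.
IGNORE_PATTERNS = ["*.log", "tmp/*", "*.pid"]


def is_monitored(relative_name: str, ignore_patterns=IGNORE_PATTERNS) -> bool:
    """A file is monitored unless it matches one of the ignore patterns."""
    return not any(fnmatch.fnmatch(relative_name, p) for p in ignore_patterns)


def snapshot(root: Path, ignore_patterns=IGNORE_PATTERNS) -> dict:
    """Map each monitored file (relative path) to the SHA-256 of its contents."""
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            if is_monitored(name, ignore_patterns):
                result[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return result


def compare_to_baseline(baseline: dict, current: dict) -> dict:
    """Report files added, removed, or changed since the baseline was taken."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(
            name
            for name in baseline.keys() & current.keys()
            if baseline[name] != current[name]
        ),
    }
```

A nightly job could store the result of `snapshot` after each authorized deployment and raise an alert whenever `compare_to_baseline` returns a non-empty list. Files matching the ignore patterns never appear in the report, which is how the sketch keeps false positives down.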

6.8 Version Identification

During the build, unique immutable version IDs must be embedded into each significant configuration item. This crucial step is often overlooked, but can be accomplished in several different ways. First, you can use build tools such as Ant, Maven, or Make to embed a unique version ID into the manifest of the build container and the configuration items themselves. You can also include some code that is designed to ascertain and display the version ID in your configuration items. Version identification allows you to verify that the essential components have been successfully deployed and, together with efficient use of cryptographic hashes such as SHA-1 or MD5 (or a stronger algorithm such as SHA-256), allows you to verify that all of your CIs have been successfully deployed and to detect unauthorized changes. Although many groups manage to create these automated procedures for individual code components, we really need to take a full ALM approach to ensuring that the complete codebase is secure across the entire system.
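For script-like text files, one simple way to embed a version ID is a marker comment that the build writes and an audit can later read back. The Python sketch below assumes a hypothetical marker format (`# BUILD_VERSION_ID:`) and hypothetical function names; it is one possible convention, not the mechanism that Ant, Maven, or Make provide.

```python
import re
from pathlib import Path

# Hypothetical marker format; real projects pick their own convention.
MARKER = "# BUILD_VERSION_ID:"
_MARKER_RE = re.compile(r"^# BUILD_VERSION_ID:\s*(\S+)\s*$", re.MULTILINE)


def stamp_version(path: Path, version_id: str) -> None:
    """Append a version marker line to a text file during the build.

    Refuses to stamp a file that already carries an ID, so that an
    embedded version ID is treated as immutable.
    """
    text = path.read_text()
    if _MARKER_RE.search(text):
        raise ValueError(f"{path} already carries a version ID")
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + f"{MARKER} {version_id}\n")


def read_version(path: Path):
    """Return the embedded version ID, or None if the file was never stamped."""
    match = _MARKER_RE.search(path.read_text())
    return match.group(1) if match else None
```

After deployment, running `read_version` against each installed script and comparing the result with the release record shows in seconds whether the expected version is actually in place.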

Stopping the World Economy

Bob was once told that he had made a mistake that led to the crash of a critical trading system on the floor of the New York Stock Exchange. The mistake involved deploying the wrong version of two shell scripts that were essential to starting up this critical system. Fortunately, the two scripts had embedded version IDs, and it took less than five minutes to investigate. Based upon the forensics, Bob and the other systems administrators were able to determine the actual cause of the problem, which was still online and risked causing another outage. Bob had in fact deployed the correct scripts and there was another unrelated miscommunication that led to the scripts being overwritten by one of the systems administrators. Embedding version IDs into the configuration items made it easy to determine the root cause of the problem and prevent another outage from occurring. This technique is known as a physical configuration audit.

Another common source of errors in the ALM is a lack of understanding of compile dependencies.

6.9 Compile Dependencies

Compile dependencies can be very complicated to ascertain and fully comprehend. While observing the build, reviewing build scripts, and tracking down the settings in the developers’ IDE are essential, there are also some considerations unique to the agile ALM. Early in the lifecycle, developers are often learning new technologies and determining the best approach to building application components. Unfortunately, this information often becomes a distant memory and may even be lost when developers, perhaps even consultants, move on to their next project. You need to capture this information from the very beginning of the software and systems lifecycle. The agile ALM requires that we start thinking about production deployments from the very beginning of the lifecycle because this essential expertise simply may not be available later on. Creating an effective build is an ALM consideration.

6.10 Build in the ALM

Application lifecycle management takes a broad view, from collecting requirements to design, development, and even ongoing support after deployment. The build is an integral part of this process and needs to be considered throughout the entire ALM. What we often see is that developers have very elaborate build, continuous integration, and deployment procedures to test environments that do not match the approach that will be used to build, package, and deploy the code to production. Developers rarely think about verifying that the code has been successfully deployed using production-ready procedures. This is problematic for two important reasons. First, developers often get delayed by unexpected changes that were made due to human error or poor communication between the team members. Having sufficient IT controls in place from the very beginning of the process can avoid these types of mistakes.

Second, and even worse, changing the build, package, and deployment procedures in the middle of the software or systems development lifecycle is a common source of mistakes. DevOps best practices focus on using the same procedures and automation for deployments to every environment from development test to user acceptance testing (UAT) and production. Consistent approaches to builds are essential. Another best practice is the independent build.

6.11 The Independent Build

One of the most important best practices is the independent build. There is something about the process of trying to clearly communicate your build procedures to another person that almost always results in identifying the code that developers forgot to check into version control as well as build dependencies that were not well understood and documented. The ALM is the structure that can help identify and avoid these potential pitfalls. Organizationally, the build engineer should not report to the development manager, so that there is segregation of duties. As a build engineer, I still like to sit next to the developers so that we learn how the system works and are plugged into the flow of the development effort. However, having the build engineer report to development frequently results in undue pressure to bypass established controls, a situation that can create a great deal of risk in the ALM. When regulatory compliance is not legally required, another functional approach can be to use automation to create a “virtual” segregation of duties.

6.12 Creating a Build Robot

We see many teams creating fully automated application build, package, and deployment procedures for test environments. When these run under a service account, this essentially comes down to creating an independent build-engineering robot. Although this is a best practice, it may not meet your organization’s regulatory requirements for segregation of duties. More importantly, you need to ensure that the build procedures are fully traceable and can also be used to deploy code to all of the environments, including production. Another consideration is building quality into the process from the very beginning.

6.13 Building Quality In

Building quality into the process requires that you do the right thing from the very beginning of the software and systems lifecycle. The first reason for this is that these procedures can be very complicated, and starting from the beginning gives you a chance to learn and understand all of the essential technical details. The DevOps approach is to involve operations from the very beginning of the lifecycle. Getting DevOps involved in development has become known as “left-shift” (more commonly written “shift left”) and significantly improves the quality of the overall application build, package, and deployment effort. One aspect of this process to consider is implementing effective unit testing.

6.14 Implementing Unit Tests

Creating effective automated unit tests is a first step that is often overlooked in the software development process. It is very difficult, and often impossible, to create unit tests at the end of the lifecycle. The only effective approach is to begin creating unit tests from the very beginning of the effort. Many developers prefer to use test-driven development (TDD) so that the unit tests are actually created before the code itself. We consider test automation a must-have, along with automated code analysis.
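As a small illustration of unit tests acting as a gate in the build, the following Python sketch uses the standard `unittest` module. The function under test (`next_build_number`) and the `tests_pass` helper are hypothetical and exist only to show the shape of such a gate.

```python
import unittest


def next_build_number(previous: int) -> int:
    """Hypothetical function under test: build numbers increase by one."""
    if previous < 0:
        raise ValueError("build number cannot be negative")
    return previous + 1


class NextBuildNumberTest(unittest.TestCase):
    def test_increments(self):
        self.assertEqual(next_build_number(41), 42)

    def test_rejects_negative(self):
        with self.assertRaises(ValueError):
            next_build_number(-1)


def tests_pass() -> bool:
    """Run the suite and report success so a build script can gate on it."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NextBuildNumberTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

In a TDD workflow, the two test methods would be written first and would fail until `next_build_number` was implemented. A build script can then refuse to package anything unless `tests_pass()` returns `True`.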

6.15 Code Scans

The build process is the ideal time to automate code scans for a variety of purposes. We have worked with teams to automate code scanning to identify potential security and quality issues in the code. It has become common to also use code scanning to programmatically identify open-source components that need to be tracked for licensing compliance and potential downstream security vulnerabilities. Some of these code scans can be performed through static code analysis. However, sometimes it can be very useful to build a variant of the code that is suitable for instrumentation.

6.16 Instrumenting the Code

Building the code for a specific purpose often involves instrumenting the code to include libraries that enable dynamic analysis of the code during actual execution. This is used most often for performance analysis, but also has applications in security and even quality. Excellent build engineering makes this approach viable and depends largely upon the quality of the build tools themselves.

6.17 Build Tools

The selection of build tools can be challenging. We are often deeply involved with evaluating and selecting build tools to automate the application build, package, and deployment. It is essential to always evaluate two or three leading tools based upon well-defined evaluation criteria. We are often involved in the proof-of-concept (POC) and final bake-off between the best two or three tools evaluated. It is wise to spend a fair amount of time on properly selecting build tools, as they will have a huge impact on your team’s productivity and the quality of the system that you create.

6.18 Conclusion

Build engineering in the ALM depends upon both good processes and good tools. DevOps has driven many recent changes in the space, including the acknowledgment that DevOps should be involved from the very beginning of the software and systems lifecycle and also that having consistent procedures across the entire ALM is an absolute necessity.
