Preface

Modern organizations are increasingly using dynamic cloud platforms, whether public or private, to exploit digital technology to deliver services and products to their users. Automation is essential to managing continuously changing and evolving systems, and tools that define systems “as code” have become dominant for this.

Although cloud technology and infrastructure coding tools are becoming pervasive, most teams are still learning the best ways to put them to use.

This book is my attempt to share what I’ve learned from various people, teams, and organizations. I’m not giving you specific instructions on how to implement a specific tool or language. Instead, I’ve assembled principles, practices, and patterns that you can use to shape how you approach the design, implementation, and evolution of your infrastructure.

These draw heavily on agile engineering principles and practices. I believe that, because cloud-based infrastructure is essentially just another software system, we can benefit from lessons learned in other software domains.

Infrastructure as code has grown along with the DevOps movement. Andrew Clay-Shafer and Patrick Debois triggered the DevOps movement with a talk at the Agile 2008 conference. The first uses I’ve found for the term “Infrastructure as Code” are from a talk on Agile Infrastructure that Clay-Shafer gave at the Velocity conference in 2009, and an article John Willis wrote summarizing the talk1. My own Infrastructure as Code journey began a decade or so before this.

How I Learned to Stop Worrying and to Love the Cloud

I set up my first server, a dialup BBS,2 in 1992. This led to Unix system administration and then to building and running hosted applications for various companies, from startups to enterprises. In 2001 I discovered the Infrastructures.org website, which taught me how to build servers in a highly consistent way using CFEngine.3

By 2007, my team had accumulated around 20 1U and 2U physical servers in our office’s server racks. These overflowed with test instances of our company’s software, along with a variety of miscellaneous applications and services: a couple of wikis, bug trackers, DNS servers, mail servers, databases, and so on. Whenever someone found something else to run, we crammed it onto an existing server. When the product folks wanted a new environment, it took several weeks to order and assemble the hardware.

Then we learned that virtualization was a thing. We started with a pair of beefy HP rack servers and VMware ESX Server licenses, and before long, everything was a VM. Every application could run on a dedicated VM, and it took minutes to create a new environment.

Virtualization made it easy to create new servers, which solved one set of problems. But it led to a whole new set of problems.

The Sorcerer’s Apprentice

A year later, we were running well over 100 VMs and counting. We were well underway with virtualizing our production servers and experimenting with Amazon’s new cloud hosting service. The benefits virtualization had brought to the business people meant we had money for more ESX servers and for shiny SAN devices to feed our infrastructure’s shocking appetite for storage.

We created virtual servers, then more, then even more. They overwhelmed us. When something broke, we tracked down the VM and fixed whatever was wrong with it, but we couldn’t keep track of what changes we’d made where. We felt like Mickey Mouse in “The Sorcerer’s Apprentice” from Fantasia (an adaptation of the Goethe poem).

Well, a perfect hit!
See how he is split!
Now there’s hope for me,
and I can breathe free!

Woe is me! Both pieces
come to life anew,
now, to do my bidding
I have servants two!
Help me, O great powers!
Please, I’m begging you!

Excerpted from Brigitte Dubiel’s translation of “Der Zauberlehrling” (“The Sorcerer’s Apprentice”) by Johann Wolfgang von Goethe

We faced a never-ending stream of updates to the operating systems, web servers, application servers, database servers, JVMs, and other software packages we used. We struggled to keep up with them. We might apply them successfully to some VMs, but on others, the upgrades broke things. We didn’t have time to stomp out every incompatibility, and over time we ended up with many combinations of versions strewn across hundreds of VMs.

We were using configuration automation software even before we virtualized, which should have helped with these issues. I had used CFEngine in previous companies, and when I started this team, we tried a new tool called Puppet. Later, my colleague introduced us to Chef when he spiked out ideas for an AWS infrastructure. All of these tools were useful, but particularly in the early days, they didn’t get us out of the quagmire of wildly different servers.

In theory, we should have configured our automation tools to run continuously, applying and reapplying the same configuration to all of our servers every hour or so. But we didn’t trust them enough to let them run unattended. We had too much variation across our servers, and it was too easy for something to break without us noticing. We would write a Puppet manifest to configure and manage a particular application server. But when we ran it against a different server, we found that it had a different version of Java, application server software, or OS packages. The Puppet run would fail, or worse, corrupt the application server.
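The idea of applying and reapplying the same configuration can be sketched as a small convergence loop. This is only an illustrative sketch in Python, not how Puppet or any other tool works internally: the `Server` class, the `converge` function, the server names, and the version numbers are all hypothetical. It assumes the desired state is a simple mapping of package names to versions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Server:
    """Hypothetical in-memory stand-in for a real server's package state."""
    name: str
    packages: Dict[str, str] = field(default_factory=dict)


def converge(server: Server, desired: Dict[str, str]) -> List[str]:
    """Bring the server's packages in line with the desired versions.

    Returns a description of each change made. The function is idempotent:
    once a server matches the desired state, running it again changes nothing.
    """
    changes = []
    for package, version in desired.items():
        current = server.packages.get(package)
        if current != version:
            server.packages[package] = version
            changes.append(f"{server.name}: {package} {current} -> {version}")
    return changes


# Hypothetical desired state shared by every application server.
DESIRED = {"java": "8", "appserver": "4.2"}

servers = [
    Server("app-01", {"java": "8", "appserver": "4.2"}),  # already consistent
    Server("app-02", {"java": "7", "appserver": "4.0"}),  # drifted
]

# The first run corrects the drifted server; the second run finds nothing to do.
first_run = [change for s in servers for change in converge(s, DESIRED)]
second_run = [change for s in servers for change in converge(s, DESIRED)]
```

In this toy model every change succeeds. On real servers, a step like the Java upgrade could fail or break the application, which is why unattended runs felt risky while our servers varied so much.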

So we ended up using Puppet ad hoc. We could safely run it against new VMs, although we might need to make some tweaks to make it work. We would write a new manifest to carry out a specific upgrade, and then run it against our servers one at a time, carefully checking the result and making fixes as needed.

Configuration automation was a useful aid, better than shell scripts, but the way we used it didn’t save us from our sprawl of inconsistent servers.

Cloud from Scratch

Things changed when we began moving things onto the cloud. The technology itself wasn’t what improved things; we could have done the same thing with our own VMware servers. But because we were starting fresh, we adopted new ways of managing servers based on what we had learned with our virtualized farm. We also followed what people were doing at companies like Flickr, Etsy, and Netflix. We baked these new ideas into the way we managed services as we migrated them onto the cloud.

The key idea of our new approach was that we could rebuild every server automatically from scratch. Our configuration tooling would run continuously, not ad hoc. Every server added into our new infrastructure would fall under this approach. If automation broke on some edge case, we would either change the automation to include it or else fix the design of the service so that it was no longer an edge case.

The new regime wasn’t painless. We had to learn new habits, and we had to find ways of coping with the challenges of a highly automated infrastructure. As the members of the team moved on to other organizations and got involved with communities such as DevOpsDays, we learned and grew. Over time, we reached the point where we were habitually working with automated infrastructures with hundreds of servers, with much less effort and headache than in our “Sorcerer’s Apprentice” days.

Joining ThoughtWorks was an eye-opener for me. The development teams I worked with were passionate about using XP engineering practices like test-driven development (TDD), continuous integration (CI) and continuous delivery (CD). Because I had already learned to manage infrastructure scripts and configuration files in source control systems, it was natural to apply these rigorous engineering practices to them.

Working with ThoughtWorks has also brought me into contact with many IT operations teams. Over the years, I’ve seen organizations experimenting with and then adopting virtualization, cloud, containers, and automation tooling. Working with them to share and learn new ideas and techniques has been a fantastic experience.

Legacy Cloud Infrastructure

In the past few years, I’ve seen more teams who have moved beyond early adoption of cloud-based infrastructures using “as-code” tools. My ThoughtWorks colleagues and I have noticed some common issues with more mature infrastructure.

Many teams find that their infrastructure codebases have grown difficult to manage. Setting up a new environment for a new customer may take a month or more. Rolling out patches to a container orchestration system is a painful, disruptive process. New members of the team take too long to learn the complicated and messy set of scripts they use to run their infrastructure tools.

The interesting thing about these situations is that they look oddly familiar. For years, clients have asked us to help them with messy, monolithic software architectures, and error-prone application deployment processes. And now we’re seeing the same issues appearing with infrastructure.

Turns out, infrastructure as code isn’t just a metaphor. Code really is code!

So we’ve doubled down on the idea that software engineering practices can be useful with infrastructure codebases. TDD, CI, CD, microservices4, and Evolutionary Architectures (http://shop.oreilly.com/product/0636920080237.do) are coming into their own for cloud infrastructure.

Why I Wrote This Book

I’ve met and worked with many teams who are in the same place I was a few years ago: people who are using cloud, virtualization, and automation tools but haven’t got it all running as smoothly as they know they could. I hope that this book provides a practical vision and specific techniques for managing complex IT infrastructure in the cloud.

Why a Second Edition

I started working on the first edition of this book in 2014. The landscape of cloud and infrastructure was shifting as I worked, with innovations like Docker popping up, forcing me to add and revise what I was writing. I felt like I was in a race, with people in the industry creating things for me to write about faster than I could write about them.

Things haven’t slowed down since we published the book in June 2016. Docker went from a curiosity to the mainstream, and Kubernetes emerged as the dominant way to orchestrate Docker containers. Existing distributed orchestration and PaaS (Platform as a Service) products either fell by the wayside or rebuilt themselves around Docker and Kubernetes. Serverless, service meshes, and observability are all emerging as core technologies for modern cloud-based systems.

Even as I write the second edition, a new generation of infrastructure tools is emerging, led (so far) by Pulumi and the AWS CDK (Cloud Development Kit). These are challenging the incumbent tools, using general-purpose procedural languages rather than dedicated declarative languages.

Nevertheless, it feels useful to update the book, bringing it up to the current state of the industry. Some books explain how to use a specific tool or language. The concepts in this book are relevant across tools, and even across new versions and types of tools. The core practices, principles, and most of the implementation patterns I describe are valid even if specific technologies and tools become obsolete.

I have several goals with this new edition:

Include newer technologies

Although the concepts apply regardless of the specific tools you use, it’s useful to describe them in current contexts. People often ask me how infrastructure as code applies with serverless, containers, and service meshes. This second edition should make this clearer.

Improve relevance to mature cloud systems

When I wrote the first edition, very few organizations were making full use of cloud and infrastructure as code. Since then, even conservative organizations like banks and governments have begun adopting public cloud. And many other teams have built up large infrastructure codebases, often accumulating technical debt and even legacy systems and code. I’ve updated the content in this book based on lessons I’ve learned working with these teams.

Evolve and expand

One of the benefits that I hadn’t anticipated from writing a book was the amount of exposure I gained to different people building systems on the cloud. I’ve learned so much from giving talks, running workshops, visiting organizations at various phases of adoption, and working with clients. I’ve learned about pitfalls, good practices, and challenges from many people and teams. I’ve also learned how different ways of talking about this stuff resonate with people, so I’ve been able to hone my messaging. The result is that this edition of the book has more content and is presented in what I believe is a stronger structure.

Who This Book Is For

This book is for people who work with dynamic IT infrastructure. You may be a system administrator, infrastructure engineer, team lead, architect, or a manager with technical interest. You might also be a software developer who wants to build and use infrastructure.

I’m assuming you have some exposure to cloud infrastructure, so you know how to provision and configure systems. You’ve probably at least played with tools like Ansible, Chef, CloudFormation, Puppet, or Terraform.

While this book may introduce some readers to infrastructure as code, I hope people who work this way already will also find it interesting. I see it as a way to share ideas and start conversations about how to do it even better.

What Tools Are Covered

This book doesn’t offer instructions in using specific scripting languages or tools. I generally use pseudo-code for examples and try to avoid making examples specific to a particular cloud platform. This book should be helpful to you regardless of whether you use CloudFormation and Ansible on AWS, Terraform and Puppet on Azure, Pulumi and Chef on Google Cloud, or a completely different stack.

The concepts I explain are relevant across different platforms and toolchains. When I introduce a concept, I often give examples from specific tools to illustrate what I mean. The tools I do mention are ones that I am most familiar with. But there are many other tools out there. Just because I don’t mention your favorite one doesn’t mean I’m dismissing it; it only means I don’t know enough about it.

Principles, Practices, and Patterns

I use the terms principles, practices, and patterns (and antipatterns) to describe essential concepts. Here are the ways I use each of these terms:

Principle

A principle is a rule that helps you to choose between potential solutions.

Practice

A practice is a way of implementing something. A given practice is not always the only way to do something, and may not even be the best way for a particular situation. You should use principles to guide you in choosing the most appropriate practice for a given situation.

Pattern

A pattern is a potential solution to a problem. It’s very similar to a practice, in that different patterns may be more effective in different contexts. Each pattern is described in a format that should help you to evaluate how relevant it is for your problem.

Antipattern

An antipattern is a potential solution that I am recommending you avoid in most situations. Usually, it’s either something that seems like a good idea or else it’s something that you fall into doing without realizing it.

Here is how I’m using these terms to drive the structure of this book:

Principles of Cloud Age Infrastructure

These are rules driven by the dynamic nature of cloud systems5 that help you decide how to approach building and running your stuff. They are a contrast to legacy approaches from the Iron Age of infrastructure, which assume resources are static and expensive to change. These principles lead us to Infrastructure as Code as an approach.

Core Practices for Infrastructure as Code

There are many practices for implementing infrastructure as code. The three I’ve highlighted as “core” are the ones that I believe are fundamental for successfully building infrastructure in the Cloud Age. In other words, the Principles of Cloud Age Infrastructure drive the use of these Core Practices.

Implementation Principles

Given each of the core practices for infrastructure as code, these principles are rules for implementing that practice. They help you to decide which patterns and approaches are most useful. I tend to list these in a chapter for each core practice.

Patterns (and antipatterns)

These are potential solutions for designing, building, and running your systems. The implementation principles drive the decision of which ones are appropriate.

The Foodspin Examples

I use the fictional company Foodspin to illustrate concepts throughout this book. Foodspin provides an online menu service for fast food restaurants. I doubt this would be a viable business, but it works as an example. The company has clients in different countries, with different cuisines, including Curry Hut in the UK, The Fish King in Australia, and Bomber Burrito and Burger Barn in North America.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold

Shows commands or other text that should be typed literally by the user.

Constant width italic

Shows text that should be replaced with user-supplied values or by values determined by context.

Tip

This element signifies a tip or suggestion.

Note

This element signifies a general note.

Warning

This element indicates a warning or caution.

O’Reilly Online Learning

Note

For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, conferences, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, please visit http://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

  • O’Reilly Media, Inc.
  • 1005 Gravenstein Highway North
  • Sebastopol, CA 95472
  • 800-998-9938 (in the United States or Canada)
  • 707-829-0515 (international or local)
  • 707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://bit.ly/infrastructureAsCode_1e.

To comment or ask technical questions about this book, send email to .

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

Find us on Facebook: http://facebook.com/oreilly

Follow us on Twitter: http://twitter.com/oreillymedia

Watch us on YouTube: http://www.youtube.com/oreillymedia

1 Adam Jacob and Luke Kanies were also using the phrase around this time. Mark Burgess pioneered the practice of Infrastructure as Code with CFEngine before anyone even used the phrase.

2 Bulletin Board System.

3 Sadly, Infrastructures.org hasn’t been updated since 2007, but the content was still there the last time I looked.

4 http://shop.oreilly.com/product/0636920033158.do

5 As I’ll explain in Chapter 3, “cloud” isn’t exclusive to public, shared cloud platforms like AWS, Azure, and GCP. The “Cloud Age” reaches into on-premise systems and private data centers as well.
