Preface

In this book, we hope to show you a framework for the next iteration of the database professional: the database reliability engineer, or DBRE. Consider any preconceived notions of what the profession of database administration looks like. Any software or systems engineer who has interacted with these mysterious creatures probably has a lot of these preconceived notions.

Traditionally, database administrators (DBAs) understood database (DB) internals thoroughly; they ate, lived, and breathed the optimizer, the query engine, and the tuning and crafting of highly performant, specialized systems. When they needed to pick up other skill sets to make their databases run better, they did. They learned how to distribute load across central processing units (CPUs) or disk spindles, how to configure their DB to use CPU affinity, and how to evaluate storage subsystems.

When the DBA ran into visibility problems, they learned how to build graphs for the things they identified as key metrics. When they ran into architectural limitations, they learned about caching tiers. When they ran into the limits of individual nodes, they learned (and helped drive the development of) new design patterns like sharding. Throughout this, they were mastering new operational techniques, such as cache invalidation, data rebalancing, and rolling DB changes.

But for a long, long time, DBAs were in the business of crafting silos and snowflakes. Their tools were different, their hardware was different, and their languages were different. DBAs were writing SQL, systems engineers were writing Perl, software engineers were writing C++, web developers were writing PHP, and network engineers were crafting their own perfect appliances. Only half of the teams were using version control in any kind of way, and they certainly didn’t talk or step on each other’s turf. How could they? It was like entering a foreign land.

The days in which this model can prove itself to be effective and sustainable are numbered. This book is a view of reliability engineering as seen through a pair of database engineering glasses. We do not plan on covering everything possible in this book. Instead, we are describing what we do see as important, through the lens of your experience. This framework can then be applied to multiple datastores, architectures, and organizations.

Why We Wrote This Book

This book has been an evolving dream of ours for about five years. Laine came to the DBA role without any formal technical training. She was neither a software engineer nor a sysadmin; rather, she chose to develop a technical career after leaving music and theater. With this kind of background, the ideas of structure, harmony, counterpoint, and orchestration found in databases called to her.

Since that time, she’s hired, taught, mentored, and worked with probably a hundred different DBAs. We database folks are a varied bunch. Some came from software backgrounds, others from systems. Some even came from data analyst and business backgrounds. The thing that consistently shone through from the best, however, was a passion and a sense of ownership for the safety and availability of the company’s data. We fulfilled our roles as stewards of data with a fierceness that bordered on unhealthy. But we also functioned as a linchpin between the software engineers and the systems engineers. Some might say we were the original DevOps, with a foot in each world.

Charity’s background is firmly in operations and startup culture. She has a gloriously sloppy history of bootstrapping infrastructures fast, making quick decisions that can make or break a startup, taking risks, and making difficult choices based on severely limited resources. Mostly successfully, give or take. She is an accidental DBA who loves data. She has always worked on ops teams for which there were no specialized DBAs, so the software engineering and operations engineering teams ended up sharing that work.

Doing this for so long and with varied pasts, we’ve recognized and embraced the trends of the past decade. The life of the DBA has often been one of toil and invisibility. Now we have the tools and the collective buy-in to transform the role to that of first-class citizen and to focus on the highest areas of value that the DBA can bring.

With this book, we wanted to help the next generation of engineers have truly happy, productive careers and to continue the impact previous generations had.

Who This Book Is For

This book is written for anyone with an interest in the design, building, and operations of reliable data stores. Perhaps you are a software engineer, looking to broaden your knowledge of databases. You might also be a systems engineer looking to do the same. If you’re a database professional looking to develop your skill set, you will also find value here. If you are newer to the industry, this should also be able to give you a solid understanding. This book, after all, is a framework.

We assume that you already have a baseline of technical proficiency in Linux/Unix systems administration as well as web and/or cloud architectures. We also assume that you are an engineer on one of two paths. On one path, you have existing depth in another discipline, such as systems administration or software engineering, and are interested in developing your technical breadth to include the database engineering discipline. On the other path, you are early- to mid-career and looking to build your technical depth as a specialist in database engineering.

If you are management, or even project management, you can use this book to understand the needs of the datastores that will be underpinning your services. We believe firmly that management needs to understand operational and database principles to increase the likelihood of success of their teams and their projects.

You might also be someone without a traditional technical background. Perhaps you are an “accidental DBA” who was a business analyst and learned to run databases by jumping into the deep end of the pool. There are many database professionals who have come to the database world via Excel rather than a development or systems job.

How This Book Is Organized

As we go into this book, we present the information in two sections. The first section is operations core curriculum. This is a foundation of operations that anyone—database engineer, software engineer, even product owner—should know. After this, we dig into data: modeling, storing, replicating, accessing, and much more. This is also where we discuss architectural choices and data pipelines. It should be thrilling!

There is a reason there is an ops-heavy approach to this narrative: you can’t be a good “DBRE” without being a good “RE.” Which you can’t be without being a plain old good “E.” The modern DBRE specializes in understanding data-specific domain problems on top of the fundamentals of systems engineering.

But the point of this is that any engineer can run data services. We now speak the same languages. We use the same repos and the same code review processes. Caring for databases is an extension of operations engineering—a creamy frosting of special knowledge and awareness atop the cupcake of running systems at scale—just as being an exceptional network engineer also means knowing how to be an engineer first, and then knowing extra things about how to handle traffic, what to be scared of, what the current best practices are, how to evaluate network topology, and so on.

Here is a breakdown of what you can expect in each chapter:

Chapter 1 is an introduction to the concept of database reliability engineering. We start with guiding principles, move on to the operations-centric core, and finally give a framework for building a vision for DBRE based on Maslow’s hierarchy of needs.

In Chapter 2, we start with service level requirements. This is as important as feature requirements for a product. In this chapter we discuss what service level requirements are and how to define them, which is not as easy as it sounds. We then discuss how to measure and work with these requirements over time.

In Chapter 3, we discuss risk assessment and management. After a foundational discussion on risk, we discuss a practical process for incorporating risk assessment into systems and database engineering. Pitfalls and complexities are also brought to attention.

In Chapter 4, we discuss operational visibility. This is where we discuss metrics and events, and how to build a plan for what to start measuring and what to iterate on over time. We dig into the components of monitoring systems and the clients that use them.

We then dive into infrastructure engineering and infrastructure management in Chapters 5 and 6. This is the section where we discuss the principles of building hosts for datastores. We will dive into virtualization and containerization, configuration management, automation and orchestration in an attempt to help you understand all the moving parts required to build these systems that store and access data.

Chapter 7 is backup and recovery. This is, perhaps, the most critical thing for the DBRE to master. Losing data is simply game over. Starting from service level requirements, we evaluate appropriate backup and restore methods, how to scale them, and how to test this critical and oft-overlooked aspect of operations.

Chapter 8 is a discussion on release management. How do we test, build and deploy changes to data stores? What about changes to data access code and SQL? Deployment, integration and delivery are the meat of this section.

Chapter 9 is on security. Data security is critical to a company’s survival. This chapter covers strategies for planning and managing security in ever-evolving data infrastructures.

Chapter 10 is on data storage, indexing, and replication. We will discuss how relational data is stored, and then compare this to sorted strings and log-structured merge trees. After reviewing indexing variants, we will explore data replication topologies.

Chapter 11 is our datastore field guide. Here we will discuss a myriad of properties to look for in the datastores you will be evaluating or operating. This includes conceptual attributes of great importance to application developers and architects, as well as the internal attributes focused on the physical implementation of the datastores.

In Chapter 12, we look at some of the more common architectural patterns used for distributed databases and the pipelines they are involved with. We start with a look at the architectural components that typically reside in a database ecosystem, along with their benefits, complexities, and general usage. We then explore architectures and pipelines, or at least a few examples.

Finally, in Chapter 13 we discuss how to build a culture of database reliability engineering in your organization. We explore the different ways in which you can transform the role of DBRE from one of administrator to that of engineer in today’s organizations.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold

Shows commands or other text that should be typed literally by the user.

Constant width italic

Shows text that should be replaced with user-supplied values or by values determined by context.

O’Reilly Safari

Note

Safari (formerly Safari Books Online) is a membership-based training and reference platform for enterprise, government, educators, and individuals.

Members have access to thousands of books, training videos, Learning Paths, interactive tutorials, and curated playlists from over 250 publishers, including O’Reilly Media, Harvard Business Review, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Adobe, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, and Course Technology, among others.

For more information, please visit http://oreilly.com/safari.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

  • O’Reilly Media, Inc.
  • 1005 Gravenstein Highway North
  • Sebastopol, CA 95472
  • 800-998-9938 (in the United States or Canada)
  • 707-829-0515 (international or local)
  • 707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://bit.ly/database-reliability-engineering.

To comment or ask technical questions about this book, send email to .

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

Find us on Facebook: http://facebook.com/oreilly

Follow us on Twitter: http://twitter.com/oreillymedia

Watch us on YouTube: http://www.youtube.com/oreillymedia
