Introduction

Thanks for picking up the second edition of The Art of Scalability: Scalable Web Architecture, Processes, and Organizations for the Modern Enterprise. This book has been recognized by academics and professionals as one of the best resources available to learn the art of scaling systems and organizations. This second edition includes new content, revisions, and updates. As consultants and advisors to hundreds of hyper-growth companies, we have been fortunate enough to be at the forefront of many industry changes, including new technologies and new approaches to implementing products. While we hope our clients see value in our knowledge and experience, we recognize that a large part of the value we bring to bear on a subject comes from our interactions with so many other technology companies. In this edition, we share even more of the lessons learned from our consulting practice.

In this second edition, we have added several key topics that we believe are critical to address in a book on scalability. One of the most important new topics focuses on a new organizational structure that we refer to as the Agile Organization. Other notable topics include the changing rationale for moving from data centers to clouds (IaaS/PaaS), why NoSQL solutions aren’t in and of themselves a panacea for scaling, and the importance of business metrics to the health of the overall system.

In the first edition of The Art of Scalability, we used a fictional company called AllScale to demonstrate many of the concepts. This fictional company was an aggregation of many of our clients and the challenges they faced in the real world. While AllScale provided value in highlighting the key points in the first edition, we believe that real stories make more of an impact with readers. As such, we’ve replaced AllScale with real-world stories of successes and failures in the current edition.

The information contained in this book has been carefully designed to be appropriate for any employee, manager, or executive of an organization or company that provides technology solutions. For the nontechnical executive or product manager, this book can help you prevent scalability disasters by arming you with the tools needed to ask the right questions and focus on the right areas. For technologists and engineers, this book provides models and approaches that, once employed, will help you scale your products, processes, and organizations.

Our experience with scalability goes beyond academic study and research. Although we are both formally trained as engineers, we don’t believe academic programs teach scalability very well. Rather, we have learned about scalability by suffering through the challenges of scaling systems for a combined 30-plus years. We have been engineers, managers, executives, and advisors for startups as well as Fortune 500 companies. The list of companies that our firm or we as individuals have worked with includes such familiar names as General Electric, Motorola, Gateway, eBay, Intuit, Salesforce, Apple, Dell, Walmart, Visa, ServiceNow, DreamWorks Animation, LinkedIn, Carbonite, Shutterfly, and PayPal. The list also includes hundreds of less famous startups that need to be able to scale as they grow. Having learned the scalability lessons through thousands of hours spent diagnosing problems and thousands more hours spent designing preventions for those problems, we want to share our combined knowledge. This motivation was the driving force behind our decisions to start our consulting practice, AKF Partners, in 2007, and to write the first edition of this book, and it remains our preeminent goal in this second edition.

Scalability: So Much More Than Just Technology

Pilots are taught, and statistics show, that many aircraft incidents are the result of multiple failures that snowball into total system failure and catastrophe. In aviation, these multiple failures, which are called an error chain, often start with human rather than mechanical failure. In fact, Boeing identified that 55% of all aircraft incidents involving Boeing aircraft between 1995 and 2005 had human factors–related causes.[1]

[1] Boeing. (May 2006). “Statistical Summary of Commercial Jet Airplane Accidents Worldwide Operations.”

Our experience with scalability-related issues follows a similar trend. The chief technology officer (CTO) or executive responsible for scale of a technology platform may see scalability as purely a technical endeavor. This perception is the first, and very human, failure in the error chain. Because the CTO is overly technology focused, she fails to define the processes necessary to identify scalability bottlenecks—failure number two. Because no one is identifying bottlenecks or chokepoints in the architecture, the user count or transaction volume exceeds a certain threshold and the entire product fails—failure number three. The team assembles to solve the problem, but because it has never invested in processes to troubleshoot incidents and their related problems, the team misdiagnoses the failure as “the database needs to be tuned”—failure number four. The vicious cycle goes on for days, with people focusing on different pieces of the technology stack and blaming everything from firewalls, to applications, to the persistence tiers to which the apps speak. Team interactions devolve into shouting matches and finger-pointing sessions, while services remain slow and unresponsive. Customers walk away, team morale flat-lines, and shareholders are left holding the bag.

The key point here is that crises resulting from an inability to scale to end-user demands are almost never technology problems alone. In our experience as former executives and advisors to our clients, scalability issues start with organizations and people, and only then spread to process and technology. People, being human, make ill-informed or poor choices regarding technical implementations, which in turn sometimes manifest themselves as a failure of a technology platform to scale. People also ignore the development of processes that might help them learn from past mistakes and sometimes put overly burdensome processes in place, which in turn might force the organization to make poor decisions or make decisions too late to be effective. A lack of attention to the people and processes that create and support technical decision making can lead to a vicious cycle of bad technical decisions, as depicted in the left side of Figure I.1. This book is the first of its kind focused on creating a virtuous cycle of people and process scalability to support better, faster, and more scalable technology decisions, as depicted in the right side of Figure I.1.

Figure I.1 Vicious and Virtuous Technology Cycles

Art Versus Science

Our choice of the word art in the title of this book is a deliberate one. Art conjures up images of a fluid nature, whereas science seems much more structured and static. It is this image that we heavily rely on, as our experience has taught us that there is no single approach or way to guarantee an appropriate level of scale within a platform, organization, or process. A successful approach to scaling must be crafted around the ecosystem created by the intersection of the current technology platform, the characteristics of the organization, and the maturity and appropriateness of the existing processes. This book focuses on providing skills and teaching approaches that, if employed properly, will help solve nearly any scalability or availability problem.

This is not to say that we don’t advocate the application of the scientific method in nearly any approach, because we absolutely do. Art here is a nod to the notion that you simply cannot take a “one size fits all” approach to any potential system and expect to meet with success.

Who Needs Scalability?

Any company that continues to grow ultimately will need to figure out how to scale its systems, organizations, and processes. Although we focus on Web-centric products through much of this book, we do so only because Internet companies such as Google, Yahoo, eBay, Amazon, Facebook, LinkedIn, and the like have experienced unprecedented growth. Nevertheless, many other companies experienced problems resulting from an inability to scale to new demands (a lack of scalability) long before the Internet came of age. Scale issues have governed the growth of companies from airlines and defense contractors to banks and colocation facility (data center) providers. We guarantee that scalability was on the mind of every bank manager during the consolidation that occurred after the collapse of the banking industry.

The models and approaches that we present in our book are industry agnostic. They have been developed, tested, and proven successful in some of the fastest-growing companies of our time; they work not only in front-end customer-facing transaction-processing systems, but also in back-end business intelligence, enterprise resource planning, and customer relationship management systems. They don’t discriminate by activity, but rather help to guide the thought process on how to separate systems, organizations, and processes to meet the objective of becoming highly scalable and reaching a level of scale that allows the business to operate without concerns about its ability to meet customer or end-user demands.

Book Organization and Structure

We’ve divided the book into five parts. Part I, “Staffing a Scalable Organization,” focuses on organization, management, and leadership. Far too often, managers and leaders are promoted based on their talents within their area of expertise. Engineering leaders and managers, for example, are very often promoted based on their technical acumen and aren’t given the time or resources needed to develop their business, management, and leadership acumen. Although they might perform well in the architectural and technical aspects of scale, their expertise in organizational scale needs is often shallow or nonexistent. Our intent is to provide these managers and leaders with a foundation from which they can grow and prosper as managers and leaders.

Part II, “Building Processes for Scale,” focuses on the processes that help hyper-growth companies scale their technical platforms. We cover topics ranging from technical issue resolution to crisis management. We also discuss processes meant for governing architectural decisions and principles to help companies scale their platforms.

Part III, “Architecting Scalable Solutions,” focuses on the technical and architectural aspects of scale. We introduce proprietary models developed within AKF Partners, our consulting and advisory practice. These models are intended to help organizations think through their scalability needs and alternatives.

Part IV, “Solving Other Issues and Challenges,” discusses emerging technologies such as grid computing and cloud computing. We also address some unique problems within hyper-growth companies such as the immense growth and cost of data as well as issues to consider when planning data centers and evolving monitoring strategies to be closer to customers.

Part V, “Appendices,” explains how to calculate some of the most common scalability numbers. Its coverage includes the calculation of availability, capacity planning, and load and performance.

The lessons in this book have not been designed in the laboratory, nor are they based on unapplied theory. Rather, these lessons have been designed and implemented by engineers, technology leaders, and organizations through years of struggling to keep their dreams, businesses, and systems afloat. The authors have had the great fortune to be a small part of many of these teams in many different roles—sometimes as active participants, at other times as observers. We have seen how putting these lessons into practice has yielded success—and how the unwillingness or inability to do so has led to failure. This book aims to teach you these lessons and put you and your team on the road to success. We believe the lessons here are valuable for everyone from engineering staffs to product staffs, including every level from the individual contributor to the CEO.
