Chapter 7. Why Processes Are Critical to Scale

After that, comes tactical maneuvering, than which there is nothing more difficult. The difficulty of tactical maneuvering consists in turning the devious into the direct, and misfortune into gain.

—Sun Tzu

Perhaps no one knows the value of the right process at the right time better than NCAA football coach Nick Saban, who currently coaches the Alabama Crimson Tide. He is the first college football coach to win national championships at two different Division 1 schools (Louisiana State University and Alabama). As of the time we were writing this book, he had four national championships. To what does Saban credit his success? Listen to his interviews and you will hear him answer this question the same way over and over: “process focus.” “Good process produces good results” and “process guarantees success.”1 Saban’s process is to focus not on winning or losing, but rather on the repetitive activities and fundamentals that, if done well, will result in a win. In the weight room, process is the adherence to the strict form of a complex functional lift. On the field, it is focusing on the repetitive mechanics specific to each player’s position. Saban’s process is “the standard that we want everybody to work toward, adhere to and do it on a consistent basis.”2

1. Jason Selk. “What Nick Saban Knows About Success.” Forbes, September 12, 2012. http://www.forbes.com/sites/jasonselk/2012/09/12/what-nick-saban-knows-about-success/.

2. Greg Bishop. “Saban Is Keen to Explain Process.” New York Times, January 1, 2013. http://thequad.blogs.nytimes.com/2013/01/05/saban-is-keen-to-explain-process/?_php=true&_type=blogs&_r=0.

To scale the output of our engineering resources, we need individuals to work in teams. For teams to be effective in the delivery of products, we need processes to help them coordinate, govern, and guide their efforts. We also need processes to help teams learn from and avoid repeating past failures and to help them repeat their former successes. This section of the book characterizes various essential processes and the roles they should play in your organization. Processes that we will cover in Part II include the following:

• How to properly control and identify change in a production environment

• What to do when things go wrong or when a crisis occurs

• How to design scalability into your products from the beginning

• How to understand and manage risk

• When to build and when to buy

• How to determine the amount of scale in your systems

• When to go forward with a release and when to wait

• When to roll back and how to prepare for that eventuality

Before launching into these topics, we think it is important to discuss in general how processes affect scalability. First we look at the purpose of processes. We then discuss the importance of implementing processes appropriate to the size and maturity of the organization.

The Purpose of Process

As defined by Wikipedia, a business process is a “collection of related, structured activities or tasks that produce a specific service or product (serve a particular goal) for a particular customer or customers.” These processes can be directly related to providing a customer with a product or service, such as manufacturing, or they can be supportive processes, such as accounting. The Software Engineering Institute defines process as what holds together three critical dimensions of organizations: people, methods, and tools. In its published Capability Maturity Model for Development version 1.2, the Software Engineering Institute states that processes “allow you to address scalability and provide a way to incorporate knowledge of how to do things better.” Processes allow your teams to react quickly to crises, determine the root cause of failures, determine the capacity of systems, analyze scalability needs, implement scalability projects, and address many more fundamental needs for a scalable system and organization. These activities are vital if you want your system to scale with your organization’s growth. As an example, if you rely on an ad hoc response to restore your service when an outage occurs, you will inevitably experience much more downtime than if you have a clear set of steps that your team should follow to respond, communicate, debug, and restore services.

While part of managers’ job is to help resolve team problems, it isn’t their only responsibility; that is, managers cannot stand around all day waiting for problems to solve. As an example, suppose an engineer is checking in code to a source repository and is unsure of which branch to use. This situation will likely occur multiple times, so it doesn’t seem cost-effective to repeatedly involve a manager to resolve the dilemma. Instead, perhaps the engineering team can decide that bug fixes go into the maintenance branch and new features go into the main branch. To make sure everyone on the team knows this, someone might write up this policy and send it around to the team, post it on the team’s wiki, or tell everyone about it at the next all-hands meeting. Congratulations: The team just developed a process! This case illustrates one of the principal purposes of processes—to manage work without needing a manager and to reduce the cost and increase the value of outcomes of repetitive tasks. Good processes supplement management, augment its reach, and ensure consistency of quality outcomes.

Back to our example of an engineer checking in code to the source code repository: What would happen if the engineer did not have a process to reference, could not find her manager, and did not have a process established for dealing with procedural uncertainties? Today, she decides that she should check her bug fix into the maintenance branch. This seems logical because the branch is called “maintenance” and the bug fix is maintaining the application. A few days later (long enough for the engineer to forget about her decision regarding where to check in the bug fix), she has been assigned another bug to fix. She quickly identifies the problem and makes the correction. Now ready to check in the new fix, she has the same question: Which branch? She again looks for her manager and cannot find him; he must have a very clever hiding spot in which to watch his favorite game show. The engineer also cannot seem to remember what she did last time in this situation. She does remember hearing that code is being promoted from the main branch tonight, and it seems logical that the product team and her boss would want this fix in as soon as possible. Consequently, she checks in her bug fix to the main branch and proceeds with her new feature development.

See the problem? Yes, without a clear process, there is room for everyone to make their own decisions about how to accomplish certain tasks. When everyone makes their own decisions, we get variability in results and outcomes. Variability in results and outcomes means, by definition, that we aren’t avoiding past mistakes and repeating past successes. Thus, key reasons that we create and maintain processes are to standardize how we perform tasks and to standardize what we do in the event of procedural uncertainty.
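The written branching policy from this example can also be captured in a few lines of code, so that the answer to "which branch?" no longer depends on memory or on finding a manager. What follows is only a minimal sketch under stated assumptions: the branch names come from the example above, while the names ChangeType, BRANCH_POLICY, target_branch, and check_commit are hypothetical and not part of any particular source control tool.

```python
from enum import Enum


class ChangeType(Enum):
    """Kinds of change named in the example policy."""
    BUG_FIX = "bug_fix"
    NEW_FEATURE = "new_feature"


# The team's decision from the example: bug fixes go to the maintenance
# branch, new features go to the main branch. Names are illustrative.
BRANCH_POLICY = {
    ChangeType.BUG_FIX: "maintenance",
    ChangeType.NEW_FEATURE: "main",
}


def target_branch(change_type: ChangeType) -> str:
    """Return the branch a change should be checked in to under the policy."""
    return BRANCH_POLICY[change_type]


def check_commit(change_type: ChangeType, branch: str) -> bool:
    """Return True when a check-in follows the policy, False otherwise."""
    return branch == target_branch(change_type)
```

With something like this in place, the engineer's second bug fix would be flagged when it was aimed at the main branch, instead of quietly producing a different outcome from the first.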

In our consulting practice, we are often faced with teams that mistakenly believe that the establishment of processes will stifle creativity. Similarly, process change can be one of the more difficult changes for organizations to embrace. For example, a change control process initially might be seen as a way to slow down and control when things get released. The reality is quite different: Well-placed processes foster creativity among team members. There is only so much time in each workday and only so many tasks that your engineers can concentrate on. Using the change control example, the results of the process should help the engineer release things safely more often and spend less time involved with an incident caused by an uncontrolled change. Equally important, people tend to have only a limited amount of creativity within them before they must “recharge their batteries.” If an engineer must spend precious time and some amount of creative thought on menial tasks, we lose that time and creative power, which could have otherwise been spent on the really important tasks, like designing a new user interface. A well-structured environment of processes can eliminate distractions and leave the engineer with more time and energy to focus on being creative.

Having discussed the purpose of processes, let’s turn our attention to determining how much process we should implement. Although processes do help improve organizational throughput and standardize repetitive or unclear tasks, too many processes can stall organizations and increase costs.

Right Time, Right Process

Organizations are composed of people with varying experiences, backgrounds, and relationships. Just as no two people are exactly the same, so, too, no two organizations are identical. Organizations are also in a constant state of flux. Just as time changes people, so, too, does time change an organization. An organization may learn (or fail to learn) from past mistakes or learn to repeat certain successes. Members of the organization may increase their skills in a certain area and change their relationships outside of the organization. People join and leave the organization.

If all organizations are different and all organizations are in a constant state of flux, what does this mean for an organization’s processes? Each and every process must be evaluated first for general fit within the organization in terms of its rigor or repeatability. Small organizations may need fewer processes, because communication is easier between team members. As organizations grow, however, they may need processes with a larger number of steps and greater rigor to ensure higher levels of repeatability between teams. As an example, when you first founded your company and the staff consisted of you and one other engineer, with very few customers for your products and services, the crisis management process would have simply been that you get out of bed in the middle of the night and reboot the server. If you missed the alert, you would reboot it in the morning because there were likely no customers wanting to use your service in the middle of the night. Using that same process when your team consists of 50 engineers and you have thousands of customers would result in customer angst and lost revenue. In such a scenario, you need a process that spells out to everyone the necessary steps to take when a significant incident arises, and that process needs to be consistently repeatable.
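One way to make an incident process "consistently repeatable" is to write its steps down in a form that can be checked. The sketch below is an illustration only: it borrows the four steps mentioned earlier in this chapter (respond, communicate, debug, restore), and the decision to enforce them in strict order, along with the names INCIDENT_STEPS and Incident, are our assumptions rather than a prescribed design.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Step names follow the chapter's wording; the strict ordering is an assumption.
INCIDENT_STEPS = ("respond", "communicate", "debug", "restore")


@dataclass
class Incident:
    """Tracks which steps of the incident process have been completed."""
    summary: str
    completed: List[str] = field(default_factory=list)

    def next_step(self) -> Optional[str]:
        """Return the next step to perform, or None when all steps are done."""
        if len(self.completed) < len(INCIDENT_STEPS):
            return INCIDENT_STEPS[len(self.completed)]
        return None

    def complete(self, step: str) -> None:
        """Record a step, rejecting anything performed out of order."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"expected step {expected!r}, got {step!r}")
        self.completed.append(step)
```

A two-person startup would likely find even this much structure excessive; a 50-engineer team might need considerably more, such as escalation paths and named roles.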

A Process Maturity Framework

As a guideline for discussing the rigor or repeatability of a process, we like to refer to the capability and maturity levels from the Capability Maturity Model Integration (CMMI) framework. This section is in no way a full or complete explanation of the CMMI framework, and we are in no way suggesting that you must implement or adopt the CMMI. Instead, we introduce this framework as a way to standardize terminology for process improvement and repeatability. The CMMI levels are an excellent way to express how processes can exist in a number of states, ranging from ill defined to using quantitative information to make improvements. These extreme states are marked in Figure 7.2 with the O and the X points along the gradient depicting the capability and maturity levels. For more information on the CMMI, visit the Software Engineering Institute’s site at http://www.sei.cmu.edu/.

Figure 7.2 Extremes of Possible Process Levels

As introduced in the CMMI sidebar, the levels are used to describe an evolutionary path of process improvement described as either a “capability level” for organizations utilizing continuous improvement or a “maturity level” for those using a staged representation. Although it may be ideal to have all level 5 processes in your business, it is unlikely, especially in a startup, that you will have enough resources to focus on establishing, managing, documenting, measuring, and improving processes to accomplish this goal. In fact, such a focus is likely to be at odds with the real value creation in a startup—getting products to market quickly and “discovering” the right product to build. Next, we offer some guidelines to help determine if your processes are meeting your needs.

When to Implement Processes

Recall that processes augment the management of teams and employees; therefore, a sure sign that a process could be effective is if you or your managers are constantly managing people through similar tasks time and time again. Constantly managing people through repetitive tasks is a sign that you can gain from defining processes around these tasks.

Another sign of the need for process improvement is if a common task is performed differently (with different outcomes) by each person on the team. Perhaps one engineer checks bug fixes into the main branch, while another checks them into the maintenance branch. Other engineers may not bother checking in at all but instead build packages on their local machines. This kind of variance in approach inevitably means variance in outcomes and results. In these cases, developing processes can help us standardize approaches for the purposes of controlling, measuring, and improving on our desired outcomes.

Employees who are overly burdened by mundane tasks represent a third indication that increased process rigor is required. These distractions take away from employees’ time, energy, and creativity. Early signs of issues here might include an increased number of engineers complaining about where they spend their time. Engineers generally are not the type of people to keep quiet if they feel hindered in performing their jobs or have a better idea of how to accomplish something.

Process Complexity

Choosing the right level of process complexity is not a one-time decision made once and for all; rather, it is a matter of choosing the right level of complexity for today.

We have two suggestions for ways to identify the right amount of process complexity. Before we explore these two methods, let’s consider an example of the differences in complexity of processes. Figure 7.3 depicts another gradient, illustrating complexity from simple to complex, with two examples of a process for incident management. The very simple three-step process on the left is most applicable to a small startup with just a couple of engineers. The process on the right is much more complex and is more applicable to a larger organization that has a staffed operations team. As indicated by the gradient, a large variety of levels of complexity are possible for any given process.

Figure 7.3 Process Complexity

Armed with the understanding that there can be many different variations on the same process, we now explore some methods for determining which of these options are right for our organization. We have two suggested methods of determining the process level, which have worked well for us in the past and can be used in combination or separately:

• The first method is to start with the smallest amount of process complexity and periodically move, one iteration at a time, toward more complex and sophisticated processes. The advantage of this approach is that there is very little chance of overwhelming the team with the new process because it is likely to be much simpler than what is required or what they can tolerate (recall that culture influences how much process should exist on a team). The disadvantages of this approach are that it takes time to narrow in on the optimal amount of process, you must remember to revisit the process periodically, and you must frequently change the process that people are used to. If these disadvantages are too much, you might consider using our second method.

• The second method of narrowing in on the optimal process for your organization is to let the team decide for itself. This approach can either be democratic, where everyone gets a voice, or representative, where a chosen few speak for the group. Either way, this approach will get you closer to the optimal amount of process much more quickly than the preceding small-to-large approach. It also has the advantage that it makes the team members feel a sense of ownership, which makes the adoption of the process much easier.

You can choose to implement one method or the other to find your optimal process, or you can mix them together. In a mixed method, the team might decide on the process, but then you step it down slightly to ensure the adoption occurs even more quickly. If the team members feel they need a very strict branching process, you could suggest that initially they ease up a bit and allow for some flexibility on the naming convention and timing of pulling branches until everyone is familiar with the process. After the process is fully established, with a release or two under your belt, you can modify the process and adopt the original suggestions for naming conventions and timing.

When Good Processes Go Bad

So far, we have discussed only the noble attributes of processes. As much as we would like to believe that there is no downside to developing and implementing processes, the reality is that processes can cause issues themselves. Similar to how a poorly designed monitoring system can cause downtime on a production site by overloading servers and services, processes can create problems within the organization when their complexity and level of rigor are not carefully considered. These challenges are generally not the fault of the processes themselves, or even due to having a process; rather, they are due to the fit between the process and the team. You often see this sort of mismatch with technology, designs, and architectures. Technologies and approaches are rarely “always right” or “always wrong”: flat network (sure, we can find a use for that), stateful apps (yes, even that can have a purpose), singletons (yes, they have a place in our world). Use these in the wrong place, however, and you’re sure to have problems.

One of the biggest problems with a badly fitted process is culture clash. When a Wild West culture meets a very complex 100-step process, sparks are sure to fly. The result is that the teams will either ignore the process, in which case it actually causes more problems than it mitigates, or will spend a lot of time complaining about the process. Both of these results are probably worse than not having the process at all. If you witness this kind of behavior on your teams, you must act quickly: The process is not only hurting the team in the short term, but is likely exacerbating the resistance to process or change, which will make implementing any process in the future more difficult.

Another big problem associated with poor fit between the organization and the process is the dreaded “b” word: bureaucracy. This term is defined by the Merriam-Webster Online Dictionary as “a system of administration marked by officialism, red tape, and proliferation.” The last thing we want to do with any process is create red tape or officialism. The result of bureaucracy, as you might expect, is lowered productivity and poorer morale. As we mentioned earlier, engineers love challenges and thrive on being asked to do difficult things. When they are so hindered that they are unable to succeed, it is easy for engineers to become demoralized. This is why engineers are typically so ready to speak out about things that diminish their ability to perform their jobs effectively. The challenge for you as a manager and leader is to decide when the complaining is just a matter of not liking change or an engineer being a curmudgeon versus when a real problem needs to be addressed. The best way to distinguish these cases is to know the team and how it generally reacts.

To prevent culture clash and expansion of bureaucracy, or in the event that you already have them, there are three key areas to focus on. First, listen to your teams. When you have learned the nuances of each team member’s personality, including those of your managers if you have a multilayered organization, then you will be able to tell when something is really bothering them versus when something is just a mild disturbance.

Second, implement processes using one of the two methods described earlier in this chapter. Either move from small to large on the process continuum, or let the team decide the right amount of process to establish. One or both of these methods, if you elect to use them in conjunction with each other, should result in a good fit between the team and their processes.

Third, perform periodic maintenance on your processes. This should happen regardless of where you fall on a process maturity scale such as CMMI. As we have repeatedly stated, there is no right or wrong process, just right processes for the right organization at the right time. Also, all organizations change over time. As an organization changes, its processes must be reevaluated to ensure they are still the optimal choices. To keep processes up-to-date, it often helps to assign owners to a process. When owners are established and well known, there is no lack of clarity in determining who can help make improvements. Finally, when examining the efficient use of process, you should always consider ways that you can automate a process. Performing periodic maintenance on the process is critical to ensure it does not turn into a source of culture clash or start to become a bureaucratic nightmare.
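Assigning owners and revisiting processes on a schedule is itself a small process that lends itself to automation. The following is a minimal sketch, not a recommended tool: the Process record, the overdue_for_review function, and the 365-day interval are all hypothetical choices made for illustration, and a real organization would pick its own interval and triggers.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Assumed review interval; choose whatever cadence fits your organization.
REVIEW_INTERVAL = timedelta(days=365)


@dataclass
class Process:
    """A process with a named owner and the date it was last reviewed."""
    name: str
    owner: str
    last_reviewed: date


def overdue_for_review(processes: List[Process], today: date) -> List[Process]:
    """Return the processes whose last review is older than REVIEW_INTERVAL."""
    return [p for p in processes if today - p.last_reviewed > REVIEW_INTERVAL]
```

Because each record carries an owner, a report built from this list tells you not only which processes are stale but also whom to ask about improving them.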

Conclusion

Processes serve three general purposes: They augment the management of teams and employees, they standardize employees’ actions while performing repetitive tasks, and they free employees up from daily mundane decisions to concentrate on grander, more creative ideas. Without processes such as crisis management or capacity planning, and without processes that fit our teams well in terms of complexity and repeatability, we cannot scale our systems effectively.

Many variations in terms of complexity and process maturity exist. Likewise, organizations differ from one another, and they differ from themselves over time because they change as people are hired or leave or mature or learn. The real challenge is fitting the right amount of the right process to the organization at the right time. To achieve this goal, you might start off with a very low amount of process and then slowly increase the granularity and establish stricter definitions around the process. This manner of wading into the process can be effective for easing one’s way into the world of processes. Alternatively, you might let the team decide what the right process is for a given task. Either assign one person to figure this out or ask the entire team to sit in a room for a couple of hours to make the decision.

Among the problems that can arise with ill-fitting processes are culture clashes and bureaucracy. In this chapter, we pointed out some warning signs of these problems and some corrective actions to take to resolve them. Periodic maintenance of your processes will also go a long way toward circumventing these problems. Reviewing the fit for each process on an annual basis or as the organization undergoes a significant change such as a large number of new hires will help ensure you have the right process for the organization at the right time.

The rest of Part II deals with the details of specific processes that we feel are very important for scalability. For each process, you should remember the lessons learned in this chapter and think about how they would affect the way to introduce and implement the process.

Key Points

• Processes, such as application design or problem resolution, are a critical part of scaling an application.

• Processes assist in management tasks and standardization, and free employees up to focus on more creative endeavors.

• A multitude of process variations exist to choose from for almost any given process.

• Deciding to implement any process at all is the first step. After that decision has been made, the next step is to identify the optimal amount of process to implement.

• Two methods for determining the optimal amount of process are (1) migrating from small to large through periodic changes or (2) letting the team decide on the right amount.

• A bad fit between a process and an organization can result in culture clashes or bureaucracy.

• To avoid problems between processes and organizations, you might let the team determine the right amount of process or, alternatively, start slowly and ramp up over time.

• Whenever a process is established, you should assign a person or team as owner of that process.

• Maintenance of processes is critical to ensure organizations do not outgrow their processes.
