Chapter 7. Automation Strategies

So far in this book, you have learned a lot about automation, from types of automation, to use cases, to tools to help you fulfill the use cases. You have learned about data and what you can accomplish by using it. You have learned about using Ansible as an automation tool and implementing playbooks. You have even learned about using NetDevOps pipelines to automate the delivery of end-to-end activities in a network.

This chapter covers strategy, which is notoriously missing in many companies today but is key for success. This chapter looks at what an automation strategy is, what it consists of, and how it relates to other types of company strategies. As you read this chapter, you will begin to understand the importance of having an automation strategy to guide you on your journey.

This chapter ends by summarizing an approach you can use to build your own automation strategy with a focus on network automation.

What an Automation Strategy Is

Your company most likely has an organizational strategy, a financial strategy, and a go-to-market strategy, but not an automation strategy. Why is that? Well, although automation itself is not new, many companies don’t know a lot about it. Many companies have been forced by the market into adopting automation without planning it. There are many ways to define what a strategy is, but the basic idea is that a strategy reflects where you are, where you want to be, and how to get there.

An automation strategy is supported by five pillars, as illustrated in Figure 7-1:

Figure 7-1 Automation Strategy Pillars

• Business alignment: An automation strategy should be in line with an organization’s overall business strategy. The business strategy reflects the mission of the company, and therefore all other strategies should align to it. Automation should always work as an enabler to the overall business.

• Technology adoption: An automation strategy should reflect what technologies are in use and what technologies the company will be adopting in the future. Having this information clearly stated helps avoid sprawl in tools and technologies. Furthermore, it helps facilitate automation—for example, by enabling you to choose technologies with good APIs.

• Operating model: An automation strategy should reflect the operating model (owners and maintainers) for the automation tools and projects. Setting clear responsibilities is paramount to keep things under control and maintained; this also helps control project sprawl.

• Roadmap: An automation strategy should document what is ahead, including tools that will be adopted, upcoming new projects, and/or decommissions. It is not an official roadmap document, but roadmap perspectives are reflected in a strategy.

• Skills: From the automation strategy document, it should be clear what skills are needed for the future. Does the organization already have those skills, or do they need to be built/hired? Skills are tightly correlated to culture, and automation commonly requires a cultural shift.

You can produce a strategy document that reflects these five pillars by defining four concepts:

• Goals: High-level purposes

• Objectives: Specific, measurable outcomes that define the goal (that is, outcomes linked to KPIs)

• Strategies: How to achieve the goals from a high level

• Tactics: Specific low-level strategy actions to follow

These concepts can be illustrated as an inverted triangle, as shown in Figure 7-2, that goes from high-level goals all the way down to tactics.

Figure 7-2 Goals, Objectives, Strategies, and Tactics

The goals are broad statements that typically reflect desired destinations. They are often the “what” that you are trying to achieve. Although a goal reflects an end state, it is not actionable; that is the role of objectives. Some examples of goals are being the quickest at deploying in the market segment, having zero-emissions data centers, and having a closed-loop automated business.

Objectives are actionable, specific, and measurable. Furthermore, they should be realistic and achievable. They define what needs to be done to achieve the previously defined goals; therefore, objectives do not exist in isolation but are always linked to goals. The following are some examples of objectives linked to the goal of having zero-emissions data centers: decrease dependence on fossil fuels by 10%, increase use of green energy sources by 20%, reduce data centers’ resource sprawl by 15%.

Strategies are high-level plans to follow to achieve goals. They specify, from a high level, how you are going to fulfill your intentions. Examples of strategies linked to the objective of reducing a data center’s resource sprawl could be monitor server utilization and use a cloud elastic model.

Finally, tactics are specific low-level methods of implementing strategies. Examples of tactics linked to the strategy of monitoring server utilization could be deploying a new monitoring system using Grafana and creating new metrics of device utilization such as CPU utilization.
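The hierarchy from goals down to tactics can be sketched as a small data model. The following is a minimal illustration, not a prescribed format for a strategy document: the class names, the `kpi` field, and the `outline` helper are assumptions made for this sketch, and the entries reuse the zero-emissions data center example from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Tactic:
    description: str


@dataclass
class Strategy:
    description: str
    tactics: list[Tactic] = field(default_factory=list)


@dataclass
class Objective:
    description: str
    kpi: str  # hypothetical label for the measurement that tracks this objective
    strategies: list[Strategy] = field(default_factory=list)


@dataclass
class Goal:
    description: str
    objectives: list[Objective] = field(default_factory=list)


def outline(goal: Goal) -> list[str]:
    """Flatten the hierarchy into indented lines, from the goal down to tactics."""
    lines = [f"Goal: {goal.description}"]
    for objective in goal.objectives:
        lines.append(f"  Objective: {objective.description} (KPI: {objective.kpi})")
        for strategy in objective.strategies:
            lines.append(f"    Strategy: {strategy.description}")
            for tactic in strategy.tactics:
                lines.append(f"      Tactic: {tactic.description}")
    return lines


zero_emissions = Goal(
    description="Have zero-emissions data centers",
    objectives=[
        Objective(
            description="Reduce data centers' resource sprawl by 15%",
            kpi="resource sprawl",
            strategies=[
                Strategy(
                    description="Monitor server utilization",
                    tactics=[
                        Tactic("Deploy a new monitoring system using Grafana"),
                        Tactic("Create device utilization metrics such as CPU utilization"),
                    ],
                ),
                Strategy(description="Use a cloud elastic model"),
            ],
        )
    ],
)

for line in outline(zero_emissions):
    print(line)
```

Nesting each level inside its parent mirrors the point that objectives do not exist in isolation: in this model an objective cannot be recorded without the goal it belongs to.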

Assessment

Before you can create a strategy document, you need to assess where you are now. An assessment focuses on identifying the current state of affairs, and it can be a lengthy process. During the assessment, you should aim to identify which technologies you are already using and to what extent. It is good to assign a maturity level on a fixed scale (for example, from 0 to 10) so you can effectively compare the different tools you might be using.

Note

During an assessment, you are likely to discover all kinds of technologies and tools that you were not aware existed in the company. This is normal, and you shouldn’t let it discourage you. It is also common to discover a great variation in tool use across different departments in the same company.

An assessment should also consider the current industry landscape—not just your company’s. This is especially important for the next phase, which involves determining where you want to be. (Later in this chapter, you will see an example of the process of building a strategy document.)

When the assessment is finished, you should know where you are. It is then time to define where you want to be. When defining where you want to be, there are many aspects to take into consideration, including these:

• The costs and benefits: You might wish for an end-to-end automated network environment, but that might be financially out of reach if you currently have no automation.

• Skills available: Do you have the skills in-house to achieve your goals? Will you need to train your people? Will you need to hire other people?

• Prioritization: You might come up with tens of destinations, but out of all those, which are the best for you? A prioritization exercise is critical when defining a future state. You might want to save the destinations you discard for strategy updates in the future.

• Possible risks: If there are identified risks involved, you should take them into account and document them. Risks can play a major role in prioritization.

With these aspects in mind, you define goals, then objectives, then strategies in terms of automation and document what you come up with in the strategy document. Automation does not mean only network automation; the automation strategy document should have a broader scope. (However, because this is a book on network automation, we focus solely on network automation in this chapter.)

Defining tactics—which tell you how to get where you’re going—is often the most technical part of this exercise. You do not need to define when tactics will be achieved or the detailed steps to achieve them. You will do those things as part of an action plan, which is defined at a later date, as a byproduct of the strategy. (You will learn about action plans later in this chapter.)

KPIs

The four concepts just described—goals, objectives, strategies, and tactics—are the minimum components of an automation strategy. However, other concepts are commonly included as well. An important concept to add to an automation strategy is key performance indicators (KPIs). Sometimes companies list KPIs in a separate document and mention that document in the automation strategy. However, including them directly in the automation strategy is helpful because it prevents people from getting lost between different documents.

A KPI is a quantifiable value that can demonstrate how well or how badly you are doing at some activity. KPIs are key to tracking your objectives; therefore, even if you do not add them to your automation strategy document, your objectives should somehow link to the company’s global KPIs.

There are high- and low-level KPIs. High-level KPIs typically focus on the overall business performance and result from the work of multiple departments and individuals. Low-level KPIs focus on more specific processes and are influenced by specific individuals.

An example of a high-level KPI could be a measure of the revenue growth rate of a company, which is influenced by the performance of all the departments. A low-level KPI could be a measure of services deployed by a team under the operations department, solely influenced by those team members (which means it is more traceable and actionable). Creating a strategy for network automation typically means defining low-level KPIs.

The most common mistake with KPIs is blind adoption. Companies and departments often copy KPIs from other companies or departments without thinking about whether they accurately reflect their business or operations. This often results in failure to positively impact results and leads to abandoning the KPIs because the resources put into measuring are effectively wasted. It is paramount to have KPIs that resonate with your company’s objectives; this is the only way they can be effective.

Another common mistake is the way KPIs are communicated. KPIs should be easily accessible, such as on a dashboard, and communicated often so individuals can understand where the company is and where it wants to go. Often KPIs are hidden in long documents reported at the end of the fiscal year or fiscal quarter, and people are not aware of the KPIs and therefore do not actively work toward them. Finding a good communication mechanism is key.

A third mistake is adopting KPIs that are hard to measure, not actionable, too vague, or subjective, which can lead to discussions and conflicts. The following are some examples of KPIs with these characteristics:

• Make the scripts tidier

• Use fewer automation tools

• Improve coding practices

A fourth mistake, which is a very dangerous one, is adopting KPIs that are easy to manipulate. With such KPIs, people who want to be recognized may abuse the KPIs to elevate their performance without actually having a positive impact on the business. For example, if a KPI measures the number of software releases, a team could start making very small releases (for example, for simple bug fixes) just to influence the KPI instead of following the normal life cycle of releasing a version only after some number of fixes. In a case like this, although the KPI might make it look like the business has improved, it actually hasn’t. The problem is even worse if you link incentives to KPIs because the incentives might motivate individuals to cheat. KPIs exist to measure where you are in relation to where you want to be; they are navigation tools. Linking incentives to them transforms KPIs into targets.

Yet another mistake is evaluating KPIs with tainted data. For example, when you are measuring software compliance and you only take into consideration the software version of the network equipment in the HQ and neglect branches because they are managed by some other team, you are using tainted data. You should make sure you have high-quality data.
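To see how much tainted data can distort a KPI, consider a small sketch of the software compliance example. The inventory, device names, version strings, and the `compliance_rate` function are all hypothetical and exist only to illustrate the arithmetic.

```python
# Hypothetical inventory rows: (site, device, running software version).
INVENTORY = [
    ("hq", "hq-sw-01", "17.9.4"),
    ("hq", "hq-sw-02", "17.9.4"),
    ("branch", "br-sw-01", "16.12.1"),
    ("branch", "br-sw-02", "17.9.4"),
    ("branch", "br-sw-03", "16.12.1"),
]
TARGET_VERSION = "17.9.4"  # assumed compliance target for this sketch


def compliance_rate(inventory, target, sites=None):
    """Return the fraction of in-scope devices that run the target version.

    If sites is None, every device is in scope; otherwise only devices
    whose site is in the given set are counted.
    """
    devices = [row for row in inventory if sites is None or row[0] in sites]
    if not devices:
        raise ValueError("no devices in scope")
    compliant = sum(1 for _, _, version in devices if version == target)
    return compliant / len(devices)


hq_only = compliance_rate(INVENTORY, TARGET_VERSION, sites={"hq"})
all_sites = compliance_rate(INVENTORY, TARGET_VERSION)
print(f"HQ only: {hq_only:.0%}, all sites: {all_sites:.0%}")
```

With this made-up data, measuring only the HQ reports full compliance, while the network as a whole is at 60%. The KPI value depends as much on which devices are in scope as on the state of the devices themselves.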

Another common mistake is measuring everything or measuring everything that is easy to measure. KPIs require data, and it is easiest to measure something you already have data for. KPIs should measure only what is necessary. Gathering and storing data can be expensive, and KPIs should be relevant.

Now you understand the common pitfalls, but how do you create good KPIs? We focus here on network automation, but the same principles discussed here apply to other types of KPIs as well.

The secret sauce is in the name: key. KPIs should exist only in relationship to specific business outcomes and critical or core business objectives.

There are many approaches to defining a KPI, but I find that answering the following seven questions is often the best way:

1. What is your desired outcome? (This is the objective you are starting with.)

2. Why does this outcome matter for the company/department?

3. How can you measure this outcome in an accurate and meaningful way?

4. How can people influence this outcome?

5. Can you tell if you have achieved the desired outcome? If so, how?

6. How often will you review the progress made?

7. Who is ultimately responsible for this outcome?

Let’s say you are a service provider, and your objective is to decrease your average time to deploy new network functions, such as a new firewall type or a new SD-WAN router. The following are possible answers to the seven questions:

1. Decrease by 10% the time it takes to deploy a new version of a network function.

2. Decreasing the deployment time will allow us to acquire customers with stricter SLAs and therefore increase business profitability.

3. We can use scripts to measure the time it takes for a new deployment to happen.

4. Adoption of network automation techniques can impact the time required as currently we are doing it manually.

5. We can compare the previous time of deployment to the current to determine whether we have achieved the objective.

6. Progress shall be reviewed on a quarterly basis, so we can make adjustments in regard to found challenges/lessons learned.

7. The network operations team should be responsible for this KPI.
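Answers 3 and 5 can be sketched in a few lines of code. This is a minimal illustration that assumes you already record the duration of each deployment in minutes; the sample durations and the function names are hypothetical.

```python
from statistics import mean


def percent_decrease(baseline_minutes, current_minutes):
    """Percentage by which the mean deployment time dropped versus the baseline."""
    baseline = mean(baseline_minutes)
    current = mean(current_minutes)
    return (baseline - current) / baseline * 100


def objective_met(baseline_minutes, current_minutes, target_percent=10.0):
    """True if the mean deployment time decreased by at least target_percent."""
    return percent_decrease(baseline_minutes, current_minutes) >= target_percent


# Hypothetical deployment durations, in minutes.
manual = [120, 135, 110, 115]     # mean: 120 minutes
automated = [100, 108, 104, 104]  # mean: 104 minutes

print(f"Decrease: {percent_decrease(manual, automated):.1f}%")
print(f"Objective met: {objective_met(manual, automated)}")
```

Comparing means is only one possible choice. If a few unusually slow deployments dominate your data, a median or a percentile may describe typical deployment time better.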

By providing answers to the seven questions, you can define a KPI. In this case, the KPI is related to the time it takes to deploy a new network function. You should evaluate whether a newly defined KPI makes business sense, and you can do so by answering five questions, commonly known as the SMART (specific, measurable, attainable, relevant, time-bound) criteria:

1. Is your objective specific?

2. Is your objective measurable?

3. Is your objective attainable?

4. Is your objective relevant?

5. Does your objective have a time range?

In our previous example, your objective is specific: decrease by 10% the time for new deployments. It is also clearly measurable, as it is possible to measure the time, in minutes, it takes to deploy a new network function.

Attainability can be difficult to judge because you need to factor in history and context. If you were doing everything in this example manually (for example, pushing router configurations), a 10% decrease would be attainable by switching to automated scripts.

The objective in this example—acquiring new customers—is relevant. You would need to factor in the cost of automation and the benefit that the new customers bring.

Finally, you have not set a time range. You simply set a quarterly review of the KPI, and it would be good to limit your objectives in time to something that is achievable and realistic, such as 6 months for the 10% decrease.

In summary, you should always align KPIs to business objectives. This correlation should always exist. You shouldn’t have objectives without KPIs or KPIs without objectives. You may have several KPIs for a single objective, however.

Summary

To summarize, an automation strategy is a document that includes goals, objectives, strategies, and tactics, along with sections linking to other relevant strategy documents, such as the data strategy. It is a living document that can and should be iterated upon when changes are required. Iterations can happen for many reasons, including the following:

• Industry events

• Other company strategy changes

• Failure to meet KPIs

Furthermore, an automation strategy is a document framed in a time range. It should be objective, concise, and easy to understand by all stakeholders.

By having an automation strategy document, you and your company have a path to follow. Having a clear destination will prevent you from wandering around aimlessly.

It is very important that an automation strategy document be well understood by all stakeholders, from higher management to operators and everyone in between. If everyone shares a common view, they can work toward the same destination.

Note

Try to keep your automation strategy document short and relevant. Most stakeholders will not read and cannot understand 50-page documents.

Why You Need an Automation Strategy

We have seen many companies adopt automation tools and processes only to decommission them in a month’s time. Other companies have tens of similar yet different tools, multiplying the efforts and investments of their workforce. Why? Because the tools were adopted as short-term solutions to various problems rather than with a vision or goal.

Having a strategy is similar to having a plan. With a strategy, everyone can read and understand what you are trying to achieve and how. A strategy allows you to do the following:

• Remove ambiguity: There is no ambiguity if you have a well-defined strategy. When they know that the actual goal is to provide the best user experience, Team A will no longer think that the goal is to make the service run faster by reducing computational complexity, and Team B will no longer think that the goal is to reduce latency.

• Communicate: A strategy communicates to everyone the intentions of the company in that area—and this is why it is important that the strategy be a document that is well understood by all stakeholders. It is not a document only for higher management; it is a document for everyone.

• Simplify: Your company can simplify business operations when everyone knows what you want to achieve, using which tools, and during what time. You prevent teams from creating their own solutions to what they think is the goal. However, creative thinking is still important. The strategy functions as a guardrail to keep projects on track.

• Measure: A strategy allows you to understand if something is working. If the metric was to reduce your deployment times by 10%, did you achieve it? Measurement makes KPIs valuable.

In summary, you need an automation strategy because you want to achieve something, and achieving a goal is easier with a plan.

How to Build Your Own Automation Strategy

Now that you are convinced you need an automation strategy, how do you start? Well, you need to start by assessing where you are so that you can build your goals, objectives, strategies, and tactics to define where you want to be.

First, gather the key stakeholders in your company. A strategy is not defined by one individual but rather by a diverse group, which may consist of technical and nontechnical folks. With this type of group, decisions and assumptions can be challenged, resulting in better and common goals. In a smaller organization, the group might consist of only two people. Nonetheless, two is better than one.

This section is focused on the network automation part of your automation strategy. Your document should also encompass other automation efforts, such as business process automation.

Assessment

An assessment of the current state (where you are) is the first activity to take place. In this section, we elaborate on how you can do this yourself. Typically, an assessment involves the following stages:

Stage 1. Create/update an inventory of automation tools/scripts.

Stage 2. Create/update a list that identifies needs/shortcomings.

Stage 3. Create/update a list that identifies the current company automation technology stack and industry alignment.

Stage 4. Create questionnaires.

Each of these stages can be longer or shorter depending on several factors, such as the following:

• The state of documentation, such as whether it is up to date and accurately describes the environment

• The size of the organization

• Whether this is a first assessment or whether an assessment has already taken place

• The availability of key stakeholders

Not all companies follow the same stages in creating an assessment, but the ones outlined here typically work well. To start, consider whether you already have automation in place. If the answer is no, you can skip the stage of creating an inventory as you won’t have anything to add there. Industry alignment is still relevant, as it helps you understand if your industry uses automation when you do not. Identifying needs and shortcomings is always relevant, as there are no perfect solutions.

If you already have automation in place, you need to understand what the automation is.

The first stage, which focuses on creating/updating an inventory for network automation, is the most challenging stage.

You should start by identifying the individuals/teams in the company that are using automation for networking ends. When I target teams, I typically ask for a single person to be responsible for the questionnaire, in order to reduce duplicates.

When you know who is using automation for networking, you have several options: ask for documentation, conduct interviews, or send questionnaires. I find that using questionnaires works best, as you can request all the information you want—even information that might not be reflected in the documentation. In addition, working with questionnaires is less time-consuming than conducting interviews. Craft a questionnaire that captures the tools and data people are using, to what end they are using those tools and that data, how often they are using them, and any challenges they are facing. You should tailor this questionnaire to your specific needs. The following are some examples of questions you might ask:

1. Are you using any automation/configuration management tools? If so, which ones?

2. Do you have any data dependencies? If so, what are they?

3. To what purpose do you use the previously mentioned automation/configuration tools?

4. To what extent are you using those tools?

5. Who owns those tools?

6. How long have you been using those tools?

7. Where are those tools hosted?

8. Do you face any challenges with those tools that you might be able to overcome with a different toolset?

9. How many people in the IT organization are familiar with automation?

The following are examples of answers I typically see:

1. Yes. We are using Ansible, Git, and Grafana.

2. We constantly gather metric data from our network devices. We use it to power our Grafana dashboards. We also retrieve the network devices’ configurations every week.

3. We use Ansible for device configuration, Git to store configuration files, and Grafana for metric visualization.

4. We use Git to store all our devices’ configurations, but we do not make all changes using Ansible. Many of our change windows are still done manually.

5. We have an operations department that takes care of the maintenance of our tooling—that is, upgrading and patching.

6. We have fairly recently adopted Ansible and Git; we have been using them for about 6 months. Grafana is even newer; we’ve been using it about 3 months.

7. Because of regulatory requirements, we host all our tooling on premises.

8. We are still in the infancy of adoption, but so far the main challenge has been the lack of in-house knowledge of Grafana.

9. Not many. We are trying to grow that number. Right now, about 10%.

After crafting your questionnaire and forwarding it to the responsible individuals/teams, you will receive answers and need to compile them all into a single knowledge base.
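Compiling the answers can be as simple as indexing them by tool. The following sketch assumes the answers have already been transcribed into dictionaries; the team names, the field names, and the `build_knowledge_base` function are hypothetical and should be adapted to the questions you actually ask.

```python
from collections import defaultdict

# Hypothetical questionnaire responses, one per team.
RESPONSES = [
    {
        "team": "network-ops",
        "tools": ["Ansible", "Git", "Grafana"],
        "challenges": ["Lack of in-house Grafana knowledge"],
    },
    {"team": "datacenter", "tools": ["Ansible", "Terraform"], "challenges": []},
    {"team": "security", "tools": [], "challenges": ["No automation in place"]},
]


def build_knowledge_base(responses):
    """Index the responses by tool so the global picture is visible in one place."""
    teams_by_tool = defaultdict(list)
    for response in responses:
        for tool in response["tools"]:
            teams_by_tool[tool].append(response["team"])
    return {
        "teams_by_tool": dict(teams_by_tool),
        "teams_without_automation": [r["team"] for r in responses if not r["tools"]],
        "challenges": [c for r in responses for c in r["challenges"]],
    }


knowledge_base = build_knowledge_base(RESPONSES)
for tool, teams in sorted(knowledge_base["teams_by_tool"].items()):
    print(f"{tool}: used by {', '.join(teams)}")
```

Indexing by tool rather than by team is what turns the local pictures into a global one: it shows at a glance which tools are shared across teams and which are isolated.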

At this stage, you should already have a clear picture of what you have at a global level; this is different from the local picture that each team has. Your knowledge of the global picture will allow you to score each tool and technology used in terms of maturity level, internal skills available, and current investment. Figure 7-3 shows a visual representation of an evaluation of Ansible as a tool from a hypothetical questionnaire where 2 departments of 30 people each were actively using Ansible for their day-2 operational tasks out of the total 10 departments surveyed. The Ansible playbooks of 1 of the 2 departments consisted of complex end-to-end actions. There is some investment, but the company clearly has skills available if Ansible is the tool of choice because it could use some of those 60 people to upskill the others.

Figure 7-3 Technology Maturity Level Chart

Having a visual representation helps others understand where you are for a given technology. You can also create aggregate visualizations based on data from specific teams or even the whole organization.
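The data behind such a chart can be kept in a simple structure. The sketch below assumes a fixed 0 to 10 scale for each of the three dimensions; the class name, the specific scores, and the Grafana entry are illustrative assumptions, not values taken from Figure 7-3. Only the adoption figures (2 of 10 departments, 30 people each) come from the hypothetical questionnaire described above.

```python
from dataclasses import dataclass


@dataclass
class ToolAssessment:
    tool: str
    maturity: int    # assumed 0-10 scale
    skills: int      # assumed 0-10 scale
    investment: int  # assumed 0-10 scale

    def __post_init__(self):
        # Reject scores outside the fixed scale so tools stay comparable.
        for name in ("maturity", "skills", "investment"):
            value = getattr(self, name)
            if not 0 <= value <= 10:
                raise ValueError(f"{name} must be between 0 and 10, got {value}")


def department_adoption(active_departments, surveyed_departments):
    """Fraction of surveyed departments actively using a tool."""
    return active_departments / surveyed_departments


ansible_adoption = department_adoption(2, 10)  # 2 of the 10 departments surveyed
ansible_practitioners = 2 * 30                 # 2 departments of 30 people each

assessments = [
    ToolAssessment("Ansible", maturity=6, skills=5, investment=3),  # illustrative scores
    ToolAssessment("Grafana", maturity=3, skills=2, investment=2),  # illustrative scores
]
ranked = sorted(assessments, key=lambda a: a.maturity, reverse=True)

print(f"Ansible adoption: {ansible_adoption:.0%} of departments, "
      f"{ansible_practitioners} practitioners")
print("Most mature tool:", ranked[0].tool)
```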

In terms of network automation, I like to classify companies and departments according to four levels:

1. No automation: This classification is self-explanatory: It means doing things manually.

2. Provisioning and configuration automation: The organization is using automation tools such as Ansible or Terraform to provision, configure, and manage the network.

3. Orchestration management: The organization has orchestrated end-to-end flows (for example, it might be able to deploy a new service using an orchestration pipeline such as Jenkins). However, the organization still relies on real configuration files.

4. Service abstractions: This stage is where the network is abstracted. Few organizations are at this stage. When the organization needs to deploy something, it simply changes an abstracted representation of it, and the system maps that change to configuration in the required equipment. The organization doesn’t manage specific configuration files; instead, it manages more generic representations of those files.

To help you understand these levels, Examples 7-1, 7-2, and 7-3 illustrate the evolution between levels 2, 3, and 4 for a VLAN configuration task.

Example 7-1 shows an Ansible playbook that configures VLAN 101 by using Cisco’s ios_config Ansible module. This is tightly coupled to the syntax required to configure a VLAN on a specific switch.

Example 7-1 Deploying a New VLAN on a Network Switch by Using Ansible

$ cat vlan_deployment.yml
---
- name: Ansible Playbook to configure a VLAN
  hosts: ios

  tasks:
    - name: configure VLAN 101
      cisco.ios.ios_config:
        lines:
          - name servers
        parents: vlan 101

Example 7-2 goes a step further than Example 7-1. It is no longer a manually triggered Ansible playbook that does a single action; it is a Jenkins pipeline that executes the previous playbook and also executes a second one to verify that the changes were correctly applied. However, this example still uses the previous Ansible playbook, which requires a user to know the exact syntax of the commands to apply to configure a new VLAN.

Example 7-2 Jenkins Pipeline for VLAN Configuration

pipeline {
    agent any

    stages {
        stage("Configure") {
            steps {
                sh 'ansible-playbook -i hosts vlan_deployment.yml'
            }
        }

        stage("Test") {
            steps {
                sh 'ansible-playbook -i hosts vlan_verify.yml'
            }
        }
    }
}

Example 7-3 shows the same pipeline as Example 7-2, but this example has a data model in place, where a user needs to provide only a VLAN number and name to configure on the network device. You achieve this through templating. A more advanced example could render different templates when connecting to different operating systems or different switches; refer to Chapter 5, “Using Ansible for Network Automation,” for more on how to build such templates.

Example 7-3 Deploying a New VLAN Using an Abstracted Input

$ cat vlan_deployment.yml
---
- name: Ansible Playbook to configure a vlan
  hosts: ios

  tasks:
    - name: import vars
      include_vars: vlans.yml

    - name: render a Jinja2 template onto an IOS device
      cisco.ios.ios_config:
        src: vlan.j2

$ cat vlan.j2
{% for item in vlans %}

vlan {{ item.id }}
  name {{ item.name }}

{% endfor %}

$ cat vlans.yml
vlans:
  - id: 101
    name: servers
Based on your global knowledge base and the previous examples, you should be able to place your organization in one of these levels. Knowing your current level is relevant for the next step in the process of creating an automation strategy, which is planning goals and objectives. You should align your goals and objectives to match your current level if you still have departments or tasks in a lower level or align them to achieve the next level. Try to evolve automation in a step-by-step fashion. For example, do not try to plan for Level 4 if you are currently at Level 1. You can visualize the levels as a flight of stairs, as illustrated in Figure 7-4.

Figure 7-4 Enterprise Automation Levels
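The step-by-step rule can also be written down as a short sketch. It makes one assumption that the text leaves open: the organization's current level is taken to be the highest level any department has reached. The enum, the function name, and the department names are hypothetical.

```python
from enum import IntEnum


class AutomationLevel(IntEnum):
    NO_AUTOMATION = 1
    PROVISIONING_AND_CONFIGURATION = 2
    ORCHESTRATION_MANAGEMENT = 3
    SERVICE_ABSTRACTIONS = 4


def planning_target(department_levels):
    """Suggest the level to plan for, one step at a time.

    If some departments lag behind, align everyone to the current
    (highest) level first; otherwise aim for the next level up.
    """
    lowest = min(department_levels.values())
    highest = max(department_levels.values())
    if lowest < highest:
        return highest
    return AutomationLevel(min(highest + 1, AutomationLevel.SERVICE_ABSTRACTIONS))


levels = {
    "network-ops": AutomationLevel.PROVISIONING_AND_CONFIGURATION,
    "datacenter": AutomationLevel.NO_AUTOMATION,
}
print(planning_target(levels).name)
```

In this example the suggestion is to bring the lagging department up to level 2 before planning for orchestration, which matches the flight-of-stairs picture.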

The last stage of the assessment effort is to compare your newly crafted knowledge base to the industry state. For this, you should research what other companies like yours are doing. Are they using Terraform to spin up infrastructure, or are they manually creating their virtual machines? This is especially important because you do not want to be behind the curve.

By the end of this assessment effort, you should have answers to all these questions:

• Do you have automation in place?

• What tools do you already have in place?

• Is your workforce trained or knowledgeable in automation?

• Is there a need for automation?

• Are you ahead or behind the industry in terms of automation?

• Is there support for automation from stakeholders?

Knowing where you stand is paramount to developing a good automation strategy. Do not underestimate the assessment phase.

Culture and Skills

Although there is not a section of the automation strategy document for culture and skills, organizational culture and employees’ skills are major factors that influence your strategy planning and possibly its success. Digital transformation is not only about technology but also about people.

Recall from earlier in this chapter that one of the automation strategy pillars is skills. Networking as a technology is old, and automation is not a natural evolution of networking; it is a rather different vertical.

Many if not most network engineers do not have automation in their skillset. It is something that they have been adding, however, and you need to take this into consideration for your automation strategy. Does your workforce (which might be just you) have the skills required to achieve your goals? If yes, that’s great! If not, there are two options:

• Training and upskilling: If you choose to go this route, you should add training and upskilling as tactics. Bear in mind that there are costs. Monetary costs are attached to trainings, certifications, and laboratories, and there are also time costs. People don’t just learn new skills overnight, especially if those skills are unlike previously learned skills.

• Hiring: If you choose to go this route, again there will be both monetary and time costs.

Many companies today, big and small, are undertaking the transition from device-by-device networks to software-managed networks powered by automation. If that is happening in your organization, you are not alone. Whether to choose training and upskilling or hiring is a decision that needs to be made on a case-by-case basis. Both routes to obtain the skills of the future are valid.

Organizational culture also has an impact on the automation strategy. Depending on the domain, it may be easier or harder to introduce automation. For example, it is typically harder to introduce automation in networking than in software development because automation practices are already embedded in the software culture. Unfortunately, the networking culture is typically very manual and dependent on key individuals. Adopting automation practices can be challenging in companies where such a culture is deeply rooted.

Some people still see automation as a job taker, even though it is more of a job changer. Failing to adapt may result in job losses, and many people fear automation. This fear can be a major challenge for companies and should not be underestimated, as it can completely derail your goals.

A term I have heard in the context of network changes is “culture of fear.” Some network folks are afraid to touch these ancient and critical network devices that can bring the entire network down in flames, and automating changes can seem really scary. This also applies to newer network components in the cloud. If adopting a new culture of NetDevOps, as described in Chapter 6, “Network DevOps,” is part of your strategy, you need to plan the transition from a “culture of fear” very carefully and include it in your automation strategy. If you think that the current culture needs to adapt in order to meet your goals, you must factor that in and plan accordingly.

Goals, Objectives, Strategies, and Tactics

With the knowledge of where you are, as well as the current skills and culture of your organization, in mind, you can start defining your automation strategy goals. To do so, you need to consider these questions:

• What do you want to achieve?

• What problems are you trying to solve?

• What is in need of improvement?

For ease of understanding, let’s say you are working for a telecommunications service provider, XYZ, that has 1000 employees, divided into several teams, including IT operations, software development, and support. These departments account for 400 out of the total 1000 employees at the company.

After an automation assessment consisting of a questionnaire to the relevant stakeholders, along with a compilation of the results under a single knowledge base and comparison with the market, you have the results shown in Table 7-1.

Table 7-1 XYZ Service Provider Assessment Knowledge Base

Currently the company is using four different technologies for the automation of tasks. It seems that there is some lack of guidance in what to use. From your assessment, you could tell that shell scripting is the most mature of the four, probably because the company has been using it in operations for over 10 years, and the other tools have been adopted over time.

Among the three teams surveyed, Ansible is a clear winner in terms of skills available. Most of the engineers know Ansible even if their departments are not using it at the moment. Terraform seems to be gaining momentum, and the development team is well versed in it. In terms of the scripting languages, only the folks using them daily know how to use them.

The operations department’s use of Ansible is supported by Red Hat because the department undertakes critical tasks with it. This shows that there has already been a financial investment, unlike for any of the other tools the company is using.

Finally, in terms of industry alignment, although Ansible is still very relevant, some major players in the segment have realized benefits from using Terraform and are migrating toward that solution. Python and shell scripting are not key technologies in this market segment.

From this assessment, you decide that the company is at Level 2, configuration automation. It does not have end-to-end pipelines, but it does have several tasks automated with the different technologies.

You can now define what you want to achieve, such as, “I want to accelerate the time to market of new services launched in my market segment.” This could be a goal. Do not forget to align the automation goals with your business strategy goals. You may have more than one goal, but you should typically have fewer than five.

Now, with these goals in mind, you must define objectives, which must be measurable and achievable. Here are two possible examples:

• Deploy a new VPN service from start to finish in under 24 hours.

• Start working on a new service request a maximum of 8 hours after the request is made.

Although it can be difficult to know what is achievable, you can consult subject matter experts in your company. If in doubt, aim lower. Alternatively, when you evaluate the success of the strategy, keep in mind that you were not sure what was achievable.

You will face many challenges when defining objectives. The two most common ones are defining too many objectives by not prioritizing the most important ones and lack of linkage to measurable KPIs.

During this type of exercise, companies typically come up with many objectives to reach the defined goals. Because an automation strategy is linked to a finite time frame, such as 1 year, it is important that objectives be achievable in the set time frame. This is where prioritization plays an important role.

For each of the objectives, you should try to quantify the cost to benefit—that is, how much money you need to invest and what benefit it will bring. Money is not necessarily the only metric; time is also a consideration, but often time can be translated to a dollar value, and hence cost is typically the metric used.
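One way to make this comparison concrete is to compute a benefit-to-cost ratio for each candidate objective and rank the candidates by it. The following is only a minimal sketch: the `Candidate` class, the `prioritize` function, and every dollar figure are invented for illustration and do not come from the XYZ assessment.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate objective with estimated cost and benefit, in dollars."""
    name: str
    cost: float     # hypothetical estimated investment
    benefit: float  # hypothetical estimated return (time savings converted to dollars)

    @property
    def ratio(self) -> float:
        # Benefit obtained per dollar invested.
        return self.benefit / self.cost


def prioritize(candidates: list[Candidate]) -> list[Candidate]:
    """Return the candidates sorted from highest to lowest benefit-to-cost ratio."""
    return sorted(candidates, key=lambda c: c.ratio, reverse=True)


# All figures below are made up for illustration.
candidates = [
    Candidate("Deploy a new VPN service in under 24 hours", cost=200_000, benefit=500_000),
    Candidate("Start work on a request within 8 hours", cost=50_000, benefit=200_000),
]

ranked = prioritize(candidates)
for candidate in ranked:
    print(f"{candidate.ratio:.1f}  {candidate.name}")
```

A single ratio is a simplification. In practice you may also want to weigh risk, dependencies between objectives, and how well each one fits the time frame of the strategy.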

To address the problem of linkage to measurable KPIs, let us go back to the methodology introduced earlier in this chapter and answer the seven questions for each of the two objectives:

1. What is your desired outcome? Deploy a new VPN service from start to finish in under 24 hours.

2. Why does this outcome matter for the company/department? If we achieve it, we would be faster than any of our competitors. This would allow us to attract new customers and retain current ones.

3. How can you measure this outcome in an accurate and meaningful way? We can measure it by recording the time it takes from a customer requesting a service to when it is operational.

4. How can people influence this outcome? Our teams can adopt network automation techniques to decrease the service deployment times. On the other hand, because it is currently a manual process, employees can register new service requests later than the actual request date.

5. Can you tell if you have achieved the desired outcome? If so, how? We can tell if we have achieved it if all our services are within the 24-hour threshold.

6. How often will you review the progress made? We aim to review it on a biweekly basis to understand the progress made.

7. Who is ultimately responsible for this outcome? The operations department.

In this case, a KPI could be time to deploy a new service. You could also have KPIs to track each process within a new service deployment to better understand which of the processes are slowest. This would be a more granular approach.

It is important to note that people could influence this KPI to their benefit by registering the service later than when it was actually requested. Knowing this, if you choose this KPI, you should implement something to mitigate it as it could make your KPI measurements meaningless. It is very important to identify flaws of a KPI when defining it.
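The time-to-deploy KPI can be sketched in a few lines. This example assumes that both timestamps are captured automatically by the request system, which is one possible way to limit the manipulation described above. The function names and the sample records are hypothetical.

```python
from datetime import datetime, timedelta

THRESHOLD = timedelta(hours=24)  # from the objective: under 24 hours


def time_to_deploy(requested_at: datetime, operational_at: datetime) -> timedelta:
    """KPI value for one service: elapsed time from request to operational."""
    return operational_at - requested_at


def meets_objective(durations: list[timedelta]) -> bool:
    """The objective is met only if every deployment is under the threshold."""
    return all(duration < THRESHOLD for duration in durations)


# Hypothetical records: (requested_at, operational_at)
records = [
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 1, 20, 30)),  # 11 h 30 min
    (datetime(2024, 3, 4, 14, 0), datetime(2024, 3, 5, 16, 0)),  # 26 h
]

durations = [time_to_deploy(requested, operational) for requested, operational in records]
print(meets_objective(durations))  # False: the second deployment took 26 hours
```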

You can repeat this process for the second objective:

1. What is your desired outcome? Start working on a new service request a maximum of 8 hours after the request is made.

2. Why does this outcome matter for the company/department? Starting to work on requests right away would help the company seem more responsive to customers and hence improve customer experience.

3. How can you measure this outcome in an accurate and meaningful way? We can measure it by recording the time it takes from when a customer initiates a new service request to when one of our engineers starts working on its fulfillment.

4. How can people influence this outcome? People can impact this by accepting the requests as early as possible.

5. Can you tell if you have achieved the desired outcome? If so, how? We can tell if we have achieved it if all our service requests are accepted below the 8-hour threshold.

6. How often will you review the progress made? We aim to review it on a biweekly basis to understand the progress made.

7. Who is ultimately responsible for this outcome? The operations department.

In this case, a KPI could be time to accept a new service request. In the event that there are many service request types that depend on different teams, you could define a KPI per service request type to measure in a more granular way.
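A per-type KPI could be computed as in the following sketch. The request types, the sample values, and the function names are assumptions made for illustration. The sample data shows why the check looks at every request rather than at the average: the average for one type is under 8 hours even though one of its requests exceeded the threshold.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: (service request type, hours from request to acceptance)
acceptances = [
    ("vpn", 2.0),
    ("vpn", 6.0),
    ("firewall", 9.0),
    ("firewall", 5.0),
]


def mean_hours_by_type(rows):
    """Group acceptance times by request type and average each group."""
    grouped = defaultdict(list)
    for request_type, hours in rows:
        grouped[request_type].append(hours)
    return {request_type: mean(values) for request_type, values in grouped.items()}


def types_over_threshold(rows, threshold_hours=8.0):
    """Return the request types with at least one acceptance above the threshold."""
    return sorted({request_type for request_type, hours in rows if hours > threshold_hours})


print(mean_hours_by_type(acceptances))
print(types_over_threshold(acceptances))
```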

The result of applying the technique is two measurable KPIs, each linked to an objective, that are relevant to the company and can be tracked across a defined period of time.

After defining your goal and objectives, it is time to draft strategies to reach your goal. In this case, each strategy is directly linked to a single objective, but sometimes a strategy spans different objectives:

• (Objective) Deploy a new VPN service from start to finish in under 24 hours.

• (Strategy) Adopt automation processes for service deployment.

• (Strategy) Adopt orchestration processes for service deployment.

• (Objective) Start working on a new service request a maximum of 8 hours after the request is made.

• (Strategy) Create a new platform for service requests that is always up.

• (Strategy) Improve the notification system for new service requests.

These strategies are paths to the goal. Put some thought into strategies, as it is important that they lead to the goal. Through prioritization, you can select the strategies that are most effective.

The last step in this process is to create tactics, which are the low-level, specific ways of fulfilling the strategies:

• (Strategy) Adopt automation processes for service deployment.

• (Tactic) Use Ansible to deploy configurations.

• (Tactic) Use Jinja2 to generate configurations.

• (Tactic) Use virtual devices for testing prior to deploying configurations to production.

• (Strategy) Adopt orchestration processes for service deployment.

• (Tactic) Develop an in-house orchestrator to interact with the current software ecosystem.

• (Tactic) Educate the operations department on the new orchestrator system.

• (Strategy) Create a new platform for service requests that is always up.

• (Tactic) Use a cloud SaaS for the new platform.

• (Tactic) Use a microservices architecture for the new platform so that all components are decoupled and resilient.

• (Strategy) Improve the notification system for new service requests.

• (Tactic) Replace the pull notification method with a push one.

• (Tactic) Implement a mandatory acknowledgment from the receiver when a message is sent.
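The goal, objective, strategy, and tactic hierarchy can also be kept as structured data so that gaps are easy to spot. The sketch below is one possible representation and not a prescribed format. The class names and the `strategies_without_tactics` helper are hypothetical, and one strategy is deliberately left without tactics to show the check at work.

```python
from dataclasses import dataclass, field


@dataclass
class Strategy:
    description: str
    tactics: list[str] = field(default_factory=list)


@dataclass
class Objective:
    description: str
    strategies: list[Strategy] = field(default_factory=list)


@dataclass
class Goal:
    description: str
    objectives: list[Objective] = field(default_factory=list)


def strategies_without_tactics(goal: Goal) -> list[str]:
    """Return the strategies that have not yet been expanded into tactics."""
    return [
        strategy.description
        for objective in goal.objectives
        for strategy in objective.strategies
        if not strategy.tactics
    ]


goal = Goal(
    "Accelerate the time to market of new services",
    objectives=[
        Objective(
            "Deploy a new VPN service from start to finish in under 24 hours",
            strategies=[
                Strategy(
                    "Adopt automation processes for service deployment",
                    tactics=[
                        "Use Ansible to deploy configurations",
                        "Use Jinja2 to generate configurations",
                    ],
                ),
                # Left without tactics on purpose, to exercise the check below.
                Strategy("Adopt orchestration processes for service deployment"),
            ],
        ),
    ],
)

print(strategies_without_tactics(goal))
```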

The tactics are where you define which tools to use. Remember that the automation strategy document is for the whole organization. It is very important to understand the scope of your strategies and circle back to what you have learned from your assessment.

Some of the following questions can help you choose the right tool for you, which in some cases may be different from the right tool for the job:

• Is it router configuration automation or another type of system?

• Is it cloud provisioning or on premises?

• Do we have in-house knowledge, or will we need to procure it?

• Do we have rack space in the data center for a new solution?

• Do we need vendor support?

• Is the technology aligned with industry best practices?

• Is it scalable enough for all our use cases?

You need to do this exercise for all your strategies. This exercise takes time and effort, but the reward is amazing.

Note

Remember that a strategy should have a time frame associated with it. Aim for 6 months to 2 years for a strategy; after that time, revisit the strategy and, if needed, update it.

Finally, do not forget to consult other strategy documents and add excerpts from them to the automation strategy if they are relevant. For example, say that you have defined a tactic to use a cloud SaaS for your future service request platform. Service requests more often than not contain personally identifiable information (PII), among other data. It is important to verify that your data strategy supports having this type of data in the cloud. Furthermore, your systems should comply with any strategies defined in that document. Say that your data strategy has the following objective and strategies to fulfill your goal:

• (Objective) Centralize all data under a single umbrella.

• (Strategy) Adopt common storage processes for common types of data.

• (Strategy) Adopt common processing methods for common types of data.

Expanding the strategies, you come up with the following tactics:

• (Strategy) Adopt common storage processes for common types of data.

• (Tactic) Use AWS S3 to store all file data.

• (Tactic) Use AWS Redshift to store all analytical data.

• (Tactic) Use AWS DynamoDB to store all metadata used.

• (Tactic) All data must be encrypted at rest.

• (Strategy) Adopt common processing methods for common types of data.

• (Tactic) Before storage, analytical data must pass by the EMR ETL engine for aggregation.

• (Tactic) All PII data must be anonymized using an in-house-developed system.

Some of these tactics directly impact your automation strategy. Any system you intend to develop must comply with these strategies. For example, your cloud SaaS platform will have to make use of the anonymization system before you store any service request PII data. For this reason, it is good to include an excerpt of the data strategy in your automation strategy document, so stakeholders can see why you are following this approach without having to jump from one document to another.

The same logic applies for the financial strategy and other strategy documents: You should consult them and add any excerpts that are relevant for your context.

ABD Case Study

The previous walkthrough has shown you what an automation strategy can look like. To make the concepts even clearer, let’s look at a real example of the network automation part of an automation strategy for a big company (with more than 1000 employees) called ABD, where in the recent past the network experienced production-level issues that had business impact.

The goal was clear for the stakeholders involved even before an assessment was made: The network must be able to support the needs of the business and stop being a bottleneck.

With the goal set, an extensive assessment was initiated to understand where the company was in terms of network automation and also to understand what issues existed with the current network. This type of assessment is typically beyond the scope of an automation strategy as the focus should only be on the automation piece. However, as a result of outages, understanding the “where we are” part may extend beyond automation.

The assessment was done by an independent third party, but it involved multiple internal teams, including the networking team and the operations teams. It consisted of interviews and questionnaires to capture the current situation. The following is a summary of the results:

• The network is running on legacy software versions.

• The current network is experiencing a large number of hardware failure occurrences.

• The network is monitored using a legacy system that is not configured for notifications.

• Network maintenance and proactive changes are manually executed.

• Network rollbacks are manually executed.

• There are no automation tools in use in production.

• The networking team uses Ansible in the lab environment.

• The operations department is skilled with Python.

• The networking department is skilled with Ansible.

ABD did not have much to assess from a network automation perspective, as the company was barely using automation at all. However, it is still a good idea to evaluate ABD in terms of the four criteria, placing less emphasis on each individual tool and instead using a generic indicator for the organization as a whole (see the Organization column in Table 7-2). In the table you can see that barely any investment has been made in automation; ABD has invested just a little in Ansible for the lab environment. In terms of skills, the workforce is knowledgeable in two tools, Python and Ansible, but this knowledge is divided between two different teams, networking and operations. The maturity score reflects the lack of automation solutions deployed. Finally, the organization is not aligned with its industry segment, as other companies in the same segment are using automation. You see high scores for some specific tools because those individual tools are aligned with the industry for their use case.

Table 7-2 Technology Assessment Knowledge Base

With the “where we are” picture, the responsible stakeholders defined objectives to address their goal:

• Changes to the network must be made in under 1 hour and without major impact.

• The network must have an uptime of 99%.

• The network must support the addition of two new services every month.

Note

For this case, we are looking at only the network automation objectives.

For each of these objectives, KPIs had to be created. By using the seven-question method, the company defined the following KPIs:

• Number of new services deployed per month

• Time taken, in seconds, for each step of a change window

• Duration of services affected during a change window

• Number of services affected during a change window

• Number of service-affecting network outages per month

• Duration of service-affecting network outages per month
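To see how the outage KPIs relate to the 99% uptime objective, it helps to work out the numbers. The sketch below assumes a 30-day month, in which a 99% target leaves a downtime budget of 432 minutes (7.2 hours). The function names and the outage durations are hypothetical.

```python
def downtime_budget_minutes(uptime_target_percent: float, days_in_month: int = 30) -> float:
    """Minutes of downtime allowed in a month at the given uptime target."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100 - uptime_target_percent) / 100


def monthly_uptime_percent(outage_minutes: list[int], days_in_month: int = 30) -> float:
    """Uptime percentage for a month, given the duration of each outage in minutes."""
    total_minutes = days_in_month * 24 * 60
    return 100 * (total_minutes - sum(outage_minutes)) / total_minutes


# Hypothetical durations, in minutes, of service-affecting outages in one month.
outages = [120, 90, 300]

outage_count = len(outages)      # KPI: number of service-affecting outages
outage_duration = sum(outages)   # KPI: duration of service-affecting outages
uptime = monthly_uptime_percent(outages)

print(outage_count, outage_duration, round(uptime, 2))
print(outage_duration <= downtime_budget_minutes(99))  # False: 510 minutes exceeds the budget
```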

Note

Modern KPIs can also be focused on customer experience rather than infrastructure metrics, such as results from a customer feedback form.

This was the outcome of one of the exercises for the KPI definition, which was repeated for every objective:

1. What is your desired outcome? Changes to the network must be made in under 1 hour and without major impact.

2. Why does this outcome matter for the company/department? We cannot have the network impact business operations. The longer the maintenance windows are, the longer other teams are potentially unable to do their work. The network is a foundation for everything else.

3. How can you measure this outcome in an accurate and meaningful way? We can measure the time the maintenance windows take, and we can measure which services were affected and for how long.

4. How can people influence this outcome? The team can automate the processes it uses to make changes, aligning to what the industry is already doing and improving the consistency and speed of the network changes.

5. Can you tell if you have achieved the desired outcome? If so, how? We can tell by verifying whether the time the windows take is decreasing, along with the number of service outages.

6. How often will you review the progress made? It should be reviewed on a quarterly basis.

7. Who is ultimately responsible for this outcome? The networking team.

By this stage, the company knew what it wanted to achieve. It was time to determine how to achieve it. The company defined the following strategies:

• Improve the current network monitoring system.

• Adopt automation tools/processes to perform network changes during planned windows.

• Create automated failover processes that occur during network outages.

Finally, the company detailed each of these strategies into tactics, using the knowledge gathered from the assessment related to the workforce skillset:

• (Strategy) Improve the current network monitoring system.

• (Tactic) Collect new metrics from the equipment (CPU, memory, and interface drops).

• (Tactic) Configure automated alarms on the monitoring system.

• (Tactic) Use past log information to train an ML model of predictive maintenance.

• (Tactic) Configure monitoring for software versioning.

• (Strategy) Adopt automation tools/processes to perform network changes during planned windows.

• (Tactic) Use Ansible playbooks for production changes.

• (Tactic) Use Ansible playbooks for automatic rollbacks.

• (Tactic) Use Git to store the playbooks and configurations.

• (Tactic) Verify service-affecting changes in a virtual lab before applying them to the production environment.

• (Tactic) Automatically verify the status of the production network before and after a change.

• (Strategy) Create automated failover processes that occur during network outages.

• (Tactic) Use Python to automate traffic rerouting in the event of byzantine hardware failure.

• (Tactic) Use Python to automate the tasks required for failover in the event of a complete data center outage (for example, DNS entries).

• (Tactic) Use past log information to train an ML model of predictive maintenance.
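The tactic of verifying the production network before and after a change can be illustrated by comparing two state snapshots. This is a simplified sketch, not ABD's implementation: how the snapshots are collected is out of scope here, and the keys, values, and function names are hypothetical.

```python
def diff_state(before: dict, after: dict) -> dict:
    """Map each key whose value changed to a (before, after) pair."""
    keys = sorted(before.keys() | after.keys())
    return {
        key: (before.get(key), after.get(key))
        for key in keys
        if before.get(key) != after.get(key)
    }


def unexpected_changes(before: dict, after: dict, expected: set) -> list:
    """Return the changed keys that the change request did not announce."""
    return [key for key in diff_state(before, after) if key not in expected]


# Hypothetical snapshots taken before and after a change window.
before = {"Gi0/1.vlan": 10, "Gi0/2.status": "up", "bgp_neighbors": 4}
after = {"Gi0/1.vlan": 20, "Gi0/2.status": "down", "bgp_neighbors": 4}

surprises = unexpected_changes(before, after, expected={"Gi0/1.vlan"})
print(surprises)  # ['Gi0/2.status']
```

A non-empty result could be used as the trigger for the automatic rollback tactic listed above.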

Another section of this automation strategy was a linkage to the data strategy. This was justified by the specific mentions of how to store data in that strategy, which directly impacts the folks implementing the automation strategy. The following objective and strategies were transcribed from that document:

• (Objective) Have full control over stored data.

• (Strategy) Store all data on company premises.

• (Strategy) Adopt a controlled access mechanism for data.

These strategies were also specified in the form of tactics. For the scope of the automation strategy, they were also relevant and therefore transcribed:

• (Strategy) Store all data on company premises.

• (Tactic) Log data must be moved to San Francisco’s data center Elasticsearch.

• (Tactic) Metric data must be moved to San Francisco’s data center InfluxDB.

• (Tactic) After 3 months, historical data must be moved to cold storage.

• (Strategy) Adopt a controlled access mechanism for data.

• (Tactic) Active Directory should be used to authenticate and authorize any read/write access to data.

• (Tactic) An immutable accounting log should exist for any data storage solution.

You can see the importance of transcribing the data strategies and tactics. If the company did not do so, it might overlook the need to store the newly collected device metrics in the proper place, which in this case is San Francisco’s data center. Furthermore, it is important for the company to investigate and plan how to build the machine learning predictive system if the historical data is moved to a different location, as the move might add extra cost and complexity.

This was the overall structure of ABD’s automation strategy, which was quite extreme in terms of the changes to the company culture—going from a completely manual operation model to introducing automation and eventually having an automated environment. Based on the maturity level and the changes desired, ABD set a 2-year goal for implementing this strategy, and it planned to revisit it every 3 months to monitor progress.

How to Use an Automation Strategy

After you have built your own automation strategy, what happens next? Well, you will need to use it. The first step in using your strategy is to build a planning document. This document should have a time span of a year or less, and it should highlight the project’s milestones.

To create an action plan, you need to prioritize your tactics. Some companies like to focus on quick wins, and others prefer to start with harder, long-lasting projects. There is not a single best way to prioritize. I find that having wins helps collaborators stay motivated, and because of that, I like to run long-lasting projects in parallel with smaller, more manageable ones.

Start with a tactic that you have identified as a priority and involve technical colleagues if you need the support. Break down the tactic into specific actions and assign dates for those actions. Let us illustrate with an example. Say that you are collecting new metrics from your equipment (CPU and memory). In order to achieve this tactic, a first step could be verifying whether the current monitoring system supports collecting such metrics or whether you need to acquire a monitoring system because you don't have one.

A second step could be configuring the monitoring system to collect these new metrics on one device. This would be called a proof of concept (PoC). If the PoC is successful, a third step could be configuring the monitoring system to collect these new metrics on 25% of the installed base. And a fourth step could be configuring it to collect from 50% of the installed base. A fifth and last step could be configuring the remaining 50%. Each of those actions should have a date assigned, along with an owner responsible for planning and execution. This owner could be a team rather than a single individual.
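The phased rollout described above can be sketched as a function that splits an inventory into batches. This example assumes that the percentages are cumulative and that the PoC device counts toward the first 25%. The device names and the `plan_phases` function are hypothetical.

```python
def plan_phases(devices: list[str], cumulative_fractions=(0.25, 0.50, 1.0)) -> list[list[str]]:
    """Split devices into a one-device PoC plus phases that reach each cumulative fraction."""
    phases = [devices[:1]]  # proof of concept on a single device
    done = len(phases[0])
    for fraction in cumulative_fractions:
        # Never move backward if a fraction rounds below what is already done.
        target = max(done, round(len(devices) * fraction))
        phases.append(devices[done:target])
        done = target
    return phases


# Hypothetical installed base of 20 devices.
devices = [f"router-{number:02d}" for number in range(1, 21)]

phases = plan_phases(devices)
print([len(phase) for phase in phases])  # [1, 4, 5, 10]
```

Each batch would then get its own date and owner in the action plan.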

This type of phased approach is the most common and is typically the most successful. Jumping straight into a full-blown global system is difficult and error prone. Try to follow a phased approach when defining a plan for a tactic unless it is a tactic with a small scope.

An action plan should include detailed steps for all the tactics that will happen in a particular period of time.

Table 7-3 shows a formatted action plan for a single tactic. You can add different fields. Typically the minimum required fields are the owner and date fields. For the dates, you can be more specific than specifying quarters, as in Table 7-3. A description column gives you a place to describe more complex actions that may not be fully understood based only on their names. Lastly, it is common to have a reference column, represented as Ref in our table. This column simply gives the reader an easy way to refer to any specific row. We are using a numerical reference, but alphanumeric references are also commonly seen.

Table 7-3 Action Plan for Adoption of Ansible and Creation of Playbooks for Change Windows

You can aggregate several tactics under the same action plan, but it may become very big and hard to read. Sometimes it is a good idea to use a different table for each tactic. A downside of this approach is that the delivery dates are spread out throughout the document.

In addition, a visual representation of action plans can help a lot in understanding and correlating all the timetables. You can see an example in Figure 7-5, which shows a scaled-down chart version of Table 7-3 with five tasks. You can see here that the first task is a predecessor for Tasks 2, 3, and 4, which may take place in parallel, and the fifth task is dependent on the three previous tasks. These dependencies are highlighted by the arrows. Furthermore, you can see on this chart who is responsible for each task.

Figure 7-5 Gantt Chart of Tasks
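The dependencies described for Figure 7-5 can also be expressed in code to derive which tasks may run in parallel. The following sketch uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The `execution_waves` function is a hypothetical helper, and the task numbers stand in for the rows of the action plan.

```python
from graphlib import TopologicalSorter

# Task -> set of tasks that must finish first, as described for Figure 7-5.
dependencies = {
    1: set(),
    2: {1},
    3: {1},
    4: {1},
    5: {2, 3, 4},
}


def execution_waves(graph: dict) -> list[list[int]]:
    """Group tasks into waves; tasks in the same wave can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


print(execution_waves(dependencies))  # [[1], [2, 3, 4], [5]]
```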

Meanwhile, you should be monitoring your progress and KPIs. That is why you defined them in the first place, and you need to make sure they are being met. If they are not being met, you need to understand why and adjust, which may mean updating your automation strategy; as mentioned earlier, it is a living document, so adjustment is expected. However, keep in mind that you should not be changing your automation strategy every day. If you are making changes very frequently, try to understand the underlying reason: Are you defining unrealistic objectives? Are your company skills not fit for the goals?

Note

Incorporating a yearly cycle to assess the need for readjustment is typically helpful.

In summary, to fully use your automation strategy document, you need to create a project plan for each tactic in which you define granular actions to be taken and highlight timelines and ownership. Implement your tactics according to your plan while measuring and monitoring your KPIs.

Summary

This chapter covers automation strategies. Just as a company should not be without a business strategy, you should not have automation without an automation strategy. You should not wander around automating one-off actions without understanding what you really want to achieve and how doing so can be beneficial in the long term.

This chapter mentions the five key pillars that support an automation strategy:

• Business alignment

• Technology adoption

• Operating model

• Roadmap

• Skills

This chapter also covers the sections of an automation strategy document, as well as core components, such as KPIs, that make up the document’s body.

Finally, this chapter guides you on building your own automation strategy document and provides network automation examples, tips on what to keep in mind during this journey, and how to translate the strategy into reality.

Review Questions

You can find answers to these questions in Appendix A, “Answers to Review Questions.”

1. Which of the following is not a pillar for an automation strategy?

a. Skills

b. Culture

c. Technology adoption

d. Business alignment

2. In an automation strategy, what are goals?

a. Specific outcomes that are measurable, realistic, and achievable

b. Metrics to evaluate where you are

c. Specific actions to follow

d. High-level purposes

3. Which of the following fields is not mandatory in an action plan?

a. Owner

b. Action

c. Target date

d. Description

4. What is the typical time range that an automation strategy addresses?

a. 1 month

b. 1 to 6 months

c. 6 months to 2 years

d. 5 years

5. True or false: The automation strategy document should be crafted by only upper management stakeholders.

a. True.

b. False.

6. In an automation strategy document, can a strategy span multiple objectives?

a. Yes, a strategy can span multiple objectives.

b. No, a strategy cannot span multiple objectives.

7. True or false: You cannot have other strategy documents linked to your automation strategy document.

a. True

b. False

8. Which automation level is a company in if it has systems in place that abstract service complexity (that is, service abstraction)?

a. 1

b. 2

c. 3

d. 4

9. In which part of the automation strategy document are the technology tools to be used enumerated?

a. Goals

b. Objectives

c. Strategies

d. Tactics

10. How many tactics can a strategy have in an automation strategy document?

a. 1–5

b. 5–10

c. 10–100

d. Unlimited
