This section of the book is all about the basic operations in multi-cloud, or BaseOps. We'll be learning about the basics, starting with managing the landing zone – the foundation of any cloud environment. Before a business can start migrating workloads or developing applications in cloud environments, it will need to define that foundation. Best practices for landing zones include the hub-and-spoke model in Azure, AWS Landing Zone, and the definition of projects in Google Cloud. In multi-cloud, the landing zone extends across cloud concepts and technologies.
This chapter describes how to design the landing zones for the major cloud platforms and explores the BaseOps principles for managing them. We will learn how to design landing zones in Azure, AWS, and GCP, how to define policies to manage a landing zone, and gain a deeper understanding of handling accounts in these landing zones. We will also learn that there are platforms where we can manage different clouds from just one console via orchestration. In this chapter, we're going to explore the foundational concepts of the major cloud providers, that is, Azure, AWS, and GCP. We'll also design the basic landing zones in the major clouds, manage the foundation environments in multi-cloud, learn how to abstract policies from resources on the different cloud platforms by exploring Infrastructure as Code and Configuration as Code, and understand the need for demarcation in the cloud.
In this chapter, we will cover the following topics:
BaseOps might not be a familiar term to all, although you could guess what it means: basic operations. In cloud environments, this is more often referred to as cloud operations. BaseOps is mainly about operating the cloud environment in the most efficient way possible by making optimal use of the cloud services that the major providers offer on the different layers – network, compute, and storage, but also PaaS and SaaS.
The main objective of BaseOps is to ensure that cloud systems are available to the organization and that these can safely be used to do the following:
At the end of the day, this is all about the quality of service. That quality is defined by service levels and KPIs that have been derived from the business goals. BaseOps must be enabled to deliver that quality via clear procedures, skilled people, and the proper tools.
We have already explored the business reasons regarding why we should deploy systems in cloud environments: the ultimate goal is to have flexibility and agility, but also cost efficiency. This can only be achieved if we standardize and automate. All repetitive tasks should be automated. Identifying these tasks and monitoring whether these automated tasks are executed correctly is part of BaseOps. The automation process itself is development work; operations then executes what development builds – which is exactly why we have DevOps in the first place. Both teams share the same goal, for that matter: protect and manage the cloud systems according to best practices.
We can achieve these goals by executing the activities mentioned in the following sections.
This is by far the most important activity in the BaseOps domain. It's really the foundation of everything else. The landing zone is the environment on the designated cloud platform where we will host the workloads, the applications, and the data resources. The starting principle of creating a landing zone is that it's fully provisioned through code. In other words, the landing zone contains the building blocks that form a consistent environment where we can start deploying application and data functionality, as we discussed in Chapter 4, Service Design for Multi-cloud, where we talked about scaffolding. In the Creating a multi-cloud landing zone and blueprint section of this chapter, we will deep dive into creating landing zones on the different major platforms; that is, Azure, AWS, and GCP.
The base infrastructure typically consists of networking and environments that can host compute and storage resources. You could compare this to Hyperconverged Infrastructure (HCI), which refers to a physical box that holds compute nodes, a storage device, and a switch to make sure that the compute nodes and storage can actually communicate. The only addition we would need is a router that allows the box to communicate with the outside world. The cloud is no different: the base infrastructure consists of compute nodes, storage nodes, and switches to enable traffic. The major difference from the physical box is that, in the cloud, all these components are code.
But as we have already learned, this wouldn't be enough to get started. We also need an area that allows us to communicate from our cloud to the outside and to access our cloud. Next, we will need to control who accesses our cloud environment. So, a base infrastructure will need accounts and a way to provision these accounts in a secure manner. You've guessed it: even when it comes to defining the standard and policies for setting up a base infrastructure, there are a million choices to make. Landing zone concepts make it a lot easier to get started fast.
As a rule of thumb, the base infrastructure consists of five elements:
The good news is that all cloud providers agree that these are the base elements of an infrastructure. Even better, they all provide code-based components to create the base infrastructure. From this point onward, we will call these components building blocks. The issue is that they offer lots of choices in terms of the different types of building blocks and how to deploy them, such as through blueprints, templates, code editors, command-line programming, or through their respective portals. As we mentioned previously, we will explore the landing zone solutions in this chapter.
Defining standard architecture principles (architecture patterns and reference architecture)
A way to define a reference architecture for your business is to think outside-in. Think of an architecture in terms of circles. The outer circle is the business zone, where all the business requirements and principles are gathered. These drive the next inner circle: the solutions zone. This is the zone where we define our solutions portfolio. For example, if the business has a demand for analyzing large sets of data (business requirement), then a data lake could be a good solution.
The solution zone is embedded between the business zone at the outer side and the platform zone at the inner side. If we have, for instance, Azure as our defined platform, then we could have Azure Data Factory as a solution for the specific data lake requirement. The principle is that from these platforms, which can also be third-party PaaS and SaaS platforms, the solutions are mapped to the business requirements. By doing so, we create the solutions portfolio, which contains specific building blocks that make up the solution.
The heart of this model – the utmost inner circle – is the integration zone, from where we manage the entire ecosystem in the other, outer circles.
Security should be included in every single layer or circle. Due to this, the boundary of the whole model is set by the intrinsic security zone:
The preceding diagram shows this model with an example of the business requiring data analytics, with Data Factory and Data Bricks as solutions coming from Azure as the envisioned platform. The full scope forms the enterprise portfolio.
Even if we have only deployed a landing zone, there are still quite a number of building blocks that we will have to manage from that point onward.
For a network, we will have to manage, at a minimum, the following:
For compute, we will have to manage, at a minimum, the following:
Do note that compute in the cloud involves more than virtual machines. It also includes things such as containers, container orchestration, functions, and serverless computing. However, in the landing zone, these native services are often not immediately deployed. You might consider having the container platform deployed as part of the base infrastructure. Remember that, in the cloud, we see a strong shift from VMs to containers, so we should prepare for that while setting up our landing zone.
In most cases, this will include setting up the Kubernetes cluster. In Azure, this is done through Azure Kubernetes Service (AKS), where we create a resource group that will host the AKS cluster. AWS offers its own cluster service through Elastic Kubernetes Service (EKS). In GCP, this is the Google Kubernetes Engine (GKE). The good news is that a lot of essential building blocks, such as Kubernetes DNS, are already deployed as part of setting up the cluster. Once we have the cluster running, we can start deploying cluster nodes, pods (a collection of application containers), and containers. For consistently managing Kubernetes across multiple cloud platforms, there are several agnostic solutions that you can look at, such as Rancher or VMware Tanzu Mission Control.
For storage, we will have to manage, at a minimum, the following:
Next, we will have to manage the accounts and make sure that our landing zone – the cloud environment and all its building blocks – is secure. Account management involves creating accounts or account groups that need access to the cloud environment. These are typically created in Active Directory.
In the Global admin galore – the need for demarcation section of this chapter, we will take a deeper look at admin accounts and the use of global admin accounts. Security is tightly connected to account, identity, and access management, but also to things such as hardening (protecting systems from outside threats), endpoint protection, and vulnerability management. From day 1, we must have security in place on all the layers in order to prevent, detect, assess, and mitigate any breach. This is part of SecOps. Section 4 of this book is all about securing our cloud environments.
In the cloud, we work with code. There's no need to buy physical hardware anymore; we simply define our hardware in code. This doesn't mean we don't have to manage it. To do this in the most efficient way, we need a master code repository. This repository will hold the code that defines the infrastructure components, as well as how these components have to be configured to meet our principles in terms of security and compliance. This is what we typically refer to as the desired state.
Azure, AWS, and Google offer native tools to facilitate infrastructure and configuration as code, as well as tools to automate the deployment of the desired state. In Azure, we can work with Azure DevOps and Azure Automation, both of which work with Azure Resource Manager (ARM). AWS offers CloudFormation, while Google has Cloud Resource Manager and Cloud Deployment Manager. These are all tied into the respective platforms, but the market also offers third-party tooling that tends to be agnostic to these platforms. We will explore some of the leading tools later in this chapter, in the Orchestrating policies for multi-cloud section.
For source code management, we can use tools such as GitHub, Azure DevOps, AWS CodeCommit, and GCP Cloud Repositories.
We've already discussed the need for monitoring. The next step is to define what tooling we can use to perform these tasks. Again, the cloud platforms offer native tooling: Azure Monitoring, Application Insights, and Log Analytics; AWS CloudTrail and CloudWatch; and Google Stackdriver monitoring. And, of course, there's a massive set of third-party tools available, such as Splunk and Nagios. These latter tools have a great advantage since they can operate independent of the underlying platform. This book won't try to convince you that tool A is preferred over tool B; as an architect, you will have to decide what tool fits the requirements – and the budget, for that matter.
Security is a special topic. The cloud platforms have spent quite some effort in creating extensive security monitoring for their platforms. Monitoring is not only about detecting; it's also about triggering mitigating actions. This is especially true when it comes to security, where detecting a breach is certainly not enough. Actually, the time between a vulnerability or breach being detected and it being exploited can be a matter of seconds, which makes it necessary to enable fast action. This is where SIEM comes into play: Security Information and Event Management. SIEM systems evolve rapidly and, at the time of writing, intelligent solutions are often part of the system.
An example of this is Azure Sentinel, an Azure-native SIEM solution: it works together with Azure Security Center, where policies are stored and managed, but it also performs an analysis of the behavior it sees within the environments that an enterprise hosts on the Azure platform. Lastly, it can automatically trigger actions. For instance, it can block an account that logs in from the UK one minute and from Singapore the next – something that wouldn't be possible without warp-driven time travelling.
In other words, monitoring systems are becoming more sophisticated, and developments are happening as fast as lightning.
Finally, once we have thought about all of this, we need to figure out who will be executing all these tasks. We will need people with the right skills to manage our multi-cloud environments. As we have said already, the truly T-shaped engineer or admin doesn't exist. That would be the five-legged sheep. Most enterprises end up with a group of developers and operators that all have generic and more specific skills. Some providers refer to this as the Cloud Center of Excellence (CCoE), and they mark it as an important step in the cloud journey or cloud adoption process of that enterprise. Part of this stage would be to identify the roles this CCoE should have and get the members of the CCoE on board with this. The team needs to be able to build and manage the environments, but they will also have a strong role to fulfil in evangelizing new cloud-native solutions.
Tip
Just as a reading tip, please have a look at an excellent blog post on forming a CCoE by Amazon's Enterprise Strategist Mark Schwartz: https://aws.amazon.com/blogs/enterprise-strategy/using-a-cloud-center-of-excellence-ccoe-to-transform-the-entire-enterprise/.
In this section, we have learned what we need to cover to set up our operations in multi-cloud. The next step is building our landing zones on the cloud platforms.
All the major cloud providers offer a methodology that can be used to create a landing zone on their respective platforms. In this section, we will explore the landing zone concepts for Azure, AWS, and GCP.
The landing zone in Azure is part of the Cloud Adoption Framework (CAF) and implements a set of cloud services to get us started with building or migrating workloads to the Azure platform. The landing zone creates all the necessary building blocks to enable a business to start using the cloud platform.
We talked about the analogy of constructing a house previously, when we discussed scaffolding. Consider the landing zone to be the empty house. A house has a foundation, a front door that provides access to the house, and rooms where we can place furniture. These rooms have already been designed to cater for specific needs. The kitchen has connections for cooking equipment and a tap for running water. So does the bathroom: it has taps, a shower, a bath, but also a floor that doesn't get damaged when it gets wet. We can compare this to the landing zone: it already has rooms that have been set up for specific usage, such as to cater for outside connectivity.
Preparing these rooms for usage is something Microsoft calls refactoring. CAF guides the business in setting up security, identity and access management, naming conventions, cost management, and so on. All these topics are deployed as part of the landing zone. Once we've finished building the landing zone, we will have a base platform that is secure, that has a naming and tagging convention defined, where Role-Based Access Control (RBAC) is in place, and where we have clear insight into the costs that we are generating on the platform.
Now, what do we need for that?
First of all, we need a subscription to Azure. Next, we need to deploy the rooms: the different segments in our environment where we will host our systems. In Azure, we typically deploy the hub-and-spoke model. This derives from the fact that Azure offers shared services, which are used across the different rooms, such as monitoring and backup services. These shared services land in the hub. The spokes connect to the hub so that they can consume the shared services from there, instead of having to deploy all these services separately into the different spokes.
The landing zone consists of code: it's Infrastructure as Code, so it drives the Azure architecture completely from code, from the very start. To do this, it uses ARM templates in JSON format. We can actually blueprint the code so that we can easily launch new spokes in a very consistent way. The blueprint would, for instance, contain code that shows how the spokes connect to the hub and how shared services are consumed. Azure offers various sample landing zone blueprints to get us started really fast. However, do check whether the blueprint meets the compliance and security requirements of your specific business.
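To give an idea of what such template code looks like, here is a much-simplified ARM template fragment that defines a spoke virtual network with a management and a workload subnet. The resource names and address ranges are made up for illustration and would need to follow your own naming and network conventions:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "resources": [
    {
      "type": "Microsoft.Network/virtualNetworks",
      "apiVersion": "2020-05-01",
      "name": "vnet-spoke-01",
      "location": "[resourceGroup().location]",
      "properties": {
        "addressSpace": { "addressPrefixes": [ "10.1.0.0/16" ] },
        "subnets": [
          { "name": "snet-management", "properties": { "addressPrefix": "10.1.0.0/24" } },
          { "name": "snet-workload", "properties": { "addressPrefix": "10.1.1.0/24" } }
        ]
      }
    }
  ]
}
```

Because the spoke is defined entirely in a template like this, launching an additional spoke is a matter of deploying the same code with different parameters, which is exactly what makes blueprints consistent.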
The landing zone blueprint will provide the following:
Be aware that this landing zone is not fit to host sensitive data or mission-critical applications yet. The blueprint is deployed under the assumption that the subscription is already associated with an Azure Active Directory instance. Also, the landing zone blueprint makes the assumption that no Azure policies have to be applied. In other words, you will have an empty house with a few empty rooms that you will still need to decorate yourself, meaning that you will have to implement baselines and policies.
This gets you started. By refactoring the landing zone and adding services to improve performance, reliability, cost efficiency, and security, you will get it ready to actually host workloads:
The preceding diagram shows a basic setup for a landing zone in Azure containing a hub to host the generic services and two spokes to host the workloads. These spokes have two subnets: one for management and one for the actual workloads, such as applications.
Tip
More information about Azure Landing Zone can be found at https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/landing-zone/.
AWS offers AWS Landing Zone as a complete solution, based on the Node.js runtime. Like Azure, AWS offers numerous solutions so that you can set up an environment. All these solutions require design decisions. To save time in getting started, AWS Landing Zone sets up a basic configuration that's ready to go. To enable this, AWS Landing Zone deploys the so-called AWS Account Vending Machine (AVM), which provisions and configures new accounts with the use of single sign-on.
To grasp this, we must understand the way AWS environments are configured. It is somewhat comparable to the hub-and-spoke model of Azure, but instead of a hub and spokes, AWS uses accounts. AWS Landing Zone comprises four accounts that follow the Cloud Adoption Framework (CAF) of AWS:
The following diagram shows the setup of the landing zone in AWS:
One component that deserves a closer look is the Account Vending Machine (AVM), which plays a crucial role in setting up the Landing Zone. The AVM launches the basic accounts in the Landing Zone with a predefined network and the security baseline. Under the hood, the AVM uses Node.js templates that set up organizational units in which the previously described accounts are deployed with default, preconfigured settings. One of the components that is launched is the AWS SSO directory, which allows federated access to AWS accounts.
Tip
More information about AWS Landing Zone can be found at https://aws.amazon.com/solutions/aws-landing-zone/.
GCP differs a lot from Azure and AWS, although the hub-and-spoke model can also be applied in GCP. Still, you can tell that this platform has a different vision of the cloud: GCP focuses more on containers than on IaaS with traditional resources. Google talks about a landing zone as the place where you plan to deploy a Kubernetes cluster in a GCP project using GKE, although deploying VMs is, of course, also possible on the platform.
In the landing zone, you create a Virtual Private Cloud (VPC) network and set Kubernetes network policies. These policies define how we will be using isolated and non-isolated pods in our Kubernetes environment. Basically, by adding network policies, we create isolated pods, meaning that these pods – which hold a number of containers – only allow defined traffic, whereas non-isolated pods accept traffic from any source. The policy lets you assign IP blocks and deny/allow rules to the pods. The next step is to add service definitions to the Kubernetes environment in the landing zone so that pods can actually start running applications or databases. The last step in creating the landing zone is to configure DNS for GKE.
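As a sketch of what such a Kubernetes network policy looks like, the following manifest isolates pods with a given label and only admits the traffic we define. The namespace, labels, IP block, and port are hypothetical:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: isolate-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend          # pods with this label become isolated
  policyTypes:
    - Ingress
  ingress:
    - from:
        - ipBlock:
            cidr: 10.0.0.0/16   # allow traffic from this IP block...
        - podSelector:
            matchLabels:
              app: frontend     # ...or from frontend pods in the namespace
      ports:
        - protocol: TCP
          port: 8080            # only on this port
```

Once this policy is applied, every pod matching app: backend only accepts the ingress traffic described here; all other traffic to those pods is dropped.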
As we mentioned previously, Google very much advocates the use of Kubernetes and containers, which is why GCP is really optimized for running this kind of infrastructure. If we don't want to use container technology, then we will have to create a project in GCP ourselves. The preferred way to do this is through Deployment Manager and the gcloud command line. You could compare Deployment Manager to ARM in Azure: it uses the APIs of other GCP services to create and manage resources on the platform. One way to access this is through the Cloud Shell within the Google Cloud portal, but GCP also offers some nice tools to get the work done. People who are familiar with Unix command-line environments will find this very recognizable and easy to work with.
The first step is enabling the required APIs; that is, the Compute Engine API and the Deployment Manager API. By installing the Cloud SDK, we get a command-line tool called gcloud that interfaces with Deployment Manager. Once we have gcloud running, we can set the project with the gcloud config set project command, followed by the ID of the project itself; for example, gcloud config set project [Project ID]. Next, we must set the region where we will be deploying our resources, using the very same command family: gcloud config set compute/region, followed by the region ID; for example, gcloud config set compute/region [region].
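With the project and region set, Deployment Manager deployments are driven by YAML configurations. The following minimal sketch defines a single VM; the zone, machine type, and image are assumptions for illustration only:

```yaml
# vm.yaml - minimal Deployment Manager configuration (hypothetical values)
resources:
  - name: base-vm-01
    type: compute.v1.instance
    properties:
      zone: europe-west1-b
      machineType: zones/europe-west1-b/machineTypes/e2-small
      disks:
        - deviceName: boot
          boot: true
          autoDelete: true
          initializeParams:
            sourceImage: projects/debian-cloud/global/images/family/debian-11
      networkInterfaces:
        - network: global/networks/default
```

A configuration like this could then be deployed with gcloud deployment-manager deployments create my-deployment --config vm.yaml, with Deployment Manager calling the Compute Engine API on our behalf.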
With that, we're done! Well, almost. You can also clone samples from the Deployment Manager GitHub repository. This repository also contains good documentation on how to use these samples.
Tip
To clone the GitHub repository for Deployment Manager into your own project, use the git clone https://github.com/GoogleCloudPlatform/deploymentmanager-samples command or go to https://github.com/terraform-google-modules/terraform-google-migrate. There are more options, but these are the commonly used ways to do this.
The following diagram shows a basic setup for a GCP project:
With that, we have created landing zones in all three major cloud platforms and by doing so, we have discovered that, in some ways, the cloud concepts are similar, but that there are also some major differences in the underlying technology. Now, let's explore how we can manage these landing zones using policies, as well as how to orchestrate these policies over the different platforms.
When we work in cloud platforms, we work with code. Everything we do in the cloud is software- and code-defined. This makes the cloud infrastructure very agile, but it also means that we need some strict guidance in terms of how we manage the code, starting with the code that defines our landing zone or foundation environment. As with everything in IT, it needs maintenance. In traditional data centers and systems, we have maintenance windows where we can update and upgrade systems. In the cloud, things work a little differently.
First of all, the cloud providers apply maintenance whenever it's needed. There's no way that they can agree upon maintenance windows with thousands of customers spread across the globe. They simply do whatever needs to be done to keep the platform healthy, ready for improvements and the release of new features. Enterprises don't want to be impacted by these maintenance activities, so they will have to make sure that their code is safe at all times.
The next thing we need to take into account is the systems that the enterprise has deployed on the platform, within their own virtual cloud or project. These resources also need maintenance. If we're running VMs, we will need to patch them every now and then. In this case, we are patching code. We want to make sure that, during these activities, administrators do not accidentally override certain security settings or, worse, delete disks or any critical code that is required for a specific function that a resource fulfills. This is something we must care about from the very start, when setting up the landing zones. From that point onward, we must start managing. For that, we can use policies and management tooling.
In this section, we have set up the landing zones. In the next section, we'll learn how to manage them.
This time, we'll start with AWS. AWS offers CloudFormation Guardrails – a very appropriate name, since it really keeps your environment on the rails. Guardrails come with four principal features for which policies are set in JSON format. To create policies, AWS offers the Policy Generator. In the Policy Generator, you define the type of policy first and then the conditions, meaning when the policy should be applied:
Tip
More information on Guardrails policies in AWS can be found at https://aws.amazon.com/blogs/mt/aws-cloudformation-guardrails-protecting-your-stacks-and-ensuring-safer-updates/.
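As an illustration of such a guardrail, a CloudFormation stack policy in JSON can permit routine updates while denying destructive actions on specific resource types. The resource type below is just an example of something you might want to protect:

```json
{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "Update:*",
      "Principal": "*",
      "Resource": "*"
    },
    {
      "Effect": "Deny",
      "Action": ["Update:Replace", "Update:Delete"],
      "Principal": "*",
      "Resource": "*",
      "Condition": {
        "StringEquals": { "ResourceType": ["AWS::RDS::DBInstance"] }
      }
    }
  ]
}
```

With this policy attached to a stack, an update that would replace or delete the database instance is rejected, even if the change was made accidentally in the template.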
When we look at Azure, we must look at a practice called test-driven development (TDD) for landing zones in Azure. TDD is particularly known in software development, as it aims to improve the quality of software code. As we have already discussed, the landing zone in Azure is expanded through the process of refactoring, an iterative way to build out the landing zone. Azure provides a number of tools that support TDD and help in the process of refactoring the landing zone:
Tip
More information on test-driven development in Azure Landing Zone can be found at https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/considerations/azure-test-driven-development.
In all cases, Azure uses ARM templates, based on JSON.
As we have seen, GCP can be a bit different in terms of public cloud and landing zones. This originates from the conceptual view that Google has, which is more focused on container technology using Kubernetes. Still, GCP offers extensive possibilities in terms of setting policies for environments that are deployed on GCP. In most cases, these policies are applied to organizations and resources through IAM policies:
Tip
More on organization policy constraints in GCP can be found at https://cloud.google.com/resource-manager/docs/organization-policy/org-policy-constraints.
However, the binding is only one part of the policy. We also have an AuditConfig to log the policy and the metadata. The most important field in the metadata is etag. The etag field is used to guarantee that policies are applied in a consistent way across the various resources in the project: if a policy is altered on one system, the etag field makes sure that the policies stay consistent. Inconsistent policies will cause resource deployments to fail.
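Putting these pieces together, a Cloud IAM policy with bindings, an audit configuration, and the etag field looks roughly like the following. The members, project name, and etag value are made up for illustration:

```json
{
  "bindings": [
    {
      "role": "roles/storage.objectViewer",
      "members": [
        "user:alice@example.com",
        "group:data-readers@example.com"
      ]
    },
    {
      "role": "roles/compute.admin",
      "members": [ "serviceAccount:deploy@my-project.iam.gserviceaccount.com" ]
    }
  ],
  "auditConfigs": [
    {
      "service": "allServices",
      "auditLogConfigs": [ { "logType": "ADMIN_READ" } ]
    }
  ],
  "etag": "BwWKmjvelug="
}
```

When updating a policy, the etag that was read must be sent back with the write; if the policy changed in the meantime, the etags no longer match and the update is refused, which is what keeps concurrent changes consistent.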
Policies can have multiple bindings and can be set on different levels within GCP. However, be aware that there are limitations. As an example, GCP allows a maximum of 1,500 members per policy. So, do check the documentation thoroughly, including the best practices for using policies.
Tip
Extensive documentation on Cloud IAM Policies in GCP can be found at https://cloud.google.com/iam/docs/policies.
In this section, we have learned how to create policies by enabling the basic operations (BaseOps) of our landing zones in the different clouds. The next section talks about orchestrating policies in a true multi-cloud setup, using a single repository.
So far, we've looked at the different ways we can set policies in the major cloud platforms. Now, what we really want in multi-cloud is a single repository where we can store and manage all our policies. Can we do this? From a technological perspective, we probably can: all cloud providers support JSON as a format for defining policies. The problem is that these platforms have different concepts of deploying policies. What's the solution to this problem?
To think of a solution, we must start thinking in terms of layers and abstract logic from the code itself. What do we mean by this? A policy has a certain logic. As an example, from a security perspective, we can define that all the VMs in our environment must be hardened by following the guidelines of CIS, the baseline of the Center for Internet Security. What type of VM we're talking about is irrelevant, as is the type of operating system it runs or on what platform the VM is hosted on. The logic only says that the VM needs to be hardened by following the recommendations of the CIS framework. It's completely abstracted from the code that deploys the VM. If we do this, we can store the policies themselves in a single repository. The only thing we need to do then is add the specific code that is required to deploy the VM to our target cloud platform.
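To make the idea tangible, such platform-agnostic policy logic could be captured in a simple document like the following. The format and field names here are purely illustrative and not tied to any product; the point is that nothing in it refers to a specific cloud, VM type, or operating system:

```json
{
  "policy": "harden-vm",
  "description": "All virtual machines must be hardened following the CIS benchmarks",
  "appliesTo": {
    "resourceType": "virtual-machine",
    "platforms": [ "azure", "aws", "gcp" ]
  },
  "control": {
    "framework": "CIS",
    "level": 1
  }
}
```

The platform-specific deployment code then only has to reference this policy and translate it into the right settings for the target cloud.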
This is basically what HashiCorp's Terraform does. Terraform abstracts policies from code so that it can deploy Infrastructure as Code on various cloud platforms from a single source of truth. For this, it uses the definition of the desired state: the code that launches the infrastructure resources is completely abstracted from the actual configuration of that resource. It's important to note that Terraform is idempotent and convergent, meaning that only the required changes are applied to return the environment to the desired state.
At this point, it helps to gain a better understanding of desired state configuration (DSC). First of all, DSC is often associated with Microsoft PowerShell. This makes sense, since DSC was indeed introduced with Windows Server 2012 R2. However, nowadays, the term desired state is used more broadly for abstracting Infrastructure as Code from the actual configuration of that infrastructure. It is commonly used in CI/CD pipelines: development teams build the necessary systems and, when these are pushed to production, the desired state gets deployed. An example is installing a backup agent or bringing resources under monitoring. The following diagram shows a simplified model of the desired state:
Let's get back to Terraform. The syntax that Terraform uses allows us to fully abstract resources and providers. It defines blocks that can hold any type of resource, from a VM to a container, but also certain services, such as DNS. This is expressed in the HashiCorp Configuration Language (HCL). The next step is to deploy these blocks to our target cloud. This is done by initializing a project in that cloud using the terraform init command. init reads the Terraform configuration files and imports the providers needed to connect to the various clouds and services.
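A minimal sketch of such HCL code, using the Azure provider as an example, might look as follows; the resource names, address space, and region are hypothetical:

```hcl
# Provider block: tells Terraform which cloud to talk to.
provider "azurerm" {
  features {}
}

# Resource blocks: the desired state of the infrastructure.
resource "azurerm_resource_group" "landing_zone" {
  name     = "rg-landing-zone"
  location = "westeurope"
}

resource "azurerm_virtual_network" "hub" {
  name                = "vnet-hub"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.landing_zone.location
  resource_group_name = azurerm_resource_group.landing_zone.name
}
```

Running terraform init in the directory holding this file downloads the azurerm provider; swapping the provider and resource types would target AWS or GCP instead, while the overall structure of the code stays the same.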
The next step is the terraform plan command, which is used to create the execution plan. This determines what actions are necessary to achieve the desired state specified in the configuration files. The last step is the terraform apply command, which executes those actions to reach the desired state.
Terraform will now apply the blocks to the cloud and, at the same time, create a so-called state file. This state file is used to apply future changes to the infrastructure: before changes are applied in an execution plan that is automatically created by the Terraform software, it runs a refresh of the actual environment to update the state file. This way, Terraform always holds the latest version of the actual deployed code and keeps environments in sync at all times.
Tip
You can use Terraform to deploy landing zones in Azure, AWS, and GCP. In Azure, this will create a basic setup that enables activity logs and a subscription for Azure Security Center. For AWS, the Terraform HCL scripts call the AWS Landing Zone solution that we described in this chapter. You can find the Terraform code for Azure at https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/landing-zone/terraform-landing-zone. The code for AWS has been made publicly available by the Mitoc Group on GitHub: https://github.com/MitocGroup/terraform-aws-landing-zone.
If you are already familiar with configuration tools such as Chef or Puppet, you will find that there's some overlap between the functionality of Terraform and these tools. The big difference is that Terraform actually provisions new infrastructure resources, whereas most other tools are focused on applying configuration settings to resources that have already been deployed. This does not mean that configuration tools are useless; on the contrary, they simply serve other use cases. There's no good or bad here.
The key to multi-cloud is the single pane of glass view. We will discuss this frequently in this book. However, this is a complicated area. Companies such as ServiceNow target their development at creating platforms from which enterprises can do multi-cloud orchestration from just one console. At the time of writing, the latest release of ServiceNow is Orlando. It contains a product for Policy and Compliance Management that provides a centralized process for creating and managing policies, cross-cloud.
In summary, yes, you can deploy code and policies that are agnostic to different cloud platforms. However, it does require tooling. Throughout this section, we've explored some of the leading tools on the market at the time of writing. All of this requires a thorough understanding of abstracting infrastructure resources from functionality and policies, resulting in the desired state of the resources.
Typically, when we talk about demarcation in cloud models, we refer to the matrix or delineation of responsibility: who's responsible for what in IaaS, PaaS, and SaaS computing? The following diagram shows the very basics of this matrix:
However, we need a much more granular model in multi-cloud. We have been discussing policies throughout this chapter and, by now, we should have come to the conclusion that it's not easy to draw sharp lines when it comes to responsibilities in a multi-cloud environment. Just look at the solution stack: even in SaaS solutions, there might be certain security and/or compliance policies that the solution needs to adhere to. Even something as basic as the operating system can raise issues in terms of hardening: are monitoring agents from a PaaS provider allowed or not? Can we run them alongside our preferred monitoring solution? Or will that cause too much overhead on our systems? In short, the world of multi-cloud is not black and white. On the contrary, multi-cloud has an extensive color scheme to work with.
So, how do we get to a demarcation model that will work for our enterprise? Well, that's architecture. First, we don't need global admins all over our estate. This is a major pitfall in multi-cloud. We all know the cases: the database administrator who needs global admin rights to be able to execute certain actions or, worse, solutions that require service accounts with such roles. It's global admin galore. Do challenge these requests and do challenge software providers – or developers, for that matter – when it comes to why systems would need the highest possible access rights in the environment.
That's where it starts: policies. A good practice here is the Principle of Least Privilege (PoLP), which states that every identity is granted the minimum amount of access that is necessary to perform the tasks that have been assigned to that identity. Keep in mind that an identity, in this case, doesn't have to be a user: it can be any resource in the environment. When we are talking specifically about users, we refer to this as a Least-Privileged User Account or Access (LPUA). PoLP helps protect data, since data is only accessible when a user or identity is explicitly granted access to it. But there are more reasons to adhere to this principle: it also helps keep systems healthy, as it minimizes the impact of faults in systems, whether these are unintended or the result of malicious conduct. We should follow the rule of least privilege at all times. We will discuss this in more detail in Chapter 15, Implementing Identity and Access Management.
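As a simple illustration of PoLP in practice, the following Terraform snippet assigns only the built-in Reader role to an identity at subscription scope, rather than a broad role such as Owner. The subscription and principal IDs shown here are placeholders:

```hcl
# Illustrative least-privilege role assignment in Azure (placeholder IDs).
# The identity receives read-only access at the subscription scope and
# nothing more, in line with the Principle of Least Privilege.
resource "azurerm_role_assignment" "reader_only" {
  scope                = "/subscriptions/00000000-0000-0000-0000-000000000000"
  role_definition_name = "Reader"
  principal_id         = "11111111-1111-1111-1111-111111111111"
}
```

If an identity later needs broader rights for a specific task, a separate, narrowly scoped assignment can be added rather than elevating this one.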
Regarding this very first principle, there are a few more considerations that need to be made at this stage. These considerations translate into controls and with that, into deliverables that are part of BaseOps, since they are absolutely part of the foundational principles in multi-cloud. The following table shows these controls and deliverables:
Demarcation and separation of duties are strongly related to identity and access management. This will be discussed in full in Chapter 15, Implementing Identity and Access Management.
In this chapter, we have designed and set up our landing zones in the different major cloud platforms. We have learned that the foundational principles might be comparable, but the actual underlying implementation of the landing zone concepts do differ.
Next, we explored the principles of Infrastructure as Code and Configuration as Code. With tools such as Terraform, we can manage multi-cloud environments from one code base, using configuration policies that have been abstracted from the resource code. We then learned how to define policies and how to apply these to manage our landing zones. Finally, we learned that there's a need for a granular demarcation model in multi-cloud. This all adds up to the concept of BaseOps: getting the basics right.
Part of keeping the basics right is making sure that our environments are resilient and performing well. That's what we will be discussing in the next chapter, which is all about creating availability and scalability in the cloud.