The main goal of this chapter is to develop a foundational understanding of what multi-cloud is and why companies have a multi-cloud strategy. We will focus on the main public cloud platforms of Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP), alongside the on-premises variants of these platforms, such as Azure Stack, AWS Outposts, and Google Anthos, and the VMware propositions such as VMConAWS. We will also look at the benefits, at how to develop a strategy using one or more of these platforms, and at what the very first starting point for multi-cloud should be.
In this chapter, we're going to cover the following main topics:
This book aims to take you on a journey along the different major cloud platforms and will try to answer one crucial question: if my organization deploys IT systems on various cloud platforms, how do I keep control? We want to avoid cases where costs in multi-cloud environments spiral out of control, where we don't have a clear overview of who's managing the systems, and, most importantly, where system sprawl introduces severe security risks. But before we start our deep dive, we need to agree on a common understanding of multi-cloud and multi-cloud concepts.
There are multiple definitions of multi-cloud, but we're using the one stated on https://www.techopedia.com/definition/33511/multi-cloud-strategy:
Let's focus on some topics in that definition. First of all, we need to realize where most organizations come from: traditional data centers with physical and virtual systems, hosting a variety of functions and business applications. If you want to call this legacy, that's OK. But do realize that the cutting edge of today is the legacy of tomorrow. Hence, in this book, we will refer to "traditional" IT when we're discussing the traditional systems, typically hosted in physical, privately owned data centers. And with that, we've already introduced the first problem in the definition that we just gave for multi-cloud.
A lot of enterprises call their virtualized environments private clouds, whether these are hosted in external data centers or in self-owned, on-premises data centers. What they usually mean is that these environments host several business units that get billed for consumption on a centrally managed platform. You can have long debates on whether this is really using the cloud, but the fact is that there is a broad description that sort of fits the concept of private clouds.
Of course, when talking about the cloud, most of us will think of the major public cloud offerings that we have today: AWS, Microsoft Azure, and GCP. By another definition, multi-cloud is a best-of-breed combination of solutions and services from these different platforms that creates added value for the business. So, using the cloud can mean either a combination of solutions and services in the public cloud, or such a combination together with private cloud solutions.
But the simple feature of combining solutions and services from different cloud providers and/or private clouds does not make up the multi-cloud concept alone. There's more to it.
Maybe the best way to explain this is by using the analogy of the smartphone. Let's assume you are buying a new phone. You take it out of the box and switch it on. Now, what can you do with that phone? First of all, if there's no subscription with a telecom provider attached to the phone, you will discover that the functionality of the device is very limited. There will be no connection from the phone to the outside world, at least not on a mobile network. An option would be to connect it through Wi-Fi, if Wi-Fi is available. In short, one of the first actions, in order to actually use the phone, would be making sure that it has connectivity.
Now we have a brand new smartphone set to its factory defaults and we have it connected to the outside world. Ready to go? Probably not. The user probably wants to have all sorts of services delivered to their phone, usually through the use of apps, delivered through online catalogs such as an app store. The apps themselves come from different providers and companies including banks and retailers, and might even be coded in different languages. Yet, by compiling the apps – transforming the code in such a way that it can be read and understood by different devices – they will work on different phones with different versions of mobile operating systems such as iOS or Android.
The user will also very likely want to configure these apps to their personal needs and wishes. Lastly, the user needs to be able to access the data on their phone. All in all, the phone has turned into a landing platform for all sorts of personalized services and data.
The best part is that in principle, the user of the phone doesn't have to worry about updates. Every now and then the operating system will automatically be updated and most of the installed apps will still work perfectly. It might take a day or two for some apps to adapt to the new settings, but in the end, they will work. And the data that is stored on the phone or accessed via some cloud directory will also still be available. The whole ecosystem around that smartphone is designed in such a way that from the end user's perspective, the technology is completely transparent:
There's a difference between hybrid IT and multi-cloud, and there are different opinions on the definitions. One is that hybrid platforms are homogeneous and multi-cloud platforms are heterogeneous. Homogeneous here means that the cloud solutions belong to one stack, for instance, the Azure public cloud with Azure Stack on premises. Heterogeneous, then, would mean combining Azure and AWS, for instance.
For now, we will keep it very simple: a hybrid environment combines an on-premises stack – a private cloud – with a public cloud. It is a very common deployment model within enterprises. Numerous reports published some years ago stated that most enterprises would have transformed their IT to the public cloud by 2020. It was the magic year, 2020, and a lot of organizations developed a Cloud Strategy 2020. It certainly did have a nice ring to it, but magical? Not really. These same organizations soon discovered that it was not that easy to migrate all of their systems to a public cloud. Some systems would have to remain on premises, for various reasons.
Two obvious reasons were security and latency. To start with the first one: this is all about sensitive data and privacy, especially concerning data that may not be hosted outside a country, or outside certain regional borders, such as the EU. Data may not be accessible in whatever way to – as an example – US-based companies, which in itself is already quite a challenge in the cloud domain. Regulations, laws, guidelines, and compliance rules often prevent companies from moving their data off premises, even though public clouds offer frameworks and technologies to protect data at the very highest level. We will discuss this later on in this book, since security and data privacy are of utmost importance in the cloud.
Latency is the second reason to keep systems on premises. One example that probably everyone can relate to is that of print servers. Print servers in the public cloud might not be a good idea. The problem with print servers is the spooling process. The spooling software accepts the print jobs and controls the printer to which the print assignment has to be sent. It then schedules the order in which print jobs are actually sent to that printer. Although print spoolers have been improved massively over the last years, it still takes some time to execute the process. Print servers in the public cloud might cause delays in that process. Fair enough: it can be done, and it will work if configured in the right way, in a cloud region close to the sending PC and receiving printer device, plus accessed through a proper connection.
You get the idea, in any case: there are functions and applications that are highly sensitive to latency. One more example: retail companies have warehouses where they store their goods. When items are purchased, the process of order picking starts. Items are labeled in a supply system so that the company can track how many of a specific item are still in stock, where the items originate from, and where they have to be sent. For this functionality, items have a barcode or QR code that can be scanned with RFID or the like. These systems have to be close to the production floor in the warehouse or – if you do host them in the cloud – accessible through really high-speed, dedicated connections on fast, responsive systems.
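The arithmetic behind this latency argument is easy to sketch. The figures below are illustrative assumptions, not measurements, but they show how a few extra milliseconds per transaction add up across a busy warehouse:

```python
# Back-of-the-envelope latency sketch. All figures here are illustrative
# assumptions, not measurements: compare an on-premises inventory system
# with a cloud-hosted one for a warehouse scanning workload.

def total_scan_time(scans_per_hour: int, round_trip_ms: float) -> float:
    """Cumulative time (in seconds) spent waiting on the network
    for one hour of barcode scans."""
    return scans_per_hour * round_trip_ms / 1000.0

on_prem_ms = 2.0   # assumed LAN round trip to a local server
cloud_ms = 40.0    # assumed round trip to a distant cloud region
scans = 50_000     # assumed scans per hour across the warehouse

print(total_scan_time(scans, on_prem_ms))  # seconds of waiting, on premises
print(total_scan_time(scans, cloud_ms))    # seconds of waiting, in the cloud
```

Even with these rough, assumed numbers, the gap runs into tens of minutes of accumulated waiting per hour, which is why such systems sit close to the production floor or behind dedicated, low-latency connections.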
These are pretty simple and easy-to-understand examples, but the issue really comes to life if you start thinking about the medical systems used in operating theatres, or the systems controlling power plants. It is not that useful to have an all-public cloud, cloud-first, or cloud-only strategy for quite a number of companies and institutions. That goes for hospitals, utility companies, and also for companies in less critical environments.
Yet, all of these companies discovered that the development of applications was way more agile in the public cloud. Usually, that's where cloud adoption starts: with developers creating environments and apps in public clouds. It's where hybrid IT is born: the use of private systems in private data centers for critical production systems that host applications with sensitive data that need to be on premises for latency reasons, while the public cloud is used to enable the fast, agile development of new applications.
From the analogy with the smartphone, it should be clear that with multi-cloud we're also talking about services, much more than just hosting systems in a private data center and a public cloud. This would mainly be Infrastructure as a Service (IaaS), where organizations run virtualized and non-virtualized physical machines in that private cloud and virtual machines in the public cloud.
In multi-cloud setups, we are also talking about Platform as a Service (PaaS) and Software as a Service (SaaS). Multi-cloud can become much more of a mixed mode, just as our smartphone holds data on the device itself, stores and retrieves data from other sources, and connects remotely to apps or hosts them on the phone, making use of services through APIs within those apps.
In multi-cloud, we can do exactly the same, leveraging functions and applications running on virtual machines on a private system with SaaS functionality connecting over the internet from a third-party provider, for example, to execute specific data analytics. The data may still reside in a private environment, where the runtime environment is executed from a public cloud source, or the other way around in the case of running models against data lakes that are fed with data streams from different sources, where the results of these models are delivered to private systems.
That is what multi-cloud is all about. Leveraging applications, data, and services from different cloud platforms and using different delivery models such as PaaS and SaaS. It might include hybrid IT, but it is more of a mixed mode in order to create more added value for the business by combining and optimizing cloud solutions. The next question is: how can organizations create that optimum combination of services, and by doing so, create that added value for their business?
Let's dive into the definition of a real cloud strategy.
The most common reason for organizations to adopt a multi-cloud strategy is a classic one: to avoid lock-in. Organizations simply do not want to be locked into one platform or a single service. However, that isn't really a strategy. It would be more the outcome of a strategy.
A strategy emerges from the business and the business goals. Business goals, for example, could include the following:
Business strategies often start with increasing revenue as a business goal. In all honesty: that should indeed be a goal, otherwise you'll be out of business before you know it. The strategy should focus on how to generate and increase revenue. We will explore more on this in the next chapter.
How do you get from business goals to defining an IT strategy? That is where enterprise architecture comes into play. The most widely used framework for enterprise architecture is TOGAF, The Open Group Architecture Framework. The core of TOGAF is the ADM cycle, short for Architecture Development Method, and ADM applies equally well to architecting multi-cloud environments. The ground principle of ADM is B-D-A-T: the cycle of business, data, applications, and technology. This perfectly matches the principle of multi-cloud, where the technology should be transparent. Businesses have to look at their needs, define what data is related to those needs and how this data is processed in applications, and translate this into technological requirements that finally drive the choice of technology, integrated into the architecture vision as follows:
This book is not about TOGAF, but it does make sense to have knowledge of enterprise architecture and, for that matter, TOGAF is the leading framework for that. TOGAF is published and maintained by The Open Group. More information can be found at https://www.opengroup.org/togaf.
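The B-D-A-T ordering can be sketched as a simple traceability chain: every technology choice should be traceable back through applications and data to a business need. The example entries below are entirely hypothetical:

```python
# Minimal sketch of TOGAF's B-D-A-T ordering: technology choices trace
# back through applications and data to a business need. Example
# entries are hypothetical.

from dataclasses import dataclass, field

@dataclass
class Requirement:
    business_need: str
    data: list = field(default_factory=list)         # data supporting the need
    applications: list = field(default_factory=list) # apps processing that data
    technology: list = field(default_factory=list)   # platforms hosting the apps

req = Requirement(
    business_need="Give customers real-time insight into their orders",
    data=["order history", "stock levels"],
    applications=["customer portal", "inventory service"],
    technology=["public cloud PaaS", "on-premises database"],
)

def is_traceable(r: Requirement) -> bool:
    # A technology choice without the data and application layers
    # behind it violates the B-D-A-T ordering.
    return bool(r.business_need and r.data and r.applications and r.technology)

print(is_traceable(req))  # True
```

The point of the sketch is the direction of the dependency: the business need is filled in first, and the technology list comes last.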
The good news is that multi-cloud offers organizations flexibility and freedom of choice. That also brings a risk: lack of focus. Therefore, we need a strategy. Most companies adopt cloud and multi-cloud strategies since they are going through a process of transformation from a more-or-less traditional environment to a digital future. Is that relevant for all businesses? The answer is yes. In fact, more and more businesses are coming to the conclusion that IT is a core activity.
Times have changed over the last few years in that respect. At the end of the nineties and even at the beginning of the new millennium, a lot of companies outsourced their IT since it was not considered to be a core activity. That has changed dramatically over the last 10 years or so. Every company is a software company – a message rightfully repeated by Microsoft CEO Satya Nadella, following an earlier statement by the father of software quality, Watts S. Humphrey, who claimed at the beginning of the millennium that every business is a software business.
Both Humphrey and Nadella are right. Take banks as an example: they have been transforming to become more and more like IT companies. They deal with a lot of data streams, execute data analytics, and develop apps for their clients. A single provider might not be able to deliver all of the required services, hence these companies look for multi-cloud, best-of-breed solutions to fulfill these requirements.
These best-of-breed solutions might contain traditional workloads with a classic server-application topology, but will more and more shift to the use of PaaS, SaaS, containers, and serverless solutions in an architecture that focuses on microservices and cloud-native principles. This has to be considered when defining a multi-cloud strategy: a good strategy would not be "cloud first" but "cloud fit."
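A "cloud fit" assessment can be sketched as a simple decision rule per workload. The criteria and target names below are assumptions for illustration only; a real assessment would weigh many more factors:

```python
# Toy "cloud fit" sketch: weigh a workload's characteristics before
# picking a target platform. Criteria, rule order, and target labels
# are all illustrative assumptions, not a prescriptive method.

def cloud_fit(workload: dict) -> str:
    """Return a suggested target based on simple, assumed rules."""
    if workload.get("data_residency_restricted"):
        return "private cloud / on premises"
    if workload.get("latency_sensitive"):
        return "private cloud / on premises"
    if workload.get("cloud_native"):
        return "public cloud (PaaS/containers/serverless)"
    return "public cloud (IaaS, candidate for modernization)"

# Hypothetical workloads:
legacy_erp = {"data_residency_restricted": True, "latency_sensitive": False}
new_api = {"cloud_native": True}

print(cloud_fit(legacy_erp))  # private cloud / on premises
print(cloud_fit(new_api))     # public cloud (PaaS/containers/serverless)
```

The design point is that the question is asked per workload, not once for the whole estate – which is exactly the difference between "cloud fit" and a blanket "cloud first."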
Of course, businesses evolve and so does technology. This is translated into a roadmap, driven by the business but including the technical possibilities and opportunities over a certain period of time. Such a roadmap will typically have a number of stages, beginning with a current state of the environment, shifting to industry-standard solutions that are immediately available, to a future state with cutting-edge technology. In the next chapter, we will have a closer look at the definition of such a roadmap and how it helps accelerate the business.
We have to make one final remark when it comes to setting out a multi-cloud strategy. It concerns security: that should always be a key topic in every strategy and in every derived roadmap. All of the public clouds and leading cloud technology providers have adopted security-by-design principles and offer a wide variety of very good solutions for information security. It's fair to say that, for example, Azure, AWS, and GCP are likely the best-secured platforms in the world. But that doesn't take away your responsibility to control the security standards, frameworks, principles, and rules that specifically apply to your type of business. Hosting business systems across multiple clouds might lower the risk of a single attack taking down the whole environment, but it also adds complexity. Section 4, Security Control in Multi-Cloud with SecOps, of this book is all about SecOps – security operations.
We have been talking about public and private clouds. Although it's probably clear what we commonly understand by these terms, it's probably a good idea to have a very clear definition of both. We adhere to the definition as presented on the Microsoft website: the public cloud is defined as computing services offered by third-party providers over the public internet, making them available to anyone who wants to use or purchase them. The private cloud is defined as computing services offered either over the internet or a private internal network and only to select users instead of the general public. There are many more definitions, but these serve our purpose very well.
In the public cloud, the best-known providers are AWS, Microsoft Azure, GCP, and public clouds that have OpenStack as their technological foundation. An example of the latter is Rackspace. These are all public clouds that fit the definition that we just gave, but there are also some major differences.
AWS and Azure do have a common starting ground: both platforms evolved from making storage publicly available over the internet. At AWS, it started with a storage service called the Simple Storage Service, or S3. Azure also started off with storage.
AWS, Azure, and GCP all offer a wide variety of managed services to build environments, but they all differ very much in the way you apply the technology. In short: the concepts are more or less alike, but under the hood, these are completely different beasts. It's exactly this that makes managing multi-cloud solutions complex.
There are many more public cloud offerings, but these are usually not fit for all purposes. Major software manufacturers including Oracle and SAP also have public cloud offerings available, but these are really tailored to hosting the specific software solutions of these companies. Nonetheless, they are part of the multi-cloud landscape, since a lot of enterprises use, for instance, enterprise resource planning software from SAP and/or data solutions from Oracle. These companies are also shifting their solutions more and more to fully scalable cloud environments, where they need to be integrated with systems that reside on premises or in other clouds. In some cases, these propositions have evolved to full clouds, such as OCI by Oracle. Over the course of this book, we will address these specific propositions, since they do require some special attention. Just think of license management, as an example.
In this book, we will mainly focus on the major players in the multi-cloud portfolio, as represented in the following diagram:
We have been discussing Microsoft Azure, AWS, GCP, and OpenStack as the main public cloud platforms. As said, there are more platforms, but in this book, we are limiting our discussions to the main players in the field and adhering to the platforms that have been identified as leaders by Gartner and Forrester.
So far, we've looked at the differences between private and public clouds and the main players in the public cloud domain. In the next section, we will focus on the leading private propositions for enterprises.
Most companies are planning to move, or are actually in the midst of moving, their workloads to the cloud. In general, they have a selected number of major platforms that they choose to host the workloads: Azure, AWS, GCP, and that's about it. Fair enough, there are more platforms, but the three mentioned are the most dominant ones and, according to analysts' reports, will continue to be so for the foreseeable future.
As we already found out in the previous paragraphs, in planning for and migrating workloads to these platforms, organizations also discover that it does get complex. Even more importantly, there are more and more regulations in terms of compliance, security, and privacy that force these companies to think twice before they bring their data onto these platforms. And it's all about the data, in the end. It's the most valuable asset in any company – next to people.
The solution: instead of bringing data to the cloud, we're taking the cloud to the data – again. Over the last few years, we've seen a new movement where the major cloud providers have started stepping into domains where other companies were still traditionally dominant; companies such as storage providers and system integrators. The new reality is that public cloud providers are shifting more and more into the on-premises domain.
In the private cloud, VMware seems to be the dominant platform, next to environments that have Microsoft's Hyper-V technology as their basis. Yet, Microsoft is pushing customers more and more toward consumption in Azure, and where systems need to be kept on premises, it has a broad portfolio available with Azure Stack, which we will discuss in a bit more detail later in this chapter.
Especially in European governmental environments, OpenStack still seems to do very well, to avoid having data controlled or even viewed by American-based companies. However, the adoption and usage of OpenStack seems to be declining.
In this chapter, we will look briefly at both VMware and OpenStack as private stack foundations. After that, we'll have a deeper look at AWS Outposts and Google Anthos. Basically, both propositions extend the public clouds of AWS and GCP into the privately owned data center. Outposts is an appliance that comes as a preconfigured rack with compute, storage, and network facilities. Anthos by Google is more a set of components that can be utilized to specifically host container platforms in on-premises environments using the Google Kubernetes Engine (GKE). Finally, we will have a look at the Azure Stack portfolio.
In essence, VMware is still a virtualization technology. It started off with the virtualization of x86-based physical servers, enabling multiple virtual machines on one physical host. Later, VMware introduced the same concept to storage with vSAN (virtualized SAN) and NSX (network virtualization and security) that virtualizes the network, making it possible to adopt micro-segmentation in private clouds. The company has been able to constantly find ways to move along with the shift to the cloud – as an example, by developing a proposition together with AWS where VMware private clouds can be seamlessly extended to the public cloud.
Today, VMware is also a strong player in the field of containerization with Pivotal Container Service (PKS) and container orchestration with Tanzu Mission Control. Over the last few years, the company has strengthened its position in the security domain, again targeting the multi-cloud stack. Basically, VMware is trying to become the spider in the multi-cloud web by leveraging solutions on top of the native public cloud players.
There are absolutely benefits to OpenStack. It's a free and open source software platform for cloud computing, mostly used as IaaS. OpenStack uses KVM as its main hypervisor, although other hypervisors have been available for OpenStack as well. It was – and still is, with a group of companies and institutions – popular since it offered a stable, scalable solution while avoiding vendor lock-in on the major cloud and technology providers. Major integrators and system providers such as IBM and Fujitsu adopted OpenStack in their respective cloud platforms, Bluemix and K5 (decommissioned internationally in 2018).
However, although OpenStack is open source and can be completely tweaked and tuned to specific business needs, it is also complex, and companies find it cumbersome to manage. Most of these platforms do not have the richness of solutions that, for example, Azure, AWS, and GCP offer to their clients. Over the last few years, OpenStack seems to have lost its foothold in the enterprise world, yet it still has a somewhat relevant position and certain aspects are therefore considered in this book.
Everything you run on the AWS public cloud, you can now run on an appliance, including Elastic Compute Cloud (EC2), Elastic Block Store (EBS), databases, and even Kubernetes clusters with Elastic Kubernetes Service (EKS). It all seamlessly integrates with the virtual private cloud (VPC) that you would have deployed in the public cloud, using the same APIs and controls. That is, in a nutshell, AWS Outposts: the AWS public cloud on premises.
VMConAWS actually extends the private cloud to the public cloud, based on VMware's HCX technology. VMware uses bare metal instances in AWS to which it deploys vSphere, vSAN storage, and NSX for software-defined networking. You can buy VMConAWS through VMware or through AWS, and you can also consume native AWS services on top of a VMConAWS configuration through integration with AWS. Outposts works exactly the other way around: it brings AWS to the private cloud.
Anthos brings Google Cloud – or more accurately, the Google Kubernetes Engine – to the on-premises data center, just as Azure Stack does for Azure and Outposts for AWS, but it focuses on the use of Kubernetes as a landing platform, moving and converting workloads directly into containers using GKE. It's not a standalone box, such as Azure Stack or Outposts. The solution runs on top of virtualized machines using vSphere, and is more a PaaS solution. Anthos really accelerates the transformation of applications to more cloud-native environments, using open source technology including Istio for microservices and Knative for the scaling and deployment of cloud-native apps on Kubernetes.
More information on the specifics of Anthos can be found at https://cloud.google.com/anthos/gke/docs/on-prem/how-to/vsphere-requirements-basic.
The most important feature of Azure Stack Hyperconverged Infrastructure (HCI) is that it can run "disconnected" from Azure. To put it very simply: HCI works like the commonly known branch office server. Basically, HCI is a box that contains compute power, storage, and network connections. The box holds Hyper-V-based virtualized workloads that you can manage with Windows Admin Center. So, why would you want to run this as Azure Stack then? Well, Azure Stack HCI also has the option to connect to Azure services, such as Azure Site Recovery, Azure Backup, and Azure Monitoring.
It's a very simple solution that only requires Microsoft-validated hardware, the installation of Windows Server 2019 Datacenter Edition, plus Windows Admin Center and optionally an Azure account to connect to specific Azure cloud services.
Pre-warning: it might get a bit complicated from this point onward. Azure Stack HCI is also the foundation underneath Azure Stack Hub (side note: the Azure Stack products are based on Windows Server 2019). Yet, Hub is a different solution. Whereas you can run Stack HCI standalone, Hub is integrated with the Azure public cloud – and that's really a different ballgame. It's the reason why you can't upgrade HCI to Hub.
Azure Stack Hub is really the on-premises extension of the Azure public cloud. Almost everything you can do in the public cloud of Microsoft, you could also deploy on Hub: from VMs to apps, all managed through the Azure portal or even PowerShell. It all really works like Azure, including things such as configuring and updating fault domains. Hub also supports having an availability set with a maximum of three fault domains to be consistent with Azure. This way you can create high availability on Hub just as you would in Azure.
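How an availability set spreads instances over fault domains can be illustrated with a simplified round-robin sketch. This is an illustration of the placement idea only, not Azure's actual allocator:

```python
# Simplified sketch of availability-set placement: spread VM instances
# over fault domains so that losing one domain never takes down all
# instances. Azure Stack Hub supports at most three fault domains in an
# availability set, consistent with Azure; the round-robin below is an
# illustrative assumption, not the real allocation algorithm.

MAX_FAULT_DOMAINS = 3

def place_vms(vm_names, fault_domain_count=MAX_FAULT_DOMAINS):
    if not 1 <= fault_domain_count <= MAX_FAULT_DOMAINS:
        raise ValueError("Hub supports at most three fault domains")
    placement = {fd: [] for fd in range(fault_domain_count)}
    for i, vm in enumerate(vm_names):
        placement[i % fault_domain_count].append(vm)
    return placement

layout = place_vms(["web-0", "web-1", "web-2", "web-3"])
print(layout)  # {0: ['web-0', 'web-3'], 1: ['web-1'], 2: ['web-2']}
```

Losing any single fault domain in this layout still leaves instances running in the other two, which is what gives the availability set its resilience.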
The perfect use case for Hub and the Azure public cloud would be to do development on the public cloud and move production to Hub, should apps or VMs need to be hosted on premises for compliance reasons. The good news is that you can configure your pipeline in such a manner that development and testing can be executed on the public cloud and run deployment of the validated production systems, including desired state configuration, on Hub. This will work fine since both entities of the Azure platform use the Azure resource providers in a consistent way.
There are a few things to be aware of, though. The compute resource provider will create its own VMs on Hub. In other words: it does not copy the VM from the public cloud to Hub. The same applies to network resources. Hub will create its own network features such as load balancers, vNets, and network security groups (NSGs). As for storage, Hub allows you to deploy all storage forms that you would have available on the Azure public cloud, such as blob, queue, and tables. Obviously, we will discuss all of this in much more detail in this book, so don't worry if a number of terms don't sound familiar at this time.
One last Stack product is Stack Edge. Previously, Microsoft sold Azure Stack Edge as Data Box Edge, and it's still part of the Data Box family. Edge makes it easy to send data to Azure. As Microsoft puts it on their website: Azure Stack Edge acts as a network storage gateway and performs high-speed transfers to Azure. The best part? You can manage Edge from the Azure portal. Sounds easy, right?
Hold on. There's more to it. It's – again – Kubernetes. Edge runs containers to enable data analysis, perform queries, and filter data at edge locations. To this end, Edge supports Azure VMs and Azure Kubernetes Service (AKS) clusters that you can run containers on. Edge, for that matter, is quite a sophisticated solution, since it also integrates with Azure Machine Learning (AML). You can build and train machine learning models in Azure, run them on Azure Stack Edge, and send the datasets back to Azure. For this, the Edge solution is equipped with the Field Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs) required to speed up building and (re)training the models.
Having said this, the obvious use case comes with the implementation of data analytics and machine learning where you don't want raw data to be uploaded to the public cloud straight away.
There's one more feature that needs to be discussed at this point and that's Azure Arc, launched at Ignite 2019. Arc allows you to connect non-Azure machines to Azure and manage these non-Azure workloads as if they were fully deployed on Azure itself.
If you want to connect a machine to Arc, you need to install an agent on that machine. It will then get a resource ID and become part of a resource group in your Azure tenant. However, this won't happen until you've configured some settings on the network side of things and registered the appropriate resource providers (Microsoft.HybridCompute and Microsoft.GuestConfiguration). Yes, this does require proficient PowerShell skills. If you perform the actions successfully, then you can have non-Azure machines managed through Azure. In practice, this means that you can add tagging and policies to these workloads. That sort of defines the use case: managing the non-Azure machines in line with the same policies as the Azure machines. These do not necessarily have to be on premises. That's likely the best part of Arc: it also works on VMs that are deployed in AWS.
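What uniform governance across Arc-connected machines boils down to can be sketched as a simple policy check over a mixed fleet. The machine records and tag names below are hypothetical, and the check itself is an illustration of the idea, not Azure Policy's actual evaluation engine:

```python
# Sketch of uniform governance over a mixed fleet: once non-Azure
# machines are projected into Azure through Arc, the same tagging
# policy can be evaluated against all of them, regardless of where
# they physically run. Fleet records and tag names are hypothetical.

REQUIRED_TAGS = {"owner", "environment", "cost-center"}

fleet = [
    {"name": "vm-azure-01",  "origin": "azure",         "tags": {"owner", "environment", "cost-center"}},
    {"name": "vm-aws-07",    "origin": "aws-via-arc",   "tags": {"owner"}},
    {"name": "srv-onprem-3", "origin": "onprem-via-arc", "tags": {"owner", "environment", "cost-center"}},
]

def non_compliant(machines):
    """Names of machines missing one or more required tags."""
    return [m["name"] for m in machines if not REQUIRED_TAGS <= m["tags"]]

print(non_compliant(fleet))  # ['vm-aws-07']
```

Note that the check does not look at the `origin` field at all – that is precisely the value of Arc: one policy, one evaluation, for Azure and non-Azure machines alike.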
With that last remark on Arc, we've come to the core of the multi-cloud discussion, and that's integration. All of the platforms that we studied in this chapter have advantages, disadvantages, dependencies, and even specific use cases. Hence, we see enterprises experimenting with and deploying workloads in more than one cloud. That's not just to avoid cloud vendor lock-in: it's mainly because there's not a "one size fits all" solution.
In short, it should be clear that it's really not about cloud first. It's about getting cloud fit, that is, getting the best out of an ever-increasing variety of cloud solutions. This book will hopefully help you to master working with the mix of these solutions.
In this chapter, we've learned what a true multi-cloud concept is. It's more than a hybrid platform: it comprises different cloud solutions such as IaaS, PaaS, SaaS, containers, and serverless in a platform that we can consider a best-of-breed mixed zone. You are able to match a solution to the given business strategy. Here, enterprise architecture comes into play: business requirements come first at all times and are enabled by the use of data, applications, and lastly the technology. Enterprise architecture methodologies such as TOGAF are good frameworks for translating a business strategy into an IT strategy, including roadmaps.
In the last section, we looked at the various main players in the field of private and public clouds. Over the course of this book, we will further explore the portfolios of these providers and discuss how we can integrate solutions, really mastering the multi-cloud domain.
In the next chapter, we will further explore the enterprise strategy and see how we can accelerate business innovation using multi-cloud concepts.