Chapter 1
Cloud Concepts, Architecture, and Design

Domain 1 establishes the foundation of knowledge required to adequately secure cloud environments, including an overview of key architectural concepts and security principles applied to cloud environments. This information is fundamental for all other topics in cloud computing. A set of common definitions, architectural standards, and design patterns will put everyone on the same level when discussing these ideas and using the cloud effectively and efficiently.

Understand Cloud Computing Concepts

The first task is to define common concepts. In the following sections, we provide common definitions for cloud computing terms and discuss the various participants in the cloud computing ecosystem. We also examine the characteristics of cloud computing, answering the question “What is cloud computing?”, and the technologies that make it possible.

Cloud Computing Definitions

Cloud computing is a quickly evolving practice, with new concepts and paradigms introduced at a rapid pace. Cloud computing itself represented a major shift from traditional on-premises infrastructure, data centers, and colocation facilities, and applying security to these new environments requires a firm understanding of core concepts.

Cloud Computing

The National Institute of Standards and Technology (NIST) Special Publication (SP) 800-145 provides a widely accepted definition of cloud computing: “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources … that can be rapidly provisioned and released with minimal management effort or service provider interaction.” The document formalizes definitions of cloud computing and services, including the five essential characteristics that define a cloud service, cloud service categories, and deployment models. These are discussed in more detail later in this chapter.

Cloud computing expands earlier concepts of distributed and parallel computing, even when done over a network, in a number of critical ways. It is a model that provides access to computing resources in a simple, self-service way. Although an organization or individual may negotiate a contract, rates, and service levels with a cloud provider, once access is granted a true cloud computing environment typically does not require ongoing involvement by the cloud service provider (CSP).

Cloud computing requires a network in order to provide broad access to infrastructure, development tools, and software solutions. It requires some form of self-service to allow users to reserve and access these resources at times and in ways that are convenient to the user.

The provisioning of resources needs to be automated so that human involvement is limited. Any user should be able to access their account and procure additional resources or reduce current resource levels on their own, without the need for manual work by CSP staff.

An example is Dropbox, a cloud-based file storage system. An individual creates an account, chooses the level of service they want or need, and provides payment information. Once this is done, the service and storage are immediately available. A company might negotiate contract rates more favorable than are available to the average consumer, but once the contract is in place, the company's employees can access this resource without the need for any additional provisioning by Dropbox staff.
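
To make this concrete, the following is a minimal sketch of self-service provisioning using the AWS SDK for Python (boto3). The bucket name is a hypothetical placeholder and the snippet assumes AWS credentials are already configured; the point is that storage is provisioned by an API call, with no CSP staff involved.

    # On-demand self-service: storage is provisioned programmatically and
    # is available immediately, without any manual work by the CSP.
    import boto3

    s3 = boto3.client("s3", region_name="us-east-1")
    s3.create_bucket(Bucket="example-self-service-bucket")  # placeholder name
    print("Storage provisioned and immediately available.")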

A final important concept in cloud computing deals with the financial accounting for cloud services. While this is typically outside the role of the security practitioner, it is a key driver for many organizations adopting cloud computing and is helpful to understand. Purchasing servers and building data centers to house them are capital expenditures (CapEx), while pay-as-you-go services like cloud computing are operating expenditures (OpEx). In many jurisdictions OpEx spending is preferable due to more favorable tax treatment: OpEx can typically be deducted from taxable income in full in the year it is incurred, reducing the organization's tax bill, while CapEx must usually be depreciated over several years.
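
As a simplified, purely illustrative calculation (the figures and tax rate are hypothetical, and real depreciation rules vary), the following sketch shows why the same spending produces a larger first-year deduction as OpEx than as CapEx:

    # Hypothetical: $100,000 of IT spending at a 25% tax rate.
    spend = 100_000
    tax_rate = 0.25

    # CapEx is typically depreciated, e.g., straight-line over 5 years.
    capex_first_year_deduction = spend / 5
    # OpEx is deducted in full in the year it is incurred.
    opex_first_year_deduction = spend

    print(capex_first_year_deduction * tax_rate)  # 5000.0 in first-year tax savings
    print(opex_first_year_deduction * tax_rate)   # 25000.0 in first-year tax savings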

Service and Deployment Models

There are three service models and four deployment models in which cloud services can be provisioned. These are discussed in detail later in this chapter, but a basic understanding is essential to begin exploring other cloud concepts.

The three service models are software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS). The key differences between these models include the level of control the consumer has over the cloud service as well as the level of effort required to use the service.

There are four deployment models for cloud services: public, private, community, and hybrid clouds. These define who owns and controls the underlying infrastructure of a cloud service and who can access a specific cloud service. Additionally, organizations may adopt a multi-cloud deployment strategy, combining two or more of these deployment models across their technology stack.

These concepts will be discussed further in the “Cloud Service Categories” and “Cloud Deployment Models” sections later in this chapter.

Cloud Computing Roles and Responsibilities

There are a number of roles in cloud computing, and understanding each one clarifies the cloud service models, deployment models, security responsibilities, and other aspects of cloud computing.

Cloud Service Customer

The cloud service customer (CSC) is the company or person purchasing the cloud service, or in the case of an internal customer, the employee using the cloud service. For example, a SaaS CSC would be any individual or organization that subscribes to a cloud-based email service. A PaaS CSC would be an individual or organization subscribing to a PaaS resource, such as a development platform. An IaaS CSC is typically a system administrator, or an organization, that needs infrastructure to support its enterprise. In each case, the CSC consumes the services provided by the cloud service provider.

Cloud Service Provider

The CSP is the company or other entity offering cloud services. CSPs may be public companies providing cloud services to any customer but can also be an internal IT department that provisions cloud platforms to other units of the organization. A CSP may offer SaaS, PaaS, or IaaS services in any combination. For example, major CSPs such as AWS, Microsoft Azure, and Google Cloud offer both PaaS and IaaS services. Major SaaS CSPs include companies like Salesforce and Dropbox, as well as Microsoft 365 and Google Workspace, which are SaaS offerings built on top of the same cloud components that make up Azure and Google Cloud.

As the cloud environment becomes more complicated, with hybrid clouds and community clouds that federate across multiple cloud environments, the responsibility for security becomes ever more complex. As the customer owns their data and processes, they have a responsibility to review the security policies and procedures of any and all CSPs in use at their organization, and the federated responsibilities that may exist between multiple CSPs and data centers.

Cloud Service Partner

A cloud service partner is a third party offering a variety of cloud-based services (infrastructure, storage, application, and platform services) built on an associated CSP. An AWS cloud service partner, for example, uses AWS to provide its services. The cloud service partner can provide customized interfaces, load balancing, and a variety of other services. Working with a partner may be an easier entrance to cloud computing, as an existing vendor may already be a cloud service partner. The partner has experience with the underlying CSP and can introduce a customer to the cloud more easily.

The cloud partner network is also a way to extend the reach of a CSP. The cloud service partner will brand its association with the CSP. Some partners align with multiple CSPs, giving the customer a great deal of flexibility. Partners can extend the value of cloud services by selling additional services, support, management, and consulting to organizations that lack these skills or capabilities.

Cloud Service Broker

A cloud service broker is similar to a broker in any industry. Companies use a broker to find solutions to their cloud computing needs. The broker will package services in a manner that benefits the customer. This may involve the services of multiple CSPs. A broker is a value-add service and can be an easy way for a company to begin a move into the cloud. A broker adds value through aggregation of services from multiple parties, integration of services with a company's existing infrastructure, and customization of services that a CSP cannot or will not make. They may also be able to offer discounts due to volume purchasing of cloud services, which is beneficial to smaller organizations that lack the bargaining power of a high-volume purchaser.

Just as with any vendor, it is crucial to vet the capabilities and reputation of a cloud service broker (CSB) before engaging its services. Each CSB serves a specific market, utilizing different cloud technologies. It is important that the CSBs selected are a good fit for the customer organization and its cloud strategy. While this is typically an operational concern rather than a security one, inadequate capabilities in the cloud solution can give rise to security problems if needed security controls cannot be implemented.

Regulator

Cloud computing itself is not heavily regulated; like most IT environments, it is merely a tool. The use of those tools, specifically the processing of data, is regulated. Examples of regulatory frameworks that govern cloud data processing include the European Union General Data Protection Regulation (EU GDPR), the Gramm-Leach-Bliley Act (GLBA), and the Personal Information Protection and Electronic Documents Act (PIPEDA), which are privacy laws in Europe, the United States, and Canada, respectively. While none explicitly identify cloud computing, they do require organizations that collect, process, or store data to properly safeguard it. Cloud customers must be aware of any regulations that affect their data or business processes and choose or configure CSP resources that meet those regulatory requirements.

Common regulatory issues that impact cloud usage include security of data at rest and in transit. When looking at data in a cloud environment, the broad network accessibility characteristic usually requires the use of the Internet to interact with systems, so these regulations demand adequate encryption to protect the data as it moves into and out of the cloud. Similarly, the shared multitenant nature of cloud services and involvement of third-party administrators working for the CSP demand proper controls for the data at rest; encryption is a common control that can mitigate the risk of unauthorized disclosure so long as keys are properly managed.
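
As a minimal sketch of the data-at-rest control described above (using the third-party cryptography package; a real deployment would store the key in a key management system rather than generating it inline), client-side encryption ensures that data handed to a CSP is unreadable without the customer-held key:

    # Encrypt data before it leaves the organization; the CSP stores only
    # ciphertext, so proper key management keeps the data confidential.
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()  # in practice, store in a key management system
    cipher = Fernet(key)

    record = b"regulated personal data"
    ciphertext = cipher.encrypt(record)   # safe to store in the cloud
    assert cipher.decrypt(ciphertext) == record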

Regulatory bodies have published guidance for organizations utilizing cloud computing services to handle sensitive data, and CSPs share responsibility for providing service offerings that are compliant with their customers' regulatory requirements. For example, the major CSPs offer configurations of many services that are compliant with various regulations, though there may be additional costs associated with these specialized offerings. Ultimately it is the responsibility of the consumer to identify all requirements associated with their data and choose, architect, and maintain cloud solutions in line with those requirements.

Shared Responsibility Model

Depending on the service provided (SaaS, PaaS, or IaaS), the responsibilities of the CSP vary considerably. In all cases, security in the cloud is a shared responsibility between the CSP and the customer. This shared responsibility is a continuum, with the customer taking a larger security role in an IaaS service model and the CSP taking a larger role in the security of a SaaS service model. The responsibilities of a PaaS fall somewhere in between. But even when a CSP has most of the responsibility in a SaaS solution, the customer is ultimately responsible for the data and processes they put into the cloud.

The major CSPs publish their variations of a shared responsibility model detailing the assignment of various aspects of security to the CSP, the CSC, or both. In most cases, the CSP is solely responsible for operational concerns such as environmental controls within the data center, as well as security concerns such as physical access controls. Customers using the cloud service are responsible for implementing data security controls, such as encryption, that are appropriate to the type of data they are storing and processing in the cloud. Some areas require action by both the provider and customer, so it is crucial for a CCSP to understand which cloud service models are in use by the organization and which areas of security must be addressed by each party. The generic model in Table 1.1 identifies key areas of responsibility and ownership in various cloud service models.

TABLE 1.1 Cloud Shared Responsibility Model

C = Customer, P = Provider

Responsibility                   IaaS   PaaS   SaaS
Data classification              C      C      C
Identity and access management   C      C/P    C/P
Application security             C      C/P    C/P
Network security                 C/P    P      P
Host infrastructure              C/P    P      P
Physical security                P      P      P

A variety of CSP-specific documentation exists to define shared responsibility in each CSP's offerings, and a CCSP should be familiar with the particulars of the provider their organization is utilizing. The following is a brief description of the shared responsibility model for several major CSPs and links to further resources:

  • Amazon Web Services (AWS): Amazon identifies key differences for responsibility “in” the cloud versus security “of” the cloud. Customers are responsible for data and configuration in their cloud apps and architecture, while Amazon is responsible for shared elements of the cloud infrastructure including hardware, virtualization software, environmental controls, and physical security.

    More information can be found here: aws.amazon.com/compliance/shared-responsibility-model.

  • Microsoft Azure: Microsoft makes key distinctions by the service model and specific areas such as information and data and OS configuration. Customers always retain responsibility for managing their users, devices, and data security, while Microsoft is exclusively responsible for physical security. Some areas vary by service model, such as OS configuration, which is a customer responsibility in IaaS but a Microsoft responsibility in SaaS.

    More information can be found here: docs.microsoft.com/en-us/azure/security/fundamentals/shared-responsibility.

  • Google Cloud Platform (GCP): Google takes a different approach with a variety of shared responsibility documentation specific to different compliance frameworks such as ISO 27001, SOC 2, and PCI DSS. The same general rules apply, however: customer data security is always the customer's responsibility, physical security is always Google's responsibility, and some items are shared depending on what service offerings are utilized.

    More information can be found here: cloud.google.com/security.

Key Cloud Computing Characteristics

The NIST SP 800-145 definition of cloud computing describes certain characteristics that must be present for an IT service to be considered a cloud service. Not every third-party solution is a cloud solution. Understanding the key characteristics of cloud computing will allow you to distinguish between cloud solutions and noncloud solutions. This is important as these characteristics result in certain security challenges that may not be shared by noncloud solutions.

On-Demand Self-Service

The NIST definition of cloud computing identifies an on-demand service as one “that can be rapidly provisioned and released with minimal management effort or service provider interaction.” This means the user must be able to provision these services simply and easily when they are needed. If you need a Dropbox account, you simply set up an account and pay for the amount of storage you want, and you have that storage capacity immediately. If you already have an account, you can expand the space you need by simply paying for more space. The access to storage space is on demand; neither creating an account nor expanding the amount of storage available requires the involvement of people from the CSP. This capability is automated and provided via a dashboard or other simple interface.

On-demand self-service offers advantages of speed and flexibility compared to traditional IT services that required lengthy provisioning processes. However, this ease of use can facilitate the poor practice known as shadow IT. Any individual, team, or department can bypass company policies and procedures that handle the provisioning and control of IT services. A team that wants to collaborate can choose and provision OneDrive, Dropbox, SharePoint, or another service to facilitate collaboration. This can lead to sensitive data being stored in locations that do not adhere to required corporate controls and places the data in locations the larger business is unaware of and cannot adequately protect.

In the past, provisioning IT resources involved significant spending, but the pricing of cloud services may fall below spending thresholds that require reviews and approvals. Large projects typically require some reviews and approvals from departments such as finance, accounting, IT, security, or vendor management. Setting up a cloud service is typically much cheaper and can be done using a credit card, meaning the new IT service circumvents processes designed to evaluate and mitigate security risks.

If this behavior is allowed to proliferate, the organization can lose control of its sensitive data and processes. For example, the actuary department at an insurance company may decide to create a file-sharing account on one of several available services. As information security was not involved, company policies, procedures, risk management, and controls programs are not followed. As this is not monitored by the security operations center (SOC), a data breach may go unnoticed, and the data that gives the company a competitive advantage could be stolen, altered, or deleted. Counterintuitively, shadow IT can also lead to increased spending. If all departments set up and maintain their own cloud environments, the organization loses the ability to negotiate lower rates in exchange for volume purchasing, and different groups may even pay for the same services, potentially doubling costs.

Broad Network Access

Cloud services assume the presence of a network. For public and community clouds, this is the Internet. For a private cloud, it could be the corporate network—generally an IP-based network—and possibly the Internet and a secure remote access method such as a VPN. In either case, cloud services are not local solutions stored on your individual computer. They are solutions that require the use of a network to access services hosted in the cloud. Without broad and ubiquitous network access, the cloud becomes inaccessible and is no longer useful.

Not all protocols and services on IP-based networks are secure. Part of the strategy for implementing a secure cloud solution is to choose secure protocols and services. For example, Hypertext Transfer Protocol (HTTP) and File Transfer Protocol (FTP) should not be used to move data to and from cloud services, as they transmit data unencrypted. HTTP Secure (HTTPS), Secure FTP (SFTP), and other encrypted transmission methods should be used so that data in motion may be intercepted but not read.
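
As a brief illustration (the URL is a placeholder), the widely used Python requests library verifies TLS certificates by default, so simply choosing an https:// endpoint gives encrypted data in motion:

    # HTTPS: the connection is encrypted and the server certificate is
    # verified by default. An http:// URL would send the same data in
    # the clear, where it could be intercepted and read.
    import requests

    response = requests.get("https://api.example.com/data", timeout=10)  # placeholder URL
    response.raise_for_status()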

If you are able to access the cloud service and reach your data from anywhere in the world, so can an attacker. The requirement for identification and authentication becomes more important in this public-facing environment. The security of accessing your cloud services over the Internet can be improved in a number of ways, including stronger passwords, multifactor authentication (MFA), and virtual private networks (VPNs). Because these systems are available over public networks and security is shared between the CSP and the customer, such additional safeguards become even more important. For clouds that require remote access, traditional security models that assume a secure perimeter are no longer applicable. This drives new requirements for network security such as zero trust architecture, which is discussed later in this chapter.
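
One common MFA factor is a time-based one-time password (TOTP). The following minimal sketch uses the third-party pyotp package to show the idea; a real service would provision the shared secret to the user's authenticator app during enrollment:

    import pyotp

    secret = pyotp.random_base32()  # shared once with the user's authenticator app
    totp = pyotp.TOTP(secret)

    code = totp.now()         # the six-digit code the user's app displays
    assert totp.verify(code)  # the check the service performs at login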

Multitenancy

One way to get the improved efficiencies of cloud computing is through the sharing of infrastructure. CSPs provide a virtual set of resources including memory, computing power, and storage space, which customers share. This is known as a multitenant model, similar to an apartment building where tenants share resources and services but have their own dedicated space. Virtualization enables the appearance of single tenancy in a multitenancy situation. Ideally, each tenant's data remains private and secure in the same way that your belongings (data) in an apartment building remain secure and isolated from the belongings (data) of your neighbor. However, incorrect access settings and software flaws in virtualization software may be exploitable to grant unauthorized access.

In a multitenant model it is the responsibility of each tenant to exercise care to maintain the integrity and confidentiality of their own data. If your apartment door is left unsecured, any other tenant in the building could easily enter and steal your belongings. It is also necessary to consider the availability of the data, as the actions of the CSP or another tenant could make your data inaccessible for a time due to no fault of your own. A software upgrade that causes system outages could impact other users of shared infrastructure, just as a fire in one apartment could cause damage to surrounding apartments. A multitenant environment increases the importance of disaster recovery (DR) and business continuity (BC) planning; luckily, other aspects of cloud services make planning high-availability (HA) infrastructure easier and cheaper.

Rapid Elasticity and Scalability

In a traditional computing model, a company needs to plan and buy for anticipated infrastructure needs. If they estimate poorly, they will either have too little capacity, leading to loss of availability, or have excess capacity that represents wasted money. In a cloud solution, elastic infrastructure allows the service to grow or shrink as needed to support the customer's demand. If there is a peak in usage or resource needs, the service grows, or scales, to meet the demand. When usage falls back to normal levels, the resources are released. This supports a pay-as-you-go model, where a customer pays only for the resources they actually consumed rather than excess capacity for potential future needs.

For the CSP, this presents a challenge. The CSP must have the excess capacity to serve all their customers without having to incur the cost of the total possible resource usage. They must, in effect, estimate how much excess capacity they need to serve all of their customers. If they estimate poorly, the customer will suffer, and the CSP's customer base could decrease.

There is a cost to maintaining this excess capacity, and it must be built into the CSP's pricing model. In this way, all customers share the cost of the CSP maintaining some level of excess capacity. Customers still achieve savings, however, because they draw on that shared capacity only when they need it. For example, an online retail store is likely to need excess capacity during major holidays, while a tax preparer needs it at a different time of year. Both organizations can access the resources as their demand peaks, without having to pay for the full set of resources during nonpeak seasons.

In the banking world, a bank must keep cash reserves of a certain percentage so that it can meet the withdrawal needs of its customers. But if every customer wanted all of their money at the same time, the bank would run out of cash on hand. In the same way, if every customer's potential peak usage occurred at the same time, the CSP would run out of resources, leading to a loss of availability and unhappy customers.

The customer must also take care in setting internal limits on resource use. Proper architectural decisions as well as process and procedure are required to ensure that resources that are no longer needed are deprovisioned. Otherwise, the customer continues to pay for resources that are not serving any purpose. Some cloud service offerings provide automated scale-up and scale-down capabilities, but it is possible to design cloud architecture that mimics traditional servers in a data center with no automated scaling.
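
The following toy sketch illustrates the scaling logic described in this section; it is a conceptual reconstruction, not any CSP's actual autoscaler, and the thresholds and limits are arbitrary:

    # Elasticity in miniature: grow when demand peaks, shrink when demand
    # falls so the customer stops paying for idle capacity.
    def scale(instances: int, utilization: float) -> int:
        if utilization > 0.80 and instances < 20:  # peak: scale out
            return instances + 1
        if utilization < 0.30 and instances > 2:   # lull: scale in
            return instances - 1
        return instances                           # steady state

    print(scale(instances=4, utilization=0.95))  # 5
    print(scale(instances=4, utilization=0.10))  # 3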

Resource Pooling

In many ways, this is the core of cloud computing. Multiple customers share a set of resources including compute power, memory, storage, application services, etc. They do not each have to buy the infrastructure necessary to provide their IT needs. Instead, they share these resources with each other through the orchestration of the CSP. Everyone pays for what they need and use. Pooling these resources enables the other characteristics of cloud computing: self-service is possible because adding a new virtual server doesn't require a physical server to be installed and set up, and automating this based on demand is what enables elasticity.

This resource pooling presents some challenges for the cybersecurity professional, including issues of multitenancy as discussed earlier. A competitor or a rival can be sharing the same physical hardware. If the system, especially the hypervisor, is compromised, sensitive data could be exposed.

Resource pooling also implies that resources are allocated and deallocated as needed. The inability to ensure data erasure can mean that remnants of sensitive files could exist on storage allocated to another user. This increases the importance of data encryption and key management.

Measured Service

Metering service usage allows a CSP to charge for the resources used. In a private cloud, this can allow an organization to charge each department based on their usage of the cloud. For a public cloud, it allows each customer to pay for the resources used or consumed. With a measured service, everyone pays their share of the costs.

Measured service provides two key benefits. It is the foundation of shifting IT spending from CapEx to OpEx, and it provides additional visibility and transparency into actual IT needs. A CSP provides metrics on the services consumed, including network bandwidth, storage space, and computing power. This discrete measurement of services consumed is in contrast to estimating how much of a server's capacity is actually used and is beneficial for capacity planning.
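
A toy billing calculation makes the idea concrete (the usage figures and rates are hypothetical): metered consumption is multiplied by published rates, so each customer pays exactly their share.

    usage = {"compute_hours": 720, "storage_gb_months": 50, "egress_gb": 12}
    rates = {"compute_hours": 0.045, "storage_gb_months": 0.023, "egress_gb": 0.09}

    # Measured service: the bill is an itemized, auditable function of
    # actual consumption rather than estimated capacity.
    bill = sum(usage[item] * rates[item] for item in usage)
    print(f"Monthly charge: ${bill:.2f}")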

Building Block Technologies

These technologies are the elements that make cloud computing possible. Without virtualization, there would be no resource pooling, while advances in networking allow for ubiquitous access. Improvements in storage and databases allow remote access to virtual storage in a shared resource pool, and orchestration puts all the pieces together and allows organizations to utilize the various cloud computing services. The combination of these technologies allows better resource utilization and improves the cost structure of technology.

Virtualization

Virtualization allows the resources of a physical server to be shared among multiple virtual servers. Virtualization is not unique to cloud computing and can be used to share corporate resources among multiple processes and services, typically offering more efficient utilization of resources. For example, a single physical server can be used to host virtual machines (VMs) running an email server and a web server, saving the organization the cost of buying and running two physical machines. This resource sharing also makes it easier to move VMs between physical hardware, providing availability benefits.

Cloud computing takes the idea of server virtualization and expands it to virtualizing all aspects of an information system, including the basic infrastructure such as networking, compute, memory, physical data storage, data storage systems like databases, and even applications that traditionally ran on a user workstation. The CSP shares resources among a large number of services and customers (also called tenants). Each tenant has full use of their environment without knowledge of the other tenants. This increases the efficient use of the resources significantly.

Most CSPs have multiple locations providing the cloud services, and high-speed connectivity allows services and data to move seamlessly between locations. This allows the CSP to evenly distribute workloads, provides failover capabilities, and allows regulated customers to access cloud services in locations that meet their regulatory requirements.

The use of all-virtualized infrastructure can create some security and compliance concerns, such as data leaving a geographic area where it may not be governed by the same set of laws and regulations. These issues may be handled during contract negotiation, though most CSPs offer solutions designed with common regulations in mind. For example, AWS, Azure, and GCP all offer GDPR-compliant services that retain data in EU data centers and also offer solutions to the U.S. federal government that retain data only in U.S.-based data centers.

Virtualization relies on technology known as a hypervisor, which is software that governs access by VMs to the hardware resources. If the hypervisor is compromised, it could allow an attacker to gain access to other VMs running on the same hardware. This type of attack is known as an escape, and properly securing and patching the hypervisor is the responsibility of the CSP.

Early virtualization focused on creating multiple virtual computers on a single piece of physical hardware, which increased efficiency in resource utilization and offered portability for VMs. Containers are a more recent evolution of these virtualization concepts. A container, or containerized application, is an application packaged along with its required software dependencies and configuration information. A container platform, such as Docker, can be installed on any physical hardware and run any compatible containers. The containerized application is inherently more portable, as it can run on any platform so long as the container software is also installed.
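
As a minimal sketch of container portability (using the Docker SDK for Python, the docker package, and assuming a local Docker daemon is available), the same packaged application runs unchanged on any host with a container runtime:

    import docker

    client = docker.from_env()

    # The image bundles the application and its dependencies, so it runs
    # identically on a laptop, an on-premises server, or a cloud VM.
    output = client.containers.run("alpine:latest", "echo hello from a container", remove=True)
    print(output.decode())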

Storage

A variety of storage solutions allow cloud computing to work. Two of these are storage area networks (SANs) and network-attached storage (NAS). These and other advances in storage allow a CSP to offer flexible and scalable storage capabilities.

A SAN provides secure storage among multiple computers within a specific customer's domain. A SAN appears like a single disk to the customer, while the storage is spread across multiple locations. This is one type of shared storage that works across a network. SANs utilize block-level storage, where data being stored is broken down into blocks of uniform size. Blocks can be stored more efficiently than files due to their uniform size, and the SAN software is responsible for arranging all the needed blocks when a specific piece of data is requested.
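
A toy sketch of block-level storage follows; real systems use block sizes such as 4 KB and distribute blocks across many devices, but the principle of splitting data into uniform blocks and reassembling them on request is the same:

    BLOCK_SIZE = 4  # bytes; illustrative only

    data = b"data spread across uniform blocks"
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

    # The SAN software tracks where each block lives and reassembles
    # them in order when the data is requested.
    assert b"".join(blocks) == data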

Another type of networked storage is the NAS. This network storage solution uses TCP/IP and allows file-level access. A NAS appears to the customer as a single file system similar to the hard drive in a workstation. Many operating systems offer native support for NAS using a variety of formats.

The responsibility for choosing the storage technology lies with the CSP and will change over time as new technologies are introduced. These changes should be transparent to the customer—from the customer's perspective, the access speed, integrity of data, and allocated storage space are the important factors, not the underlying storage technology. The CSP is responsible for the security of the shared storage resource, while customers retain responsibility for securing the data they store in the cloud.

Shared storage can create security challenges if data remnants are present on a disk after it has been deallocated from one customer and allocated to another. A customer has no way to securely wipe the drives in use or physically destroy them; typically a CSP will offer some form of secure deletion. However, customers can utilize a practice known as crypto-shredding to make these fragments unusable if recovered, by encrypting data and securely destroying the key.
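
A minimal sketch of crypto-shredding (again using the cryptography package) shows why destroying the key is equivalent to destroying the data: any remnants left on reallocated storage are unreadable ciphertext.

    from cryptography.fernet import Fernet

    key = Fernet.generate_key()
    ciphertext = Fernet(key).encrypt(b"sensitive data written to shared storage")

    del key  # destroying every copy of the key "shreds" the data

    # Fragments of ciphertext remaining on the CSP's disks are now
    # computationally infeasible to recover.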

Networking

As all resources in a cloud environment are accessed through the network, a robust, available network is an essential element. The Internet is the network used by public and community clouds, as well as many private clouds. This network has proven to be widely available with broad capabilities. The Internet has become ubiquitous in society, allowing for the expansion of cloud-based services.

An IP-based network is only part of what is needed for cloud computing. Low latency, high bandwidth, and relatively error-free transmissions make cloud computing possible. The use of public networks also creates some security concerns. If access to cloud resources is via a public network, like the Internet, the traffic can be intercepted, and if data is transmitted in the clear, it can be read. The use of encryption and secure transport keeps the data in motion secure and cloud computing safer. Some CSPs even offer dedicated connectivity into the edge of their network for organizations with a high volume of sensitive data that prefer not to utilize the public Internet for connectivity.

Databases

Databases allow for the storage and organization of customer data. By using a database in a cloud environment, the administration of the underlying database becomes the responsibility of the CSP, including key tasks such as patching, tuning, and other database administrator services. The exception is IaaS, where the user is responsible for whatever database they install.

The other advantage of databases offered through a cloud service is the number of different database types and options that can be used together. While traditional relational databases are available, so are other types. By combining traditional databases with other data storage tools and large pools of storage and compute resources, organizations can implement data warehouses, data lakes, and other data storage strategies. The cost savings offered by the scale of cloud computing make big data applications such as these more affordable than they would otherwise be.

Orchestration

Cloud orchestration is the use of technology to manage cloud infrastructure. In a modern organization, there is a great deal of complexity, including a mix of on-premises infrastructure and multiple cloud services. Even small organizations are likely to have multiple cloud offerings, such as infrastructure hosted in a traditional CSP like AWS, GCP, or Azure, as well as SaaS applications used by the business like Google Workspace, GitHub, or Salesforce.

This complexity can lead to data being out of sync, processes being broken, and a fragmentation that leaves the IT department unable to keep track of all the cloud services, business processes, and data locations. Like the conductor of an orchestra, cloud orchestration tools keep all of these pieces working together, including data, processes, and application services. Orchestration is the glue that ties the pieces together through programming and automation, and it is valuable whether an organization runs a single cloud environment or a multi-cloud environment.

This is more than simply automating a few tasks. Automation is heavily used by cloud orchestration services to create one seemingly seamless organizational cloud environment. In addition to hiding much of the complexity of an organization's cloud environment, cloud orchestration can reduce costs, improve efficiency, and support the overall workforce.

The major CSPs and platform vendors provide orchestration tools, including IBM Cloud Orchestrator, Microsoft Operations Management Suite (OMS), and AWS CloudFormation. These offerings are typically best suited to managing their respective vendor's services. Organizations utilizing multiple CSPs can turn to multi-cloud orchestration tools, such as Kubernetes, to deploy infrastructure across various CSPs.
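
At the heart of most orchestration tools, including Kubernetes, is a desired-state reconciliation pattern. The following toy sketch illustrates the concept without modeling any particular tool:

    # Orchestration compares declared configuration against the observed
    # environment and corrects any drift.
    desired = {"web": 3, "worker": 2}  # declared in configuration
    actual = {"web": 1, "worker": 4}   # observed in the environment

    for service, want in desired.items():
        have = actual.get(service, 0)
        if have < want:
            print(f"{service}: launch {want - have} instance(s)")
        elif have > want:
            print(f"{service}: terminate {have - want} instance(s)")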

Describe Cloud Reference Architecture

The purpose of a reference architecture (RA) is to allow a wide variety of cloud vendors and services to be interoperable and to provide consumers with guidance on optimal deployment of resources in the cloud. An RA creates a framework or mapping of cloud computing activities and cloud capabilities to allow the services of different vendors to be mapped and potentially work together more seamlessly. An example of this approach is the seven-layer Open Systems Interconnection (OSI) model of networking, which allows interoperability of networking protocols between different operating systems. As companies engage a wide variety of cloud solutions from multiple vendors, interoperability is becoming more important, and the reference architecture makes the process easier.

NIST provides a cloud computing reference architecture in SP 500-292, which was based on a Cloud Security Alliance (CSA) working group project for cloud enterprise architecture. CSA has continued to update material related to this RA, including mapping control frameworks to it, providing guidance to security practitioners for securely deploying cloud services.

Some RA models like NIST are role-based and describe the activities needed to provision, use, and maintain cloud services. The NIST RA is intended to be vendor neutral and defines five roles: cloud consumer, cloud provider, cloud auditor, cloud broker, and cloud carrier. Other RAs, such as the IBM Cloud Computing Reference Architecture (CCRA), are layer-based, although they also identify key activities performed by cloud provider and consumer.

Cloud Computing Activities

Some organizations will be a mix of cloud consumer and cloud provider. Internal IT departments may migrate legacy computing environments to a private cloud model for consumption by the organization's users, while other services will be consumed strictly from external cloud providers; as an example, an organization might retain its on-premises Exchange environment while also consuming Microsoft 365 SaaS for collaboration. Cloud-native organizations are those with no traditional, on-premises IT environments. New organizations, such as startups, often pursue this model due to the ease of use, while some older organizations have migrated completely to the cloud for cost savings. Regardless of the organization's status, there are several key activities related to each role, which are detailed here:

  • Cloud consumer: The consumer procures and uses, or consumes, the cloud services. This involves reviewing available providers and services to determine which one best fits the organization's needs and then entering into a relationship, usually defined by a contract, with the CSP. Once the service is active, the consumer is responsible for setting up accounts, configuring the service, and then actually using it. These activities will be different across different CSPs and cloud service models, but there are some common activities. For a SaaS consumer, the typical end-user management is required, including provisioning accounts and configuring settings such as multifactor authentication (MFA). End users consuming the service will perform whatever activity the SaaS platform enables, such as collaboration and communication. In a PaaS environment the customer activities center on software development efforts, business intelligence, and application deployment. IaaS customers focus on activities such as business continuity and disaster recovery and building higher-level services on top of the basic infrastructure such as storage and compute.
  • Cloud service provider: The provider makes the service available. These activities include service deployment, orchestration, and management as well as security and privacy. The CSP is also responsible for constructing physical infrastructure such as data centers and computer rooms, deploying physical security controls including fences and security guards, and monitoring and maintaining environmental controls such as air handling and fire suppression.
  • Cloud auditor: An auditor is an entity capable of independent examination and evaluation of cloud service controls and is usually charged with issuing a report on the effectiveness of those controls. These activities are especially important for consumers with contractual or regulatory compliance obligations, as the auditor's independent review of controls provides assurance that the cloud service is properly secured. Audits are usually focused on compliance, security, or privacy.
  • Cloud broker: This entity is involved in three primary activities: aggregation of services from one or several CSPs, integration with consumers' existing on-premises infrastructure, and customization of services.
  • Cloud carrier: The carrier provides the network and telecommunication connectivity that permits the delivery and use of cloud services. This role is often performed by an Internet service provider (ISP) along with other standard Internet access, though dedicated cloud carrier functions may be useful for organizations that require high-security communications or dedicated connectivity.

Cloud Service Capabilities

Capability types are another way to look at cloud service models. In this view, we look at the capabilities provided by each model. The three service models are SaaS, PaaS, and IaaS, and each provides a different level and type of service to the customer. The shared security responsibilities differ for each type as well.

Application Capability Types

Application capabilities include the ability to access an application over the network from multiple devices and from multiple locations. Application access may be made through a web interface, through a thin client, or in some other manner. As the application and data are stored in the cloud, the same data is available to a user regardless of the device they connect from. Depending on the end user's authorization and device, the look of the interface may be different. Administrators will typically see the full set of features, while users on mobile devices may see a subset of functionality tailored to a smaller screen.

Users do not have the capability to control or modify the underlying cloud infrastructure, although they may be able to customize their interface to the cloud solution. This often takes the form of customizing or white-labeling the application so it looks like a part of the organization's branded tools, though more advanced customization is also possible. This can include defining available data fields, access to different modules within an application, and integration with other tools or platforms. The organization typically does not have to be concerned with the different types of endpoints in use in their organization, so long as the devices are capable of running a modern web browser. Supporting different device types is the responsibility of the application service provider, and support for the organization's installed technology is often a key decision point for acquiring the cloud service.

Platform Capability Types

A platform has the capability of developing and deploying solutions through the cloud. These solutions may be developed with available tools, may be acquired solutions that are delivered through the cloud, or may be solutions that are acquired and customized prior to delivery. The user of a platform service may modify the solutions they deploy, particularly the ones they develop and customize. However, the user has no capability to modify the underlying infrastructure. A standard build of Windows Server deployed on a cloud machine is an example of a PaaS solution. The consumer might modify which software components are installed, such as a web or file server, but they do not have the ability to specify the underlying infrastructure.

What the user gets in a platform service are tools that are specifically tailored to the cloud environment. In addition, the user can experiment with a variety of platform tools, methods, and approaches to determine what is best for a particular organization or development environment without the expense of acquiring all those tools and the underlying infrastructure costs. It provides a development sandbox at a lower cost than doing it all in house. Some platform services focus on vital business functions like big data analytics and business intelligence. In these cases the consumer gets access to a shared environment needed to perform the tasks without the up-front cost of building the massive storage and processing infrastructure needed.

Infrastructure Capability Types

An infrastructure customer cannot control the underlying hardware but has full control over the operating system as well as tools, applications, and solutions installed on top of their infrastructure. The consumer can also provision infrastructure tailored to their needs, including the amount of computing power, storage space, and network bandwidth. This is similar to building out legacy IT, except the infrastructure is virtualized instead of physical.

This capability provides the customer with the ability to quickly spin up an environment and also quickly deprovision when it is no longer needed. This combines the characteristics of elasticity and self-service to provide lower-cost access to the computing capabilities.

Cloud Service Categories

There are three primary cloud service categories: SaaS, PaaS, and IaaS. In addition, other service categories are sometimes suggested, such as storage as a service (STaaS), database as a service (DBaaS), and even everything as a service (XaaS). Marketing combined services under an -aaS name is also popular, as in penetration testing as a service (PTaaS). However, the three fundamental service models for delivering cloud computing are used across providers and are defined in NIST SP 800-145.

Security of systems and data is a shared responsibility between the customer and service provider. The point at which the provider's responsibility ends and the consumer's responsibility begins varies by service category, and the major CSPs publish their shared responsibility model mapping different services they offer to the service categories.

If you are an end user, you are likely using a SaaS solution. If you are a developer, you may be offering a SaaS solution you developed in-house or delivered to your customers using a PaaS development environment. If you are building and managing entire infrastructure in the cloud similar to an on-premises IT deployment, you are using IaaS.

Software as a Service

SaaS is the most common cloud service that most people have experience with. This is where we find the end user performing common tasks like collaborating in Google Docs, sharing files via Dropbox, or documenting project work status by updating a Jira ticket. SaaS is usually subscription-based and billed by the number of users or licenses acquired, and SaaS is typically very easy to set up and use. Some SaaS providers offer site licenses for large volumes of users.

Consumer-configurable security in a SaaS environment is typically limited to application settings and user access controls. Security of the underlying infrastructure from the virtual servers to operating systems is maintained by the provider, though some limited options may be available to the consumer. For example, the deployment of a major upgrade to a SaaS application might be configurable so that the downtime required for the upgrade does not impact system availability. The amount of control over security will vary by the CSP, the service offering, and often the size of the contract.

Platform as a Service

PaaS is the domain of developers. With a PaaS solution, the service provider is responsible for infrastructure, networking, virtualization, compute, storage, and operating systems. Everything built on top of that is the responsibility of the developer and their organization. Many PaaS service providers offer tools that may be used by developers to create their own applications, leaving choices about how tools are used and configured to the developers and their organizations. PaaS offers cost savings over building and maintaining traditional infrastructure and speeds development activities by providing ready-built platforms for developers to deploy applications.

With a PaaS solution, a developer can work from any location with an Internet connection. The CSP is responsible for maintaining the security of the platform, such as patching and updating the services provided rather than requiring internal IT staff to manage these tasks. Major CSPs offer PaaS solutions for common operating systems and databases and support multiple programming languages for custom development.

Infrastructure as a Service

IaaS is where we find the system administrators (sysadmins). In a typical IaaS offering, the service provider is responsible for provisioning hardware, networking, and storage infrastructure, and for exposing this hardware through virtualization. Consumers use tools provided by the CSP such as a web console or command line to create and maintain infrastructure, and the CSP's systems allocate resources from the virtualized pool as needed. The sysadmin is responsible for everything built on top of the virtualized infrastructure, including the operating system, developer tools, middleware applications, and end-user applications as needed.
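
As an illustrative IaaS sketch using boto3 (the AMI ID is a hypothetical placeholder and credentials are assumed to be configured), the consumer requests virtual infrastructure and the CSP allocates it from the shared pool:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # placeholder image ID
        InstanceType="t3.micro",
        MinCount=1,
        MaxCount=1,
    )
    # From here on, OS patching, middleware, and applications on this
    # instance are the consumer's responsibility.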

IaaS provides the most flexibility to build and deploy computing resources but also requires the most administrative effort from the consumer. It mirrors processes needed for traditional infrastructure but provides additional flexibility and cost savings due to the shared nature of cloud computing.

Cloud Deployment Models

There are four cloud deployment models defined in NIST SP 800-145, and a fifth model is emerging as organizations utilize multiple cloud environments at either an application or organization level. Each deployment model has advantages and disadvantages for both cost and security considerations. The defining element of each model is who owns the cloud and who can access the cloud—or at least, who controls access to the cloud.

Public Cloud

In a public cloud, anyone with access to the Internet may access the resources provided, usually through a subscription-based service. The resources and application services are provided by third-party service providers, and the systems and data reside on third-party servers. For example, Dropbox provides a file storage product to end users. The details of how Dropbox provides this service are not exposed directly to the end users, and for the customer it is simply a publicly available cloud data storage service that can be consumed using a PC or mobile device, or even integrated into an application.

There are concerns with privacy and security in a public cloud. While this was a major concern in the past that could prevent organizations from utilizing public cloud services, advances in regulation and service offerings have made public cloud a viable option for virtually all organizations. The responsibility for data privacy and security remains with the data owner, who is typically the customer, which includes the decision about which cloud service model to utilize. Concerns about reliability can sometimes be handled contractually through the use of a service-level agreement (SLA); however, most CSPs offer standard uptime SLAs that are robust enough for a majority of organizations. This is also a key part of the shared responsibility model, as different services provide different SLAs, and it is the responsibility of the consumer to architect their application and use of cloud services to meet their availability requirements.

Concerns also exist about vendor lock-in and access to data if the service provider goes out of business or is breached. Though rare, temporary disruptions have affected even the major CSPs, and each offers customized service offerings that can make it difficult or impossible to move an application hosted by one CSP to another. A more frequent risk is the retirement or removal of certain services, which is a business decision made by the CSP. If you rely on such a service, you likely have no control over when and how it is retired, potentially creating availability issues and unexpected effort to upgrade or migrate.

Private Cloud

A private cloud is built in the same manner as a public cloud, but the difference is in ownership. A private cloud belongs to a single company and contains data and services for use by that company, meaning there is no subscription or access for the general public. In this case, the infrastructure may be built internally or hosted on third-party servers. Private clouds continue the trend toward virtualization and delivery of IT shared services across an organization and are often driven by cost measures designed to maximize efficiency.

A private cloud is usually more customizable, since the provider and consumer are the same entity. This offers benefits for access control, security, and privacy. A private cloud is also generally more expensive, as it requires building out the infrastructure required. There are no other customers to share the infrastructure costs, so the cost of providing excess capacity is not shared. However, in a very large organization a private cloud might be able to more evenly share costs across different units.

A private cloud may not save on infrastructure costs, but it provides cloud services to the company's employees in a more controlled and secure fashion. The major cloud vendors provide both a public cloud and the ability for an organization to build virtual private cloud environments, and some have dedicated private cloud environments for specific customers such as government.

The primary advantage to a private cloud is security and control, while the primary disadvantage is cost. With more control over the environment and only one customer, it is easier to avoid security issues of multitenancy, and issues such as scheduling patches or maintenance downtime are easier. When the cloud is internal to the organization, secure data destruction is also a possibility because the organization retains physical control of its infrastructure.

Community Cloud

A community cloud falls somewhere between public and private clouds. The cloud is built for the needs of multiple organizations, typically in the same industry or with shared interests. Examples include clouds built for groups of banks or regional governments, or a cloud hosted by a game manufacturer for a specific community of gamers with specific hardware. Universities often set up consortiums for research, and this can be facilitated through a community cloud. Structured like public and private clouds, the infrastructure may be hosted by one of the community partners or by a third party. Access is restricted to members of the community and may be subscription based.

While a community cloud can facilitate data sharing among similar entities, each remains independent and is responsible for what it shares with others. As in any other model, the owner of the data remains responsible for its privacy and security, sharing only what is appropriate, when it is appropriate. Because of the smaller scale compared to a public cloud, a community cloud may be more expensive. This may be offset by the shared interests or requirements of the community that can be implemented in the cloud, such as enhanced privacy.

Hybrid Cloud

A hybrid cloud is a combination of one or more cloud deployment models and is often a combination of private and public cloud. This offers additional flexibility and scalability for different types of computing needs. For example, high-sensitivity data requires additional security controls that a public cloud cannot offer, while lower-sensitivity data can be handled in a public cloud at much lower cost. In this arrangement, sensitive data like intellectual property would utilize only the private cloud, while the organization's external-facing systems like email and website hosting can take advantage of public cloud cost savings.

When an organization has highly sensitive information, the additional cost of a private cloud is warranted, since the potential risk impact is greater. A private cloud still offers benefits of broad network access and resource pooling but provides more control to the organization over security controls such as physical access, data destruction, and system access.

Most organizations will also have less sensitive information, such as marketing materials, emails, and public information on a company website. A public cloud's cost benefits likely outweigh the minor increase in risk inherent in giving up physical control of infrastructure, so the business decision will likely be to take advantage of these cost savings.

In a hybrid model, the disadvantages and benefits of each type of cloud deployment are the same for the different elements that make up the organization's hybrid cloud environment. Additionally, the added complexity of managing multiple clouds is itself a risk, so a clear justification needs to exist. Cloud orchestration can be useful to make the job of managing a hybrid cloud environment easier, particularly ensuring that configurations are uniformly applied across different cloud services.

Multi-cloud

Multi-cloud is not one of the cloud deployment models identified in NIST SP 800-145, but it is not an entirely new concept. An organization with a multi-cloud model consumes cloud computing services from multiple CSPs. Multi-cloud is different from hybrid cloud because the entire system is spread across multiple CSPs, rather than specific subsystems being in different cloud environments. An organization that utilizes public cloud for its website and a private cloud for company confidential data is utilizing a hybrid cloud deployment model.

By contrast, a system with both databases and application servers that are located across both AWS and Azure is a multi-cloud deployment. By utilizing open-source or standardized cloud orchestration tools that work across different CSPs, such as Kubernetes and YAML definition files, the organization can deploy the same infrastructure regardless of the underlying CSP. Modern applications increasingly rely on application programming interfaces (APIs) for communicating and sharing data, and these APIs allow for dynamic location of hosts via DNS records. Application servers in AWS can easily communicate with a database in AWS or Azure, as long as the DNS records are maintained for the current location of the database.
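
To make the DNS-based location concrete, the following minimal Python sketch resolves a database's current address at request time instead of hardcoding a CSP-specific endpoint. The hostname, port, and internal domain are hypothetical placeholders, not real services.

    import socket

    # Hypothetical internal hostname; a DNS CNAME record would point at the
    # database's current location, whether that is in AWS or Azure.
    DB_HOST = "db.example.internal"
    DB_PORT = 5432

    def locate_database(host: str, port: int) -> list:
        """Resolve the database's current address via DNS at request time."""
        results = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
        # Each result carries a sockaddr tuple whose first two fields are
        # the address and port.
        return [(info[4][0], info[4][1]) for info in results]

    if __name__ == "__main__":
        for addr, port in locate_database(DB_HOST, DB_PORT):
            print(f"database currently reachable at {addr}:{port}")

Because the application depends only on the DNS name, migrating the database between CSPs requires updating a DNS record rather than redeploying application code.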

Multi-cloud deployments can provide enhanced reliability and availability in the event that a CSP suffers an outage, and issues of vendor lock-in may be reduced by the more portable infrastructure a multi-cloud architecture creates. Potential cost savings can also be realized if one CSP offers a cheaper service than its competitors. Additionally, if different CSPs offer unique services, a multi-cloud deployment allows the organization to take advantage by locating relevant system elements in that CSP.

The added complexity of multiple CSPs with potentially different service offerings, security capabilities, and technical requirements can make multi-cloud deployment risky. Each CSP offers its own interface and methods of interacting with the cloud service offerings from setup to deployment to administration, so performing routine tasks can require additional time and effort. Operational issues can include increased latency of communications and potential difficulty in assigning costs to specific systems that are spread across multiple CSP invoices.

Cloud Shared Considerations

All cloud customers and CSPs share a set of concerns or considerations. It is no longer the case that all companies use a single CSP or SaaS vendor. In fact, larger companies may use multiple vendors and two or more CSPs in their delivery of services. The business choice is to use the best service for a particular use (best being defined by the customer based on features, cost, or availability). The sections that follow discuss major considerations that cloud customers must weigh when deciding whether cloud computing is appropriate for specific tasks and whether it should be part of the organization's cloud computing strategy.

Interoperability

With the concern over vendor lock-in, interoperability is a primary consideration. Interoperability is the ability to communicate with and share data across multiple platforms and between traditional IT and cloud services provided by different vendors. Avoiding vendor lock-in allows the customer to make decisions based on the cost, feature set, or availability of a particular service regardless of the vendor providing it. Interoperability leads to a richer set of alternatives and more choices in pricing.

Portability

Portability may refer to data portability or architecture portability. Data portability is the ability to move data between traditional and cloud services, or between different cloud services, without having to port the data using potentially lossy methods. Architecture portability allows for migration of applications or systems without significant changes to or loss of service.

Data portability matters to an organization that uses a multi-cloud approach, as data will move between CSPs. Moving data between CSPs should not be a burdensome task leading to loss of system availability or loss of data. Portability is also important in a cloud bursting scenario, where peak usage expands from on-premises hosting to include cloud services that absorb the extra demand. The system must remain fully available both during the cloud burst and when demand returns to normal and the on-premises system resumes handling all requests; this must be seamless for the strategy to be useful. Data backups are increasingly stored in the cloud, and restoration to in-house servers must be handled easily.

Architecture portability is concerned with the ability to access and run a cloud service from a wide variety of devices, running different operating systems. This allows users on a Windows laptop, iPadOS tablet, or Android smartphone to use the same application services, share the same data, and collaborate easily.

Reversibility

Reversibility can be a challenging concept to separate from portability, but it is concerned with a very specific question: once an organization moves an application or workload into a CSP, can it be moved back out again without causing significant impact? This move-out might entail switching to another CSP, deploying a multi-cloud strategy, or even migrating back to traditional on-premises infrastructure.

Elements of reversibility include the ability to avoid vendor lock-in. These might take the form of tools that make both importing and exporting data easy, or cloud architectures that allow for portability, such as standardized PaaS that can be replicated by another CSP. Another element of reversibility is the potential impact on system operations. If migration takes days or weeks, during which the system cannot be used, this is a barrier to reversibility and creates a potential vendor lock-in to a specific CSP.

Availability

Availability is obviously important, as it is one leg of the CIA triad. Within the constraints of agreed-upon SLAs, the purchased cloud services must be made available to the customer by the CSP. If the SLA is not met, there should be penalties or recourse available to the customer. As an example, if a customer has paid for Dropbox and the service is not available when the customer attempts to access their data, service availability has failed. If this failure is not within the requirements of the SLA, the customer has a claim against the service provider and may be entitled to compensation such as a reduced price for the service.

An important aspect of availability is concerned with the elasticity and scalability of the cloud service. If the CSP has not properly planned for capacity expansion, customers with growing needs will not find the service adequate to their requirements. Consider a SaaS tool like Salesforce. If customers are successful, their list of potential customers will grow, and they would expect a tool like Salesforce to scale up to this new demand. If Salesforce does not provision adequate storage and computing power to grow with customer demand, then successful businesses will be forced to migrate to another solution as their businesses expand.

Security

Cloud security is a challenging endeavor. It is true that the larger CSPs spend resources and focus on creating a secure environment, and the shared nature of the cloud means these resources are available to organizations that would otherwise not have access to them. It is equally true that large CSPs are very attractive targets for attackers, and there are aspects of cloud computing like multitenancy that create new security challenges.

The fundamental architecture of cloud services, which makes data and service portability extremely simple, also introduces new security complexities. Laws and regulations that restrict cross-border data flow, such as GDPR, can make choosing a CSP and architecting infrastructure in a cloud environment more challenging. With cloud computing the actual hardware could be anywhere, so it is vital to know where your data resides. When there are law enforcement issues, the location of the data may also present a jurisdictional challenge, as law enforcement may lack the means to gain access to data stored outside their jurisdiction.

The owner of data retains ultimate responsibility for the security of the data, regardless of what cloud or noncloud services are used. Cloud security involves more than protection of the data and includes the applications and infrastructure used as well. Security practitioners must ensure that their organizations are aware of any potential security risks introduced by cloud computing and that such risks do not outweigh potential cost savings from a cloud migration.

Privacy

The involvement of third-party providers, in an off-premises situation, creates challenges to data protection and privacy. In most privacy laws and regulations there are concepts of data owners and data processors. Owners are typically the organizations that collect the data, and processors act on behalf of the owner to perform some data tasks. Data subjects are the individuals whose data is being handled, and their rights are enforced through privacy laws like GDPR. This creates a requirement for the data owner to ensure that they choose appropriate cloud service options and architectures.

Privacy concerns include access to data both during a contract and at the end of a contract, as well as the erasure or destruction of data when requested or as required within the contract. Regulatory and contractual requirements such as HIPAA and PCI are also key concerns. Monitoring and logging of data access and modification, and the location of data storage, are additional privacy concerns that must be considered when evaluating a cloud solution.

Resiliency

Resiliency is the ability to continue operating under adverse or unexpected conditions. This involves both business continuity and disaster recovery planning and implementation. Business continuity might dictate that a customer store their data in multiple regions so that a service interruption in one region does not prevent continued operations, and many cloud services offer this type of resiliency by design.

The cloud also provides resiliency when a customer suffers a severe incident such as extreme weather, facility damage, terrorism, civil unrest, or similar events. A cloud strategy allows the company to continue to operate during and after these incidents. The plan may require moving personnel or contracting personnel at a new location, while the cloud handles the data and processes, as these remain available anywhere network connectivity exists.

Major CSPs use multiple regions and zones to provide redundancy and support recovery or continuity abilities. An organization may choose to build an entire continuity strategy around a single CSP, with multiple regions or zones used to continue providing system availability in the event of a disaster. Multi-cloud can be another strategy to support continuity, as the system can run even if a specific provider is not available.

Performance

Performance is measured through the requirements agreed upon in an SLA and is generally quite high as major CSPs build excess capacity and redundancy into their systems. The major performance concerns are network availability and bandwidth, which may be outside the control of the CSP. A network is a hard requirement of a cloud service, and if the network is down, the service is unavailable. Areas of limited bandwidth or network providers may be at higher risk for poor performance, or loss of system availability, and may drive design requirements such as seeking out or even building alternate network options.

Governance

Cloud governance uses the same mechanisms as governance of your on-premises IT solutions, including policies, procedures, controls, and oversight. Controls include encryption, access control lists (ACLs), and identity and access management. As many organizations consume cloud services from multiple vendors, a cloud governance framework and supporting application can make the maintenance and automation of governance manageable. This may itself be another cloud solution, and many cloud security SaaS offerings exist in the market.

A variety of governance solutions, some cloud based, exist to support this need. Without governance, cloud solutions quickly grow beyond what can be easily managed. For example, a company may want to govern aspects of the cloud services used such as the number of CSP accounts, number of server instances, amount of storage utilized, and size of databases or other storage tools. Each of these adds to the cost of cloud computing, and without adequate governance the organization's cloud bill will continue to grow. A tool that tracks usage and associated costs will help an organization use the cloud efficiently and keep its use under budget.

Maintenance and Versioning

Maintenance and versioning in a cloud environment have some advantages and disadvantages. Each party is responsible for the maintenance and versioning of their portion of the cloud stack. In a SaaS solution, the maintenance and versioning of all parts, from the hardware to the SaaS solution itself, are the responsibility of the CSP. In a PaaS solution, the customer is responsible for the maintenance and versioning of the applications they acquire and develop; the platform, the tools it provides, and the underlying infrastructure are the responsibility of the CSP. In an IaaS solution, the CSP is responsible for maintenance and versioning of hardware, network and storage, and the virtualization software, while the remainder is the responsibility of the customer.

What this means in practical terms is that updates and patches in a SaaS or PaaS environment may occur without the knowledge of the customer. If updates are properly tested before deployment, they will also go unnoticed by the customer. There remains the potential for something to break when an update or patch occurs, as it is impossible to test every possible variation that may exist in customers' cloud environments; this is true in a traditional on-premises environment as well. In an IaaS environment, the customer has much more control over patch and update testing and deployment.

Virtualization made maintenance and versioning much easier, and the cloud's foundation on virtualization extends this advantage to cloud services. In most cases it is easy to create a snapshot of a virtualized resource, which supports rollback if a patch causes operational issues. The distributed nature of the cloud also means that many services exist in multiple places; updates are typically deployed in stages, so if an issue is encountered, other regions or zones can handle processing without causing downtime. Additionally, SaaS solutions solve issues of potentially unlicensed or unpatched software by centralizing the control with the CSP.

In a PaaS or IaaS environment, the customer is responsible for some of the maintenance and versioning and must define their own policies, procedures, and oversight to ensure those responsibilities are met. However, each customer that connects to the PaaS or IaaS environment will be accessing the most current version provided by the CSP, ensuring that at least some issues addressed by patching are handled in a timely manner.

Service Levels and Service Level Agreements

Contractually, an SLA specifies the required performance parameters of a solution, and this negotiation will impact the price, as more stringent requirements are generally more expensive. For example, a 24-hour support response time will be less expensive than a 4-hour response time.

Some CSPs will provide a predefined set of SLAs, and customers choose the level of service they need. The customer can be an individual or an organization. For the customer contracting with a CSP, this is a straightforward approach. The CSP publishes their performance options and the price of each, and the customer selects the one that best suits their needs and resources.

In other cases, a customer specifies their requirements, and the CSP will provide the price. If the CSP cannot deliver services at the level specified or if the price is more than the customer is willing to pay, the negotiation may continue, or the organization may need to find an alternative provider. Once agreed upon, the SLA becomes part of the contract, and the SLA is executed along with other documents like a contract or master services agreement (MSA). The cost of negotiating and customizing an SLA and the associated environment is not generally cost effective for smaller organizations and individuals, though most CSPs offer a large variety of services at predefined service levels. This allows smaller entities to choose the services they need without the cost of negotiation.

Auditability

A cloud solution needs to be auditable for the customers to gain assurance that their data is adequately protected. Audits require an independent examination of the cloud services controls, and the auditors express an opinion on the effectiveness of the controls examined. The audit activities seek to answer questions such as “Are the controls properly implemented?” and “Are the controls functioning adequately and achieving their intended risk reduction goals?”

A CSP will rarely allow a customer to perform an audit of their controls. Instead, independent third parties perform assessments that are provided to the customer. Some assessments require a nondisclosure agreement (NDA), while others are publicly available. These audit reports include SOC reports, FedRAMP packages, vulnerability scans, and penetration tests. More details of audit processes, methodologies, and the types of audits a CSP might furnish to customers are covered in Chapter 6.

Regulatory

Proper oversight and auditing of a CSP makes regulatory compliance more manageable. A regulatory environment is one where a principle or rule controls or manages the activities of an organization. For example, the Payment Card Industry Data Security Standard (PCI DSS) dictates how organizations processing payment card transactions must handle the data. Governance of the regulatory environment involves implementing policies, procedures, and controls that assist an organization in meeting regulatory requirements.

One form of regulation is governmental requirements that have the force of law. The Health Insurance Portability and Accountability Act (HIPAA), the Gramm-Leach-Bliley Act (GLBA), and the Sarbanes-Oxley Act (SOX) in the United States, and the GDPR in the European Union, are examples of laws that are implemented through regulations and have the force of law. If any of these apply to an organization, governance will put a framework in place to ensure compliance with these regulations.

Another form of regulation is put in place through contractual requirements. SLAs are one example of a contractual obligation that regulates business activities, and PCI DSS is another example of contractual obligations that credit and debit card processors must implement in order to continue processing payments. Enforcement of contractual rules can occur through the civil courts governing contracts. Governance must again put in place the framework to ensure compliance.

A third form of regulation comes from standards bodies like the International Organization for Standardization (ISO) and NIST, as well as nongovernmental groups such as the Cloud Security Alliance and the Center for Internet Security. These organizations make recommendations and provide best practices for the governance of security and risk. While this form of regulation does not usually have the force of law, an organization or industry may voluntarily adopt a specific set of guidelines as a framework for implementing security and risk management. In other cases these may be required based on legal or contractual requirements. For example, contractors to the U.S. federal government are often required to implement the NIST control set in order to secure data shared with them, and private organizations may face customer-enforced requirements for an ISO 27001 certification over their security program.

Outsourcing

As previously discussed, use of the cloud involves giving up some control, which has both benefits and possible disadvantages. Shared resources tend to lower costs, which is a major benefit of outsourcing and a huge driver for organizations to both outsource business processes and migrate to cloud computing. The loss of visibility into service provisioning is also a potential drawback of outsourcing; in SaaS the CSP is supposed to provide patching, but there may be little to no visibility other than application interface changes to indicate whether all relevant patches have actually been applied.

The loss of control inherent in outsourcing can be a source of additional security risks. Outsourcing firms are often located in different countries, and sharing data across borders can be a violation of privacy or security laws and regulations like the EU GDPR. Outsourcing services like system administration or even data entry can lead to issues under the U.S. International Traffic in Arms Regulations (ITAR), which requires that certain activities be performed only by U.S. citizens.

Both cloud consumers and CSPs need to consider the advantages and risks of outsourcing when developing business strategies. The cost savings can be significant, but so too can the added risks. Major CSPs offer specialized cloud services designed to meet major regulations like ITAR, PCI DSS, HIPAA, and GDPR. It might even be possible to simplify meeting regulatory obligations by using such a prebuilt service, as it removes many potential pitfalls from the process of standing up your own infrastructure in a heavily regulated environment.

Impact of Related Technologies

The technologies in this section may be termed transformative technologies. Without them, cloud computing still works and retains its benefits, but these technologies provide increased capabilities and improvements, and many leverage the quickly evolving set of technologies that cloud computing is built on. In the following sections, specific use cases for each technology are described.

Data Science

The field of data science has expanded rapidly in recent years and combines elements of the scientific method with data management and usage to derive new approaches to understanding, manipulating, and extracting valuable information from large volumes of data. Its applications to security are obvious to any practitioner who has observed log data from even a small computer network—computer systems can easily generate hundreds of log entries per minute, leading to an overwhelming amount of data for human analysis.

Advances in data storage and processing, enabled by the massive scale of cloud computing resources, have made data science a critical tool for information security. Defining a baseline of expected system behavior and then spotting anomalies or suspicious patterns in real time is virtually impossible for human analysts. A machine learning (ML) model can instead be trained on the unique circumstances of a particular system and then used to continuously monitor for anomalies like unexpected connections, user behavior that does not conform to expectations, or entirely novel attack methods that deviate from expected system activity.
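
As a simple illustration of this approach, the following sketch uses scikit-learn's IsolationForest to learn a baseline and flag outliers. The feature values are fabricated stand-ins for metrics that would be extracted from real logs; this is a minimal sketch, not a production detection pipeline.

    # Minimal anomaly-detection sketch using scikit-learn's IsolationForest.
    from sklearn.ensemble import IsolationForest

    # Baseline of "normal" activity: [failed_logins_per_hour, mb_uploaded_per_hour]
    baseline = [[1, 20], [0, 18], [2, 25], [1, 22], [0, 19], [1, 21]]

    model = IsolationForest(contamination=0.01, random_state=42)
    model.fit(baseline)

    # New observations: the second one (200 failed logins, 900 MB uploaded)
    # deviates sharply from the learned baseline.
    observations = [[1, 23], [200, 900]]
    for obs, label in zip(observations, model.predict(observations)):
        status = "anomalous" if label == -1 else "normal"
        print(obs, "->", status)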

Training is a key element of data science. This is not the same as security training for end users; data scientists develop models that ML platforms can use to analyze data. These models must be trained using datasets, often using the ML to make a guess and then verifying it with human analysis. CAPTCHA authentication mechanisms are an example. First, systems designed to recognize objects in photos guess what types of objects each photo contains; then when a user is presented a CAPTCHA challenge, they can confirm or refute the ML model's guess. Such systems need a large quantity of data, and if the input data for training is not high quality, the results may be unreliable or entirely unusable.

Data science tools include big data storage locations like data warehouses and data lakes. Because these locations aggregate large amounts of data, they can present a number of security challenges. Personnel working on the data will have broad access, so robust authentication mechanisms are required. These locations are also likely to be highly attractive targets since a compromise grants access to a large volume of likely sensitive data.

Machine Learning

ML is a key component of artificial intelligence (AI) and is becoming more widely used in the cloud. Machine learning creates the ability for a solution to learn and improve without additional programming. Many of the CSPs provide ML tools designed for organizations to build their own ML models, as well as services that include ML, like speech or image recognition. There is growing concern, and regulatory movement, around ML making decisions about individuals without the involvement of a person in the process.

The availability of large amounts of inexpensive data storage coupled with vast amounts of computing power increases the effectiveness of ML. A data warehouse, or even a data lake, can hold amounts of data that could not be easily used before because of the extremely high cost of storage and processing power. ML tools can mine this data for answers to questions that could not be asked before and can be used to train systems to automatically spot patterns or trends.

The security concern inherent with ML has to do with both the data and the processing. If all your data is available in one large data lake, access to the data must be tightly controlled, and that data lake is a high-value target for attackers. If the data store is breached, all your data is at risk, so controls to protect the data at rest, control access, and audit access are crucial to make this capability safe for use.

The other concern is with how the data is used. More specifically, how will it impact the privacy of the individuals whose data is in the data store? Will questions be asked where the answers can be used to discriminate against groups of people based on some characteristics? Might insurance companies refuse to cover individuals when the health history of their entire family tree suggests they are an even greater risk than would be traditionally believed?

Governmental bodies and nongovernmental organizations (NGOs) are addressing these concerns to some degree. For example, Article 22 of the EU GDPR prohibits automated decision making, which often involves ML, when the decision is made without human intervention and has a significant impact on the individual. A decision on a mortgage loan, for instance, could involve ML, but the final loan decision cannot be made exclusively by the ML solution; a human must review the information and make the final decision.

Artificial Intelligence

The goal of AI is to create a machine that has the capabilities of a human and cannot be distinguished from a human, especially when it comes to situations that the machine was not preprogrammed to deal with. It is possible that AI could create intelligent agents online that are indistinguishable from human agents. This has the potential to impact the workforce, particularly in jobs that do not require a great deal of situational analysis or critical thinking. There is also concern about how agents could be manipulated to affect consumer behavior and choices. An unethical individual could use these tools to impact humanity, leading to a need for safeguards in the technology and legal protections that will need to be in place to protect the customers. The rapid evolution of this field has so far outpaced the rate of regulation and legislation.

With the vast amount of data in the cloud, the use of AI is a security and privacy concern beyond the data mining and decision making of ML, though many AI endeavors rely on training systems using ML models, so there are shared concerns. This greater ability to aggregate and manipulate data through the tools created through AI research creates growing concerns over security and privacy of that data, as well as the potential applications that will be devised for any AI systems.

Many security solutions offer ML or AI as features that can supplement human analysts in processing large quantities of security data. One example is intrusion detection, where hundreds of thousands or even millions of data points may be generated by an organization's network in a day. Human analysis of that much data is virtually impossible, but a computer system could easily handle the load. Since most organizations have some uniqueness in their IT setup, an ML model can be trained to spot activity that is anomalous for the given network. However, a true AI that can analyze all relevant data and make a decision on whether the activity is truly suspicious or simply unexpected is still many years away.

Blockchain

A blockchain is an open distributed ledger of transactions, often financial, between parties. These transactions are recorded in a permanent and verifiable manner, where the records, or blocks, are linked cryptographically and distributed across a set of computers owned by a variety of entities. All parties participating in the blockchain can write new transactions and verify previous transactions but cannot modify those previous transactions.

Blockchain provides a secure way to perform anonymous transactions that also maintain nonrepudiation. The ability to securely store a set of records across multiple servers, perhaps in different CSPs or on-premises across different organizations, can be a method for achieving data integrity. Any data transaction committed to the chain is verifiable and secure and relies on advances in cryptography and distributed computing.
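
The hash linking behind these integrity guarantees can be sketched in a few lines of Python. This toy ledger omits the consensus, signatures, and distribution across parties that real blockchains add; it shows only why altering an earlier record invalidates every later block.

    # Each block stores the hash of its predecessor, so tampering with any
    # earlier transaction breaks the chain of hashes that follows it.
    import hashlib
    import json

    def block_hash(block: dict) -> str:
        return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

    def append_block(chain: list, transaction: str) -> None:
        prev = block_hash(chain[-1]) if chain else "0" * 64
        chain.append({"prev_hash": prev, "transaction": transaction})

    def verify(chain: list) -> bool:
        return all(
            chain[i]["prev_hash"] == block_hash(chain[i - 1])
            for i in range(1, len(chain))
        )

    chain: list = []
    append_block(chain, "Alice pays Bob 5")
    append_block(chain, "Bob pays Carol 2")
    print(verify(chain))                            # True
    chain[0]["transaction"] = "Alice pays Bob 500"  # tamper with history
    print(verify(chain))                            # False: link to block 0 broken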

In cloud computing, blockchains are often implemented as part of financial systems to record transactions. Most CSPs and many other IT service providers offer blockchains that can be used as high-integrity data storage methods for any application. Any organization that requires this level of integrity can build a solution on top of a specific blockchain. This can, in turn, lead to similar issues with utilizing a given CSP, particularly the problem of vendor lock-in.

One example of a security application for blockchain is the collection of audit evidence. Once an audit artifact is created, such as an access review, it is recorded to the blockchain to definitively prove that the security control activity has occurred. When the organization's compliance team or auditors review that evidence, they have a high level of assurance that the control was executed and is functioning as intended.

Internet of Things

With the growth of the Internet of Things (IoT), a great deal of data is being generated and stored by a large volume of distributed devices. The cloud is a natural way to store this data, as data from IoT devices such as thermostats, cameras, or irrigation controllers is generated by devices with limited storage capabilities but persistent network connectivity. The ability to store, aggregate, and mine this data in the cloud from any location with a network connection is beneficial.

The manufacturers of many IoT devices do not even consider the cybersecurity aspects of these devices, and many of the manufacturers lack robust software development and security practices. To an HVAC company, a smart thermostat may simply be a thermostat with hardware that allows it to connect to a Wi-Fi network and communicate via APIs. These devices can be in service for many years, may never receive software or firmware updates, and lack processing power to run security tools like anti-malware or firewalls.

For many IoT devices, the data may not be the primary target, but instead the device itself may become part of a botnet for use in a DDoS attack. This was the case for the Mirai botnet, which infected vulnerable IoT devices like cameras and used them to send floods of network traffic. Devices with cameras, microphones, or location tracking abilities are frequently a target, as those features may be used to surveil individuals. Processes controlled by IoT devices can be interrupted in ways that damage equipment (e.g., Stuxnet) or reputations.

Few organizations are sufficiently mature to really protect IoT devices. This makes these devices more dangerous because they are rarely monitored. The cloud provides the ability to monitor and control a large population of devices from a central location. For some devices, such as a thermostat, this may be a small and acceptable risk. However, audio and visual feeds raise privacy, security, and safety concerns that must be addressed.

Containers

Virtualization is a core technology in cloud computing. It allows resource pooling, multitenancy, and other important characteristics. Containers are one approach to virtualization. In a traditional virtualization environment, the hypervisor sits atop the host OS, while the VM sits atop the hypervisor. The VM contains the guest OS and all files and applications needed for the functioning of that VM. A physical server can host multiple VMs, each running as a separate machine.

In containerization, there is no hypervisor and no guest OS. A container runtime sits above the host OS, and each container uses the runtime to access needed system resources. The container holds the files and data necessary to run, but no guest OS. Because the virtualization occurs higher in the stack, a container is generally smaller, can start up more quickly, and uses fewer resources by not needing an additional OS in the virtual space. The smaller image size and low overhead are the primary advantages of containers over traditional virtualization, along with increased portability from greater abstraction of the underlying hardware resources.

Containers make a predictable environment for developers and can be deployed anywhere the container runtime is available, specifically across different CSPs, which enables multi-cloud deployment. Similar to the Java Virtual Machine, a runtime is available for common operating systems and environments. Containers can be widely deployed. Versioning and maintenance of the underlying infrastructure do not impact the containers as long as the container runtime is kept current.

The container itself is treated like a privileged user, which creates security concerns that must be addressed. Techniques and services exist to address these concerns, such as a cloud access security broker (CASB), but the concerns must still be carefully managed. All major CSPs support some form of containerization.

Quantum Computing

Quantum computers use quantum physics to build extremely powerful computers. When these are linked to the cloud, it becomes quantum cloud computing. IBM, AWS, and Azure all provide a quantum computing service to select customers, though many of these are still in a research and development stage and require highly specialized knowledge to use. The increased power of quantum computers and the use of the cloud may make AI and ML more powerful and will allow modeling of complex systems available on a scale never seen before. Quantum cloud computing has the ability to transform medical research, AI, and communication technologies.

A concern for quantum computing is that traditional methods of encryption and decryption could become obsolete, as the vast power of the cloud coupled with quantum computing makes cryptographic attacks much simpler. This would effectively break current cryptographic methods, necessitating new quantum-resistant methods of encryption. Quantum cryptographic attacks are still theoretical, but as with all cryptography it is only a matter of time before current methods are rendered obsolete.

Edge Computing

Edge computing refers to processing or acting on data at the source where it is collected. It assumes distributed devices that can collect, analyze, and act on data, and is a frequently used model for IoT devices. For example, smart thermostats throughout a campus can gather local temperature data and set the temperature in individual floors, buildings, or even zones without the need for centralized management.

Edge computing can enhance availability by reducing single points of failure (SPOFs), since the edge devices can act independently and synchronize when a centralized system is available. They can provide enhanced customization as well—zones on the side of a building that get afternoon sun might need a lower temperature than zones on the side that is not heated, leading to significant energy cost savings. Processing data on device also reduces bandwidth consumption, which can be useful for extending services to areas with poor connectivity.

Security at the edge can be a challenge for many of the same reasons IoT devices are a security issue. IoT, industrial control systems (ICSs), and embedded systems at the edge may be low-power, stripped-down devices incapable of running traditional security controls like antimalware or host-based firewalls.

Edge devices may be deployed in areas where the organization has limited control as well. Physical devices like IoT may be located in areas with limited physical access controls, while virtual edge devices such as servers may be located across a wide range of hosting services. The issue of data integrity can also be problematic. Similar to distributed computing systems where data must be reconciled into a single data store, processing by edge devices and the communication needed to move data from the edge to a centralized store creates opportunities for data to be intercepted or modified in transit.

Confidential Computing

Confidential computing employs cryptography to protect data in use when it is being processed in a cloud environment. Data that is to be processed typically must be stored in memory and sent to a processor in an unencrypted state, which leaves it potentially vulnerable to unauthorized access by other processes such as malware or compromised applications.

In confidential computing, a trusted execution environment (TEE) is utilized to perform data decryption only when an authorized program attempts to access data. The TEE acts as a secure enclave, with access controls enforced to verify that only authorized applications can make calls for the data being protected. If a malware application attempts to access data processed in the TEE, it is denied access, because it is not authorized to view the keys needed to decrypt the data.

Organizations that want to recognize benefits of cloud computing, such as elasticity and cost savings from metered services, often must balance security risk against the potential benefits. Confidential computing can be deployed to ensure that data processed in the cloud is readable only by authorized applications and not by other cloud tenants or even cloud administrators.

Confidential computing also supports distributed workloads such as edge computing and can be deployed on edge devices as a countermeasure to a failure of physical access controls where the edge devices are deployed. The Confidential Computing Consortium was formed in 2019 to develop models, reference architectures, and best practices for the use of confidential computing. More information can be found at confidentialcomputing.io.

DevSecOps

Although not a technology itself, DevSecOps represents an evolution of DevOps, which creates a pipeline from system development to operational execution, to include important concepts of security at various stages of the pipeline. DevOps identifies ways to remove inefficiencies and misunderstandings between development teams and operational teams like IT by blending the practices. As the name implies, DevSecOps inserts security concerns into this cross-functional discipline.

A fundamental tenet of DevSecOps is the principle of “shifting left,” or taking security activities and embedding them where appropriate throughout the development and deployment lifecycle. Rather than perform security activities after a system is built and deployed, when it may be costly or impossible to fix issues, DevSecOps embeds security activities at earlier phases of the system lifecycle. This generally leads to lower-cost fixes that are easier to implement. For example, code reviews performed after a developer checks in a module can uncover bugs that can be fixed with a few simple code changes rather than weeks of hunting to determine the source of a flaw.
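
As a small illustration of shifting left, a check like the following could run automatically when code is committed, flagging likely hardcoded credentials long before deployment. The regex and workflow are deliberately simplified assumptions, not a production scanner; real pipelines use dedicated secret-scanning tools.

    # Simplified shift-left check: scan source files for likely hardcoded
    # secrets at commit time, rather than discovering them after deployment.
    import re
    import sys
    from pathlib import Path

    SECRET_PATTERN = re.compile(
        r"""(password|api_key|secret)\s*=\s*["'][^"']+["']""", re.I
    )

    def scan(paths: list) -> list:
        findings = []
        for path in paths:
            lines = path.read_text(errors="ignore").splitlines()
            for lineno, line in enumerate(lines, 1):
                if SECRET_PATTERN.search(line):
                    findings.append(f"{path}:{lineno}: possible hardcoded secret")
        return findings

    if __name__ == "__main__":
        issues = scan([Path(p) for p in sys.argv[1:]])
        print("\n".join(issues) or "no obvious secrets found")
        sys.exit(1 if issues else 0)  # nonzero exit fails the pipeline stage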

In addition to distributing security activities, DevSecOps also seeks to create a more holistic view of security and provision resources most appropriate to various types of tasks. A security analyst is unlikely to be skilled at performing code reviews. A DevSecOps team can hire and manage resources directly and remove the bottleneck of a separate security team's involvement. DevOps and Agile development are widely deployed and popular with cloud computing organizations. By embedding key security activities throughout the development and operations lifecycle, organizations can ensure that security objectives are met without adversely impacting project timelines or deliverable schedules.

Understand Security Concepts Relevant to Cloud Computing

Security concepts for cloud computing mirror many concepts in on-premises security, though some unique considerations exist. Most of these differences are related to the customer not having access to the physical hardware and storage media, as well as the assumption of broad network accessibility. These concepts and concerns will be discussed in the following sections.

Cryptography and Key Management

Cryptography is essential in the cloud to support security and privacy. Cloud computing presents two key challenges for encryption. First, data must move between the consumer and CSP, so data in transit must be protected. Second, multitenancy and the inability to securely wipe the physical drives used in a CSP's data center make data at rest and disposal more challenging. The primary solution to both is cryptography.

Data at rest and data in motion must be securely encrypted. A customer will need to be able to determine whether a VM or container has been altered after deployment, which requires cryptographic tools. Secure and reliable communications are essential when moving data and processes between the consumer and the CSP, so encryption and integrity checks via hashing are required.
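
A minimal integrity check of this kind can be sketched with a cryptographic hash. The image filename and expected digest below are hypothetical placeholders for values recorded when the image was built.

    # Hash a VM or container image and compare it to a known-good digest
    # recorded at build time; a mismatch means the image has been altered.
    import hashlib

    def sha256_of(path: str) -> str:
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                digest.update(chunk)
        return digest.hexdigest()

    EXPECTED = "9f2f..."  # digest recorded at build time (truncated placeholder)

    if __name__ == "__main__":
        actual = sha256_of("appserver-image.tar")
        print("unaltered" if actual == EXPECTED else "image has changed!")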

One of the challenges with cryptography has always been key management. With many organizations using a multi-cloud strategy, key management becomes even more challenging. The questions to answer are

  • Where are the keys stored?
  • Who generates and manages the keys (customer or CSP)?
  • Should a key management service be used?

In a multi-cloud environment, there are additional concerns:

  • How is key management automated?
  • How is key management audited and monitored?
  • How is key management policy enforced?

The power of a key management service (KMS) is that many of these questions are answered. Most CSPs offer a KMS and supporting services like virtual hardware security modules (HSMs) that can be attached to VMs to provide secure cryptographic functions.

One benefit of a KMS is that it stores keys separately from the data, so a breach of encrypted data is less severe because the attackers are less likely to be able to read it. Many data breach and privacy laws provide an exemption if a breach occurs and there are no signs attackers were able to decrypt the data. This benefit disappears if the encryption/decryption keys are stored with the data or if evidence points to attackers accessing the KMS. If the keys are to be stored in the cloud, they must be stored separately from the data and have robust logging capabilities. Outsourcing this has the benefit of bringing that expertise to the organization. However, like any outsourcing arrangement, you cannot turn it over to the KMS and forget about it—someone still needs to oversee the KMS.
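
The pattern behind this separation is often called envelope encryption. The following sketch simulates it in-process using the third-party cryptography package, with the "KMS" reduced to a master key held apart from the data; it is an illustration of the pattern, not any particular KMS product's API.

    # Envelope encryption sketch: a data key encrypts the data, and the
    # KMS-held master key encrypts only the data key.
    from cryptography.fernet import Fernet

    master_key = Fernet.generate_key()   # held by the KMS, never stored with data
    kms = Fernet(master_key)

    data_key = Fernet.generate_key()     # per-object key
    ciphertext = Fernet(data_key).encrypt(b"customer record")
    wrapped_key = kms.encrypt(data_key)  # only the wrapped key travels with data

    # Store (ciphertext, wrapped_key) together; a breach of that store alone
    # yields nothing readable, because unwrapping requires the master key.
    plaintext = Fernet(kms.decrypt(wrapped_key)).decrypt(ciphertext)
    print(plaintext)  # b'customer record'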

Using a KMS does not mean that you turn over the keys to another organization any more than using a cloud file repository gives away your data to the service storing your files. You choose the level of service provided by the KMS to fit your organization and needs. The level of integrity and confidentiality required will determine which services make sense for an organization to engage from an external KMS.

The last three questions—automation, monitoring and auditing, and policy enforcement—are the questions to keep in mind when reviewing the different KMSs available. Like any other service, the features and prices vary, and each organization will have to choose the best service for their situation. More details of managing encryption and cryptographic keys in cloud environments, including topics such as key escrow, are covered in Chapter 2.

Identity and Access Control

There are multiple types of access control, and in the shared responsibility model of cloud computing the CSP and customer have different, but important, responsibilities. Some examples include physical access control, technical access control, and administrative access control.

Physical Access

Physical access control refers to actual physical access to the servers and data centers where the data and processes of the cloud customer are stored. Physical access is entirely the responsibility of the CSP, since the CSP owns the physical infrastructure and the facilities that house the infrastructure. Only they can provide physical security.

Physical access control to cloud customer facilities remains the purview of the customer. Some cloud services offer physical transportation options for data migration to the cloud, whether for initial migration or for making data backups available. In these cases, the customer must choose an appropriate courier or other secure transportation service.

User Access

Administrative access control refers to the policies and procedures a company uses to regulate and monitor access. These policies include who can authorize access to a system, how system access is logged and monitored, and how frequently access is reviewed. The customer is responsible for determining policies and enforcing those policies as related to procedures for provisioning/deprovisioning user access and reviewing access approvals.

Technical access control is the primary area of shared responsibility. While the CSP is responsible for protecting the physical environment and the customer is responsible for the creation and enforcement of policies, both the customer and the CSP share responsibilities for technical access controls.

For example, a CSP may be willing to federate with an organization's identity and access management (IAM) system. The CSP is then responsible for the integration of the IAM system, while the customer is responsible for the maintenance of the system. If a cloud IAM system is used (provided by the CSP or a third party), the customer is responsible for the provisioning and deprovisioning of users in the system and determining access levels and system authorizations while the CSP or third party maintains the IAM system.

Logging system access and reviewing the logs for unusual activity can also be a shared responsibility, with the CSP or third-party IAM provider logging access and the customer reviewing the logs or with the CSP providing both services. Either choice requires coordination between the customer and the CSP. Access attempts can come from a variety of devices and locations throughout the world, making IAM an essential function.

Privilege Access

Privileged access control refers to the privileged access granted to cloud infrastructure as well as systems, applications, and data hosted in a CSP. A related field, known as privileged access management (PAM), is focused on reducing the risk of compromise or abuse related to such accounts. A system administrator is an example of a privileged account. An attacker who gained access to that user's account could perform a number of functions, including adding new users, accessing confidential information, and misusing resources.

Managing privileged access in the cloud may rely on an existing IAM solution or a separate PAM solution that is integrated with the IAM. Additional controls for privileged accounts are often justified by the capabilities granted to these accounts. Those controls might include stronger password or multifactor authentication (MFA) requirements, separate accounts for administrative and nonadministrative tasks, and more frequent access reviews.

Service Access

Service access refers to controlling access by the CSP to customer data. CSP administrators require access to infrastructure elements of the cloud service in order to perform necessary services such as patching, troubleshooting, and other maintenance. This access can be a problem for highly sensitive workloads such as regulated data and could be a deciding factor for an organization to avoid using cloud computing.

Major CSPs publish details of their service access control policies and may implement additional controls to give customer assurance over the safety of their data in the cloud environment. Data-at-rest encryption for many cloud services can be configured to utilize customer-controlled keys, meaning the cloud administrators can see encrypted data but nothing else. Implementing confidential computing via a TEE is another safeguard against snooping of data by cloud service personnel, since their administrative tools can be blocked from access to the keys needed to decrypt data in use.

Some CSPs offer transparency reports that show when, why, and how CSP personnel accessed customer data. For organizations looking to mitigate risks associated with service access, such a report can provide needed oversight to understand whether the CSP's access is a risk to the organization. In other cases, the requirement for cloud administrators to have access may be too great a risk, so a different deployment model like a community or private cloud is required.

Data and Media Sanitization

It is possible to sanitize storage media when you have physical access to the media, as is the case with on-premises architecture. You determine the manner of sanitization, such as software-based overwriting or even physical destruction of the storage media. You also determine the schedule for data deletion and media sanitization.

In the cloud this becomes more challenging. Data storage is shared and distributed, and CSPs will not allow customer organizations access to the physical media, much less permit its destruction. In addition, data in the cloud is regularly moved and backed up, making it impossible to determine which disks might contain a copy of any one organization's data. This is a security and privacy concern. The customer will never have the level of control over data and media sanitization that they had with physical access to and ownership of the storage hardware.

While some CSPs provide access to wipeable volumes, there is no guarantee that the wipe will be done to the level possible with physical access. Encrypted storage of data and cryptoshredding are discussed in the following sections. While not the same as physical access and secure wipe, they provide a reasonable level of security. If, after review, this level of security is not adequate for an organization's most sensitive data, this data should be retained on-premises in customer data centers or on storage media under the direct physical control of the customer.

Overwriting

Overwriting of deleted data occurs in cloud storage over time. Deleted data areas are marked for reuse, and eventually these areas will be allocated to and used by the same or another customer, overwriting the data that was previously stored. There is no specific timetable for overwriting, and the data or fragments may continue to exist for some time. Encryption is key in keeping your data secure and the information private. Encrypting all data stored in the cloud works only if the cryptographic keys are inaccessible or securely deleted.

Cryptographic Erase

Cryptographic erase, also known as cryptographic erasure, is an additional way to prevent the disclosure of data. In this process, the cryptographic keys are destroyed (cryptoshredding), eliminating the key necessary for decryption of the data. Although data remains on drives or other storage media owned by the CSP, it is unreadable without the decryption key. Like data and media sanitization and overwriting, encryption is an essential step in keeping your data private and secure. Secure deletion of cryptographic keys makes data retrieval nearly impossible.
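
A toy demonstration of cryptoshredding, again assuming the third-party cryptography package, shows why destroying the key is equivalent to destroying the data: the ciphertext left behind on the CSP's media is computationally unreadable.

    # Cryptoshredding sketch: once the key is destroyed, the remaining
    # ciphertext cannot be decrypted by any other key.
    from cryptography.fernet import Fernet, InvalidToken

    key = Fernet.generate_key()
    ciphertext = Fernet(key).encrypt(b"sensitive record")

    del key  # "destroy" the key; in practice the KMS deletes it irrecoverably

    try:
        Fernet(Fernet.generate_key()).decrypt(ciphertext)  # any other key fails
    except InvalidToken:
        print("data is unrecoverable without the original key")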

Network Security

Broad network access is a key component of cloud computing. However, if you have access to cloud resources over an uncontrolled network, bad actors can also gain access, which threatens the security of the cloud service you are using. These bad actors might be able to intercept data in transit or gain access to cloud services by observing the traffic you send, threatening the privacy and security of your data.

There are a number of ways to provide network security. This list is not exhaustive, not all concepts apply to all organizations, and the concepts are not mutually exclusive. Network security starts with controlling access to cloud resources through IAM, discussed previously. Controlling access to the cloud resources limits their exposure and provides accountability for authorized users.

Network security tools like VPNs, cloud gateways, and proxies can be used to reduce exposure to the public Internet and therefore reduce the attack surface bad actors might exploit. The use of VPNs for secure remote access across the Internet is common, though organizations may find cost savings by transitioning from legacy architecture that requires a VPN to a SaaS offering that allows users to dynamically create secure connections without the need for additional hardware or software. Evolving methods of network security, such as zero trust architecture (ZTA), are leveraging new capabilities to provide security with less overhead than a traditional VPN.

Network Security Groups

Security remains an important concern in cloud computing. A network security group (NSG) is one way of protecting a group of cloud resources. The NSG provides a set of security rules that, like a virtual firewall, control access to those resources. The NSG can apply to an individual VM, a network interface card (NIC) for that VM, or even a subnet. The NSG is essentially an access control mechanism protecting the asset and fills the same role as a traditional firewall in a layered defense strategy.

Configuration of the NSG is typically the responsibility of the cloud customer, as they are familiar with the architecture of their applications and systems. Concepts that applied to traditional firewall configuration also apply to NSGs, including default deny all and routine review of allowed access. Although configuration is the responsibility of the customer, most CSPs offer some security tools that can identify common issues with NSGs, such as missing deny all configuration, redundant or broken rules that can hinder access, and public access to resources that typically require stricter access control such as database services.
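
The following illustrative Python model, which does not represent any CSP's actual API, shows the first-match, default-deny evaluation logic that NSG rules typically follow:

    # NSG-style rule evaluation: rules are checked in priority order, and
    # traffic matching no rule falls through to a default deny.
    import ipaddress

    RULES = [
        {"priority": 100, "port": 443, "source": "0.0.0.0/0",  "action": "allow"},
        {"priority": 200, "port": 22,  "source": "10.0.0.0/8", "action": "allow"},
    ]

    def evaluate(port: int, source: str) -> str:
        for rule in sorted(RULES, key=lambda r: r["priority"]):
            in_range = ipaddress.ip_address(source) in ipaddress.ip_network(rule["source"])
            if rule["port"] == port and in_range:
                return rule["action"]
        return "deny"  # default deny: anything not explicitly allowed is blocked

    print(evaluate(22, "10.4.5.6"))     # allow: SSH from the internal range
    print(evaluate(22, "203.0.113.9"))  # deny: SSH from the public Internet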

Zero Trust Network

Zero trust is a relatively new term in security. It is often applied as a marketing label to a wide variety of products, which can make it difficult to discern what the actual concept means. Zero trust is a fundamental approach to security that focuses on protecting assets by never implicitly assuming that other resources are trusted.

Older approaches to security often assumed a well-defined perimeter, like the edge of an organization's network, and implicitly trusted assets inside that perimeter. In this model, a database server might have robust firewall rules and access controls for any remote access from outside the organization's firewall but would not enforce the same controls for a user inside the network. Obviously, if a bad actor were to gain access to the network, they would be able to access a great deal of information with few barriers.

A zero trust network is modeled on the idea of making access control enforcement as granular as possible. Put another way, every request for access should be verified before access is granted. In a zero trust network, users must authenticate to join the network, they must authenticate to any and all systems or resources they try to access, and, where possible, the system should enforce access approval for each transaction or action performed.

A formal definition of zero trust architecture was published by NIST in SP 800-207. The document formally defines zero trust, identifies applications for network security, and provides guidance on implementing these principles. The document can be found at nist.gov/publications/zero-trust-architecture.
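The following minimal sketch illustrates the per-request verification idea in Python. The token check and policy store are hypothetical stand-ins for a real identity provider and policy engine, not an implementation of SP 800-207.

  POLICY = {("alice", "read", "orders-db")}        # hypothetical policy store

  def verify_token(token):
      # Stand-in for a real identity provider; returns the subject or None.
      return "alice" if token == "valid-token" else None

  def authorize(request):
      # Zero trust: authenticate and authorize every request, every time,
      # rather than trusting callers already "inside" the network.
      subject = verify_token(request["token"])
      if subject is None:
          raise PermissionError("unauthenticated request")
      if (subject, request["action"], request["resource"]) not in POLICY:
          raise PermissionError("request not authorized for this resource")
      return True

  authorize({"token": "valid-token", "action": "read", "resource": "orders-db"})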

Ingress and Egress Monitoring

Traditional secure perimeter architecture placed security monitoring devices at major entry and exit points to the network, known as ingress and egress monitoring. Ingress controls can block unwanted external access attempts, while egress controls can prevent internal resources from communicating with unknown or unwanted resources outside the perimeter. However, an organization with even a simple mix of SaaS applications and infrastructure in a PaaS environment will find such monitoring difficult because of the lack of a traditional network perimeter.

Many CSPs offer monitoring solutions native to their environment, which can take the place of traditional perimeter monitoring. In a complex environment with multiple CSPs or cloud applications, it can be helpful to aggregate these monitoring logs into a single platform, such as a security information and event management (SIEM) tool. The SIEM provides centralized monitoring but must be able to ingest information from the organization's entire cloud environment to prevent blind spots in the monitoring.

Virtualization Security

Virtualization is an important technology in cloud computing. It allows for resource sharing and multitenancy, but like all technology there are benefits and security concerns. Security of the virtualization method is crucial. Several components make up virtualization, such as a hypervisor and VMs, and the ability to create virtual computing resources has led to the development of other technologies such as serverless computing.

Hypervisor Security

A hypervisor, such as Hyper-V or vSphere, packages resources into a VM. Creating and managing the VM are both done through the hypervisor, and the hypervisor is responsible for scheduling and mediating a VM's access to the underlying physical hardware. For this reason, it is important that the hypervisor be secure. Hypervisors such as Hyper-V, VMware ESXi, and Citrix XenServer are type I hypervisors or native hypervisors that run on the host's hardware.

A type I hypervisor is generally faster and more secure than a type II hypervisor, but it requires specialized skills to set up and maintain. Type II hypervisors, such as VMware Workstation or Oracle VirtualBox, run on top of a host operating system like Windows or macOS. These are easier to set up, typically requiring only the skills needed to install the virtualization application, but they are generally less secure.

A hypervisor is a natural target for malicious users, because a compromise of the hypervisor grants control of all the resources used by every VM it hosts. If a hacker compromises another tenant on the same physical server and can escalate the attack by breaking out of that tenant's VM to the hypervisor, they may be able to attack other customers through it. This type of attack is known as a VM escape, and hypervisor vendors are continually working to make their products more secure.

For the customer, security is enhanced by controlling administrative access to the virtualization solution, designing security into the virtualization architecture, and securing the hypervisor itself. All access to the hypervisor should be logged and audited. Network access to the hypervisor should be limited to only what is necessary, and this traffic should also be logged and audited. Finally, the hypervisor must remain current, with all security patches and updates applied as soon as is reasonable. More detailed security recommendations are published in NIST SP 800-125A Rev 1 and by hypervisor vendors.

Container Security

Containerization, such as through Docker or LXC, offers many benefits, including resource efficiency, portability, easier scaling, and support for agile development, but it also introduces vulnerabilities. Containers are portable environments that package up an application and its dependencies. This is often a key enabler for a multi-cloud strategy, as containers can run in any cloud environment compatible with the container technology used. Containerization improves security by isolating the cloud solution from the host system, but inadequate identity and access management and misconfigured containers can be a security risk. Software bugs in the container software can also be an issue. The isolation of the container from the host system does not mean that security of the host system can be ignored.

The security issues of containerization must first be addressed through education and training. Traditional DevOps practices and methodologies do not always translate to secure containerization. The use of specialized container operating systems is also beneficial as it restricts the capabilities of the underlying OS to only those functions a container may need. Much like disabling network ports that are unused, limiting OS functionality decreases the attack surface. Finally, all management and security tools used must be designed for containers. A number of cloud-based security services are available.

There are many containerization solutions provided by major CSPs, and the choice of which solution to use will depend on a variety of factors. The choice of a container platform should be made based on the features and ability to support the organization's existing or defined future-state architecture. Ideally security should be a primary deciding factor, but at a minimum the ability to provide adequate security must be a consideration.

Ephemeral Computing

Ephemeral means “lasting for a very short time.” Ephemeral computing refers to resources that are created when needed and immediately deprovisioned when no longer needed. They are a crucial part of the metered service characteristic of cloud computing—if an organization needs a server for only a few minutes each day, then paying for 24 hours of computing power is wasteful.

Ephemeral computing can offer a major security advantage. A system that does not exist is impossible to attack, and a system that is briefly created and then deprovisioned reduces the amount of time a malicious user has to exploit it. However, the transient nature of these devices also means that traditional monitoring and security controls are not available. Running an antimalware agent on a server that is only briefly in existence is nearly impossible.

The key to securing ephemeral computing lies in properly specifying the configuration of the ephemeral asset. When the asset is needed, a definition file is used to create the needed resources. This definition file must specify appropriate configurations like access controls and security settings like encryption to ensure that the resulting asset is properly secured during use. When the asset is no longer needed, care must be taken to ensure that any data in the ephemeral asset is securely disposed of. The use of cryptographic erasure is common, ensuring that the data cannot be easily recovered.
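As a sketch of what such a definition might contain, the following Python structure specifies security-relevant settings for a short-lived worker. All field names and values are illustrative assumptions, not any CSP's actual template format.

  EPHEMERAL_WORKER = {
      "image": "hardened-worker:2024.05",            # hypothetical hardened image
      "encryption": {"at_rest": True, "key_alias": "alias/worker-data-key"},
      "network": {"ingress": [], "egress": ["10.0.2.0/24"]},  # no inbound access
      "iam_role": "worker-least-privilege",
      "max_lifetime_minutes": 15,                    # deprovision after use
      "on_teardown": "cryptographic_erase",          # destroy the data key
  }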

Serverless Technology

The name serverless is something of a misnomer, as there are still servers involved in serverless computing. The CSP is solely responsible for maintaining servers and exposes their computing capacity to customer applications on demand. Code written for serverless computing is deployed into an environment that supports a standard set of programming languages and functions.

When serverless code is run, the CSP's serverless environment allocates resources needed for the requested functions and performs the requested data processing. Results are stored in persistent memory rather than volatile memory like RAM, and the requesting function reads the data from its persistent data store.

Serverless applications offer the ability to scale very efficiently due to fewer constraints from system resources, and developers do not need to worry about setting up or tuning any infrastructure. The customer pays only for the actual amount of computing time that they utilize. Serverless applications may be more secure since they do not inherit security vulnerabilities of a traditional operating system.

However, similar to ephemeral computing, there can be security risks such as a lack of traditional security controls designed to run on a full operating system like intrusion detection systems (IDSs). Many serverless applications rely on APIs for communication and executing functions, so authentication of API calls becomes more important than in a traditional application where requests for data or processing are coming from a known location.
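The following sketch shows API call authentication in a serverless function. The handler signature follows AWS Lambda's documented Python convention, but the token check is a deliberately trivial placeholder for a real validator, such as a JWT verification library.

  import json

  def handler(event, context):
      # Authenticate every API call before doing any work.
      token = (event.get("headers") or {}).get("authorization", "")
      if not is_valid_token(token):
          return {"statusCode": 401, "body": json.dumps({"error": "unauthorized"})}
      result = process(event.get("body"))
      return {"statusCode": 200, "body": json.dumps(result)}

  def is_valid_token(token):
      return token == "Bearer valid-token"           # placeholder check only

  def process(body):
      return {"processed": True}                     # placeholder business logic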

Common Threats

Previous sections dealt with threats that are related to the specific technologies that are key parts of cloud computing, such as virtualization, media sanitization, and network security. However, all other threats that may attack traditional services are also of concern. Controls that are used to protect access to software solutions, data transfer and storage, and identity and access control in a traditional environment must be considered in a cloud environment as well.

Cloud computing is evolving quickly. Each new service or capability offers advantages but also poses potential security risks. The Cloud Security Alliance (CSA) publishes a list of common cloud threats known as the Egregious Eleven. This list highlights threats that target unique elements of cloud computing, such as a lack of cloud security architecture or strategy in immature organizations or those with a significant shadow IT problem. Other threats on the list are common across all computing systems, such as inadequate access controls and insider threats.

Security Hygiene

Hygiene refers to basic practices designed to maintain health and is typically associated with practices like handwashing and cleaning surfaces. Cyber hygiene is basically the same—practices that maintain the health and security posture of systems. These practices include, but are not limited to, the following:

  • Vulnerability scanning to detect known vulnerabilities and allow the organization to remediate them
  • Penetration testing to identify weaknesses or misconfigurations that could be exploited
  • Maintaining asset inventories to ensure that the organization is aware of what assets might be at risk and what controls are deployed to protect them
  • Data backups that allow recovery or reconstitution of data if it suffers a loss of integrity or availability

Patching

The single most important element of a security hygiene program is applying software patches. All software contains flaws and vulnerabilities, and patches remove them before bad actors can exploit them. Traditional patching advice often established windows for patch deployment, such as requiring that patches addressing critical security issues be installed within 30 days of release. However, the rapid pace of exploitation means that such windows often leave an organization exposed to risk, so faster patching cycles are recommended. Many systems and applications support automatic installation of patches, and enabling this setting is quickly becoming a security best practice.

Patching in a cloud environment is a crucial shared responsibility. Patches for SaaS applications are the responsibility of the CSP, as are patches for underlying platform software in PaaS such as operating systems and database management systems. Custom software or applications deployed on top of the PaaS are the responsibility of the customer. In IaaS, the CSP is responsible only for patching underlying systems that provide virtualized resources to the customer, such as a hypervisor. Patching of all systems and applications deployed on top of IaaS are the responsibility of the customer.

Baselining

A baseline refers to a known set of configuration attributes for a system. Security hygiene related to baselines is twofold. First, it is important to ensure that any new systems or applications added follow the defined baseline, which should be configured to enforce all relevant aspects of security. In cloud computing environments, this can be achieved relatively easily by using infrastructure as code (IAC).

Specifications for cloud infrastructure are documented in a text file that the CSP reads when provisioning new infrastructure. Unlike traditional IT, this removes the potential for errors introduced when a deployment technician misses a step or misreads the deployment guide. The concept of immutable architecture is also important to baselines, as it prohibits changes to environments once they are built. Patching immutable infrastructure is done by tearing down the old environment and building a new one with all the latest patches and updates. Although this sounds like a great deal of work, the cloud characteristic of rapid elasticity makes it trivial to build a new environment from IAC definitions, and tools exist to check all elements contained in a definition and update them to the latest version when the environment is built.

The second requirement for baselines is an audit or review of the current environment configuration against the baseline. CSPs offer tools that can alert when a system deviates from the expected baseline. In immutable architecture the baseline is always re-established when the environment is rebuilt, as any drift in the configuration is removed when the new environment is built.
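A baseline review can be reduced to a simple comparison, as in this sketch; the configuration keys are hypothetical examples.

  BASELINE = {"encryption_at_rest": True, "public_access": False, "logging": True}

  def find_drift(current):
      # Report every setting that deviates from the approved baseline.
      return {k: current.get(k) for k, v in BASELINE.items() if current.get(k) != v}

  drift = find_drift({"encryption_at_rest": True, "public_access": True, "logging": True})
  print(drift or "configuration matches baseline")   # {'public_access': True}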

Understand Design Principles of Secure Cloud Computing

As processes and data move to the cloud, it is essential to consider the security implications of that business decision. Cloud computing is only as secure as it is configured to be. With careful review of CSPs and cloud services, as well as fulfillment of the customer's shared responsibilities for cloud security, the benefits of the cloud can be obtained securely. The following sections discuss methods and requirements that help the customer work securely in the cloud environment.

Cloud Secure Data Lifecycle

As with all security efforts, the best outcomes are achieved when security is designed into a system from the very beginning; this is the principle of secure by design. Since many cloud service offerings do not move through traditional system development phases, it can be helpful to instead identify the phases that data flows through. Each phase has accompanying risks and potential security controls. The cloud secure data lifecycle comprises six steps or phases.

  • Create: This is the creation of new content or the modification of existing content. Controls that typically operate at the creation phase can include labeling data according to its sensitivity and applying encryption.
  • Store: This generally happens at creation time and involves storing the new content in some data repository such as a database or file system. Choosing an appropriate storage location based on the data's sensitivity level is a critical security task during this phase.
  • Use: This includes all the typical data activities such as viewing, processing, and modifying data, collectively known as handling. Data spends most of its life in this phase, and the majority of an organization's security controls apply here. These include encryption of the data at rest and underlying storage media, encryption of data in transit when it is accessed, access controls, and auditing data use.
  • Share: This is the exchange of data between two entities or systems. Ensuring adequate confidentiality and integrity of data when it is shared is a vital security control and is typically achieved by encryption and hashing. Access controls are also crucial as the sharing parties must be properly authorized and authenticated to avoid unauthorized access.
  • Archive: Data is no longer used but is being stored. Data controls such as a retention schedule are important during this phase, and additional encryption concerns may apply due to the long life of data in archives. The proper key is required to decrypt data, so if keys are rotated, a backup is required to ensure that data in archives remains accessible.
  • Destroy: Data has reached the end of its life, as defined in a data retention policy or similar guidance. It is permanently destroyed, and the method of destruction or sanitization must meet the organization's requirements to avoid data recovery by an unauthorized party.

At each of these steps in the data's lifecycle, there is the possibility of a data breach or data leakage. As discussed, encryption plays a vital role in data security in multiple ways. Other security tools are also important, such as data loss prevention (DLP), which can help identify, inventory, and classify information stored in both on-premises and cloud systems.

Cloud-Based Business Continuity and Disaster Recovery Plan

A business continuity plan (BCP) is focused on keeping a business running following a disaster such as weather, civil unrest, terrorism, fire, etc. The BCP may focus on critical business processes necessary to keep the business going while disaster recovery takes place. A disaster recovery plan (DRP) is focused on returning to normal business operations, which can be a lengthy process. The two plans work together, since most business processes documented in the BCP rely on the technology and services the DRP is concerned with.

In a BCP, business operations must continue, but they often continue from an alternate location. So, the needs of BCP include space, personnel, technology, process, and data. The cloud can support the organization with many of those needs. A cloud solution provides the technology infrastructure, processes, and data to keep the business going.

Larger CSPs like AWS, Azure, and Google organize their infrastructure into geographic regions; within a region, latency is low. Availability zones within a region are independent data centers that protect the customer from individual data center failures. However, a major disaster could impact all the data centers in a region, eliminating every availability zone in that region. A customer can design their plan for redundancy within a single region using multiple availability zones, or for redundancy across multiple regions to provide the greatest possible availability. This is a concept known as resiliency, which describes system architecture that anticipates and accounts for failure by design, without affecting availability. Some cloud concepts, like serverless computing and containerization, are inherently portable and make a resilient architecture relatively inexpensive.

One drawback of multiregion plans is that the cost grows quickly. For this reason, many organizations put only their most critical data—the core systems that they cannot operate the business without—across two or more regions, but less critical processes and data may be stored in a single region. To minimize costs, an organization might back up only critical data and infrastructure definitions in another region but not maintain fully duplicate architecture. The cost of simply storing the data is relatively low, and in the event of an interruption, the environment can be rebuilt, though there will be a temporary loss of availability.

Functions and data that are on-premises may also utilize cloud backups, which help reduce backup costs compared to fully redundant architecture. However, these will also not be back up and running as quickly as a multiregion cloud architecture.

DRP defines processes the organization will use to resume normal operations, such as rebuilding the original computing environment and supporting business infrastructure like office buildings. Depending on the cloud backup strategy used, this might involve restoring data and processing to rebuilt on-premises infrastructure or to the original cloud region.

One failure of many DRPs is the lack of an offsite or offline backup, or the inability to quickly access necessary data backups. In the cloud, a data backup exists in the locations (regions or availability zones) you specify and is available from anywhere network access is available. However, a cloud-based backup works only if you have network access and sufficient bandwidth to reach that data. That network infrastructure must be part of the DRP, and if network access cannot be assured in certain circumstances, a physical, local backup can also be beneficial.

Business Impact Analysis

Performing a business impact analysis (BIA) allows an organization to identify critical assets and capabilities that allow the organization to deliver its products or serve its missions. The BIA identifies the impact to the business if an asset or process is lost and enables prioritization of limited resources for business continuity and disaster recovery planning.

Consider collaboration tools like email and instant messaging, which are critical assets for many organizations. Company A's employees do not communicate with the outside world, so the loss of email would have a minimal impact as long as instant messaging is still available. Company B relies on outside communications, so a loss of email would be highly disruptive.

Once all assets and processes are identified, the most critical should be prioritized for planning. The criticality of a process or asset determines the level of resources committed to its continuity. In the previous example, Company A might choose to take no proactive steps to plan for a loss of email, instead relying on an ad hoc strategy if a disruption occurs. Company B, by contrast, should have a plan and resources prepared to ensure continuity of their operations.

Cost-Benefit Analysis

Cloud computing is not always the correct solution. Choosing the correct solution is a business decision guided by a cost-benefit analysis. Cloud computing benefits include more resilient architecture and potential cost savings, as cloud services can be written off as operating expenses. CSPs typically offer robust solutions with uptime levels that exceed what most organizations can achieve on their own, given the costs involved.

These benefits come with certain risks, or costs. These include the loss of control over physical infrastructure, as well as potential for unauthorized data access by CSP staff or other tenants using the cloud services. While most CSPs offer extremely reliable services, in the event of an outage the customer has no control over the recovery process and must wait for a CSP to restore the service. In addition, a migration to the cloud can disrupt traditional perimeter-based security controls since the computing services are no longer located inside an organization-controlled network.

The cost-benefit analysis must take all of these factors into account, and business decision makers should weigh them when deciding on a CSP, a deployment model, and what security infrastructure to implement.

Return on Investment

Measuring how much value an organization receives from investing in something is known as return on investment (ROI) and is often part of a security practitioner's work in justifying spending on security tools and resources. For example, the salary and benefits paid to a salesperson are an investment by the organization. A positive ROI in this example would be new business that brings in more revenue than the salesperson's compensation, allowing the organization to show a profit.

In security terms, ROI is often measured by risk reduction, typically expressed in financial terms. If an organization faces a risk of losing $1,000,000 due to a data breach, then a security control that costs $5,000 per year provides a positive ROI if it produces a risk reduction worth more than its cost. Another common measure of ROI is the efficiency or productivity of resources. A security tool has a positive ROI if it can speed up detection and remediation of security incidents or can automate tasks and free up resources to focus on more important issues.
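The breach example can be worked through numerically using annualized loss expectancy (ALE); the breach likelihoods below are assumed purely for illustration.

  potential_loss = 1_000_000      # estimated loss from a single breach
  p_before, p_after = 0.05, 0.01  # assumed annual breach likelihoods
  control_cost = 5_000            # annual cost of the control

  risk_reduction = potential_loss * (p_before - p_after)   # $40,000 per year
  roi = (risk_reduction - control_cost) / control_cost
  print(f"annual risk reduction ${risk_reduction:,.0f}, ROI {roi:.0%}")   # 700%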

Functional Security Requirements

Defining functional security requirements and ensuring that the chosen CSP can meet them is essential when developing a secure cloud solution. Some areas that require particular attention include portability of systems and data, interoperability with other CSPs and on-premises architecture, and the issue of vendor lock-in.

These challenges can be lessened through the use of a vendor management process to ensure standard capabilities, clear identification of each party's responsibilities, and the development of SLAs as appropriate. For complex or expensive systems, the RFP process can be utilized to clearly state customer security requirements. A vendor that cannot meet the customer's security needs can be eliminated early on, and working with a cloud service partner or cloud service broker can be useful when selecting cloud services.

Portability

One-time movement occurs when a customer moves to a cloud platform with no intention of moving again. Moving between CSPs may not occur frequently, but there are risks associated with being completely dependent on a single vendor like a CSP. The migration process can be challenging, as each CSP uses different tools and templates. A move from one CSP to another requires mapping service capabilities between the CSPs and performing data cleanup where needed. Moving from your own infrastructure to a CSP presents the same challenge.

Frequent movement between CSPs and between a CSP and your own infrastructure can be very difficult, and data can be lost or modified in the process, leading to a loss of availability or integrity. Portability means that the movement between environments is possible with minimal impact. Portable movement will move services and data seamlessly and may be automated, and recent changes like containerization have made this problem significantly easier.

The movement of data between software products is not a new issue. Legacy on-premises systems often had their own data models and formats, and migrating between them required conversion and cleanup. Similar challenges in the cloud may be additionally constrained by the cost of maintaining both old and new cloud infrastructure, which was less of an issue for on-premises systems where ongoing costs to operate systems were limited to utilities.

Interoperability

With customers using a variety of cloud services, often from different vendors, interoperability is an important consideration. The ability to share data between different cloud environments and between cloud and on-premises systems is important. Interoperability challenges include differences in the security tools and control sets between CSPs. A gap in security may result, and differing controls can lead to gaps in monitoring and oversight. Careful planning is essential, and the services of a cloud broker may also be warranted.

One way to improve the situation is through APIs. If properly designed, the API can reduce interoperability challenges by providing a standardized and consistent way to access systems and data across different cloud environments and on-premises architecture. For example, if a SaaS tool is used to build a data inventory and supports the corporate data/system classification scheme, an API could be built to securely share that information with the governance, risk management, and compliance (GRC) or system inventory tool. This allows the GRC tool to act as a single source for data, sharing it with relevant systems, and removes the potential of multiple inventories and processes for tracking assets.
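A sketch of that data inventory example might look like the following; the endpoint URL, credential, and payload schema are all hypothetical.

  import json
  import urllib.request

  record = {"asset": "customer-db", "classification": "confidential"}
  req = urllib.request.Request(
      "https://grc.example.com/api/v1/assets",     # hypothetical GRC endpoint
      data=json.dumps(record).encode(),
      headers={"Authorization": "Bearer <token>",  # hypothetical credential
               "Content-Type": "application/json"},
      method="POST",
  )
  # urllib.request.urlopen(req)   # not executed here; the endpoint is fictional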

Vendor Lock-in

Solving interoperability and portability challenges reduces vendor lock-in, which occurs when an organization does not have alternatives to a specific vendor. This can occur if a customer utilizes features that are specific to one CSP. Moving would incur significant financial costs, technical challenges, or possibly legal issues like early contract termination fees.

Vendor lock-in remains a significant concern with cloud computing and can be caused by taking advantage of newly created services in one CSP that other CSPs have not yet offered. Continued advances in virtualization, improvements in portability and interoperability, and a careful design within a reference architecture can decrease this issue.

An additional concern is the use of CSP-specific services. If a system is reliant on these proprietary or unique features, then moving to a different CSP could lead to a loss of system availability. This mirrors challenges with legacy environments when moving from one major application to another, such as a move from IBM Lotus Notes to Microsoft Exchange for email service. Both systems provide similar functionality, but resources are needed to migrate data and business processes built on top of these email functions.

One example is the use of AWS Lambda for serverless computing. Applications written to run on Lambda cannot be directly migrated over to the equivalent Azure Functions without being rewritten. Both CSPs offer some unique functions like security monitoring of serverless computing processes. A migration from one platform to the other requires engineering effort to update applications and at least some reconfiguration of security monitoring, if not entirely new tools.
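The incompatibility is visible even at the entry-point level. Both of the following signatures follow the platforms' documented Python conventions (the Azure example requires the azure-functions package); the bodies are trivial placeholders.

  # AWS Lambda: a plain function receiving event and context objects.
  def lambda_handler(event, context):
      return {"statusCode": 200, "body": "hello from Lambda"}

  # Azure Functions: receives an HttpRequest and returns an HttpResponse.
  import azure.functions as func

  def main(req: func.HttpRequest) -> func.HttpResponse:
      return func.HttpResponse("hello from Azure Functions", status_code=200)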

Emerging cloud-agnostic technologies like containers and careful architecting of cloud systems can reduce vendor lock-in. Cost-benefit analyses can be useful to guide the business when making decisions about unique CSP features. An increasing number of security tools are designed to support multiple CSPs’ offerings and interact via standardized means like APIs.

Security Considerations for Different Cloud Categories

In a cloud environment, security responsibilities are shared between the service provider and the customer. In the SaaS model, the customer has the least responsibility, and in the IaaS model, the customer has the most responsibility. In a PaaS, the responsibility is shared more equally.

The Shared Responsibility Model for cloud services is widely used to identify which security tasks are owned by the CSP, which are owned by the customer, and which require participation by both parties. The basics of the shared responsibility model are presented earlier in this chapter, and each CSP publishes their own version of a shared responsibility model tailored to their unique offerings.

Cloud security is designed to address risks that exist in a typical cloud infrastructure stack, which usually has most of the following components:

  • Data
  • APIs
  • Applications/solutions
  • Middleware
  • Operating systems
  • Virtualization (VMs, virtual local area networks)
  • Hypervisors
  • Compute and memory
  • Data storage
  • Networks
  • Physical facilities/data centers

It is generally understood that the CSP is responsible for the last five items on the list in all delivery models. Customers always have ultimate responsibility for data they place in the cloud, and the responsibility for the items in between varies depending on the service model. Those responsibilities are detailed in the following sections.

Software as a Service

From a security standpoint, SaaS offers the customer the most limited set of security options. Most of the security responsibility falls to the SaaS provider, including securing the infrastructure, operating system, application, networking, and storage of the information on their service.

The customer may have limited application-specific security options, such as configuring the type of encryption or supplying their own keys to be used for encrypting data. Access control to the SaaS application is also the responsibility of the customer, including user access and access to any APIs. All other layers are the responsibility of the CSP.

The user of a SaaS solution has responsibilities as well. When an organization or individual subscribes to a service, it is important to understand the SaaS provider's security policies and procedures to the extent possible. In addition, the user determines how information is transferred to the SaaS provider and can secure it through end-to-end encryption. The SaaS user is responsible for determining how the data is shared. Finally, the user can provide access security through proper use of login credentials, secure passwords, and multifactor authentication when available.

Platform as a Service

In a PaaS solution, security of the underlying infrastructure, including the servers, operating systems, virtualization, storage, and networking, remains the responsibility of the PaaS service provider. The developer is responsible for the security of any solutions built on the PaaS offering. This includes patching any applications other than the PaaS offering, as well as adequate security for the data used by the application. Just as with SaaS, the customer is responsible for controlling user access to the solutions developed.

In the Shared Responsibility Model, this means the customer is responsible for the data, APIs, and applications, with potentially some middleware responsibility.

Infrastructure as a Service

IaaS places the most security responsibility on the customer. The CSP secures the virtualized compute, storage, and networking resources, which exist on physical servers and hardware devices. The IaaS customer is responsible for the security of everything built on top of this virtualized computing environment, including the operating system and anything above it.

In the Shared Responsibility Model, the customer is responsible for everything above the hypervisor. As in the other delivery models, the exact responsibility along this line can vary between the CSP and customer and must be clearly understood in each case. For example, the CSP may provide secure access to their network, which can then be used to access the customer's IaaS resources. Maintaining the secure access service falls to the CSP, while the customer is responsible for all other aspects of security.

Cloud Design Patterns

Similar to a reference architecture, cloud design patterns provide a set of guidelines and practices designed to create secure cloud computing services. Many of these are frameworks or models created by cloud organizations like CSPs and the Cloud Security Alliance. Utilizing these frameworks can guide organizational decisions such as choosing cloud service and deployment models, deploying cloud security technologies, and managing cloud environments once deployed.

SANS Security Principles

SANS (sans.org) is an organization that provides a variety of services to security practitioners, including training, templates, and security guidance; it originated the Critical Security Controls framework now maintained by the Center for Internet Security (CIS). This control framework is lightweight and prioritized to address the most prevalent security threats seen on the Internet. The controls provide a solid foundation for organizations looking to secure cloud infrastructure, given the broad network accessibility of cloud solutions.

SANS security principles follow a risk-management approach that begins with inventorying assets. These include on-premises and cloud assets, as well as critical service providers like CSPs. Once inventoried, the risks facing these assets are documented, and risk mitigation strategies are implemented. SANS is vendor agnostic, and the security principles are designed to be applied to cloud infrastructure in any CSP.

Well-Architected Framework

The Well-Architected Framework comprises pillars, which are collections of best practices that can be used to evaluate and manage cloud infrastructure. Each CSP publishes a Well-Architected Framework unique to their offerings, but there are strong similarities among them. Common pillars include the following:

  • Security: Protecting system and data assets is an obvious concern for security practitioners and appears in all CSPs’ versions of the framework. Tasks that fall under this pillar typically include identity and access management, encryption, and monitoring.
  • Reliability: Maintaining access to systems and data is the focus of this pillar, which aligns with the availability objective.
  • Performance: In many CSP frameworks, this pillar is also related to availability, though it focuses on the ability of a well-architected system to scale up and down based on demand.

The selection and use of a Well-Architected Framework will be driven by the organization's choice of CSP. Although the core elements are common, some frameworks include additional pillars, and each CSP's documentation provides implementation guidance specific to its own service offerings.

Cloud Security Alliance Enterprise Architecture

The CSA Enterprise Architecture (CSA EA) is a framework for modeling IT resource architecture in a way that aligns it to business needs within the organization. It is maintained by a working group and has been published as a freely available standard in NIST SP 500-299 and SP 500-292.

The CSA EA comprises four domains where IT services are required and associated security concerns within each domain. The domains include Business Operation Support Services (BOSS), IT Operations & Support (ITOSS), Technology Solution Services (TSS), and Security and Risk Management. The last domain contains the practices that security often owns, such as identity and access management, vulnerability management, and data protection. Infrastructure protection services are a specific area of focus within this domain, and guidance is provided for identifying tools, processes, and practices needed to address the items in this domain.

Full details of the CSA EA and related supporting documents are available at the EA Working Group site: cloudsecurityalliance.org/research/working-groups/enterprise-architecture.

DevOps Security

DevOps represents an integration of development and operations teams, with the goal of speeding delivery of quality software. When this concept evolved, security was usually either an afterthought or an entirely siloed function within the organization. The reality, however, is that security is vital to both development and operations tasks and is significantly less effective when siloed. For example, a bug caught as soon as the code is written requires fewer resources to find and fix, and poses less risk to application availability, than one addressed later through a workaround or patch.

Integrating security into DevOps follows a fundamental practice of shifting left, that is to say, taking security activities traditionally performed at the end of a system development lifecycle and shifting them to the left on a timeline. Instead of all testing being done after a system is code complete, testing can begin while code is being written.

There are different approaches to solving this problem. DevSecOps and SecDevOps have both emerged as models for integrating security into DevOps processes. Both approaches share the goal of shifting left but have different philosophies regarding how to effect this change. The choice of a model will be driven largely by the DevOps mindset in a given organization, as the practices and philosophy of each model align with development and management practices like Agile and Six Sigma.

A NIST working group for integrating security into DevOps is a popular resource for DevOps security and can be found here: csrc.nist.gov/projects/devsecops.

Evaluate Cloud Service Providers

Evaluation of CSPs should be done against objective criteria, and using standardized criteria makes it easier to compare offerings from different CSPs. Some standards are voluntary, and a CSP may choose to adopt them as a way to demonstrate security capabilities to customers. Other standards are required for CSPs offering services with specific functions or to specific markets.

For example, SOC 2 is a voluntary standard that a CSP can implement, and a SOC 2 Type II audit report demonstrates to potential customers that the CSP's security program is in place and operating as intended. In contrast, PCI DSS is a required standard for any organization that is accepting and processing payment card transactions.

Some standards are required for certain organizations and may be adopted by others voluntarily. For example, the U.S. federal government requires that technologies conform to the Federal Information Processing Standards (FIPS) in order to be used by federal agencies. These standards are not required for nongovernmental organizations, but many have chosen to use these standards because they are robust, well understood, and widely deployed by technology providers.

Verification Against Criteria

Different organizations have published compliance criteria for technology products and systems. For cloud computing, there is a mix of regulatory and voluntary standards, depending on the market the CSP serves. ISO/IEC standards are voluntary, but they may be a customer requirement in some parts of the world. By contrast, the PCI Council contractually requires compliance with PCI DSS in order to process payment card transactions, and CSPs looking to work with the U.S. federal government must undergo an audit against the standards defined in the Federal Risk and Authorization Management Program (FedRAMP).

International Organization for Standardization/International Electrotechnical Commission

The ISO 27017 and 27018 standards extend guidance from ISO 27001, which establishes an organization's information security management system (ISMS), and ISO 27002, which provides related security controls. ISO 27017 guides the implementation of these controls in cloud computing environments, while ISO 27018 extends the controls to protect personally identifiable information (PII) processed in the cloud. ISO 27017 added 35 supplemental controls and extended 7 existing controls from the original ISO documents.

ISO 27018 serves as a supplement to ISO 27002 and is specifically geared toward PII processors, so it is often implemented where privacy controls are required. Like ISO 27017, these principles are recommendations and not requirements, though the requirements of privacy regulations are usually mandatory. ISO 27018 added 14 supplementary controls and extended 25 other controls. As an international standard, adherence to this standard can help an organization address a wide and ever-changing data protection and privacy environment, as laws like GDPR in the EU and PIPEDA in Canada continue to evolve.

While these are recommendations and not requirements, many international corporations strive to be ISO-compliant. In that case, the criteria provided by ISO/IEC become governing principles for the organization, including the reference framework, cloud service models (of which there are seven instead of just SaaS, PaaS, and IaaS), and the implementation of controls from the approved control set. Auditing the controls and conducting a risk assessment should help identify which controls best address identified risk.

ISO standards are important for companies in the international marketplace due to their wide acceptance throughout the world. These standards are also well suited to the international nature of cloud services, which rely on broad network accessibility. Cloud security practitioners must understand their business context and the legal and regulatory requirements that arise from it, and they must select control frameworks or standards as appropriate to address them.

Payment Card Industry Data Security Standard

The PCI Council regularly updates its Data Security Standard to include updated guidance for emerging technologies and evolving threats. Although the PCI DSS is not a law, the payment card companies that comprise the council make compliance a contractual requirement for payment processors. The 12 requirements in the standard are designed to reduce the risk of payment fraud and unauthorized access to sensitive payment information and to maintain the integrity of transaction data.

Although originally written before cloud computing became widespread, the PCI DSS contains high-level guidance for information system security. This means it is adaptable to any organization's computing environment, whether it is cloud-based, on-premises, or some mix of the two. Some of the key requirements that deal with cloud computing can be summarized as follows:

  • Ensure that a customer's processes can only access their data environment.
  • Restrict customer access and privileges to their data environment using the concept of least privilege.
  • Enable logging and audit trails that are unique to each environment.
  • Provide processes to support forensic investigations.
  • Provide an isolated, access-controlled environment for processing cardholder data, known as the cardholder data environment (CDE).

Because of the commoditized nature of payment processing, many organizations utilize an external payment processor rather than implementing these functions internally. This has the benefit of shifting some risk and elements of PCI DSS compliance away from the organization. However, the outsourcing arrangement brings its own shared responsibility model, meaning the consuming organization must still implement some controls to ensure the security of the payment data.

Government Cloud Standards

Governments around the world have recognized the benefits and cost savings that cloud computing offers. While governments lack a profit motive, fiscal responsibility with taxpayer dollars makes cloud computing an attractive option. Because of the sensitive nature of data processed by government agencies, a number of standards have been created. These offer governments, as cloud customers, assurance that the CSPs they engage implement adequate protection for sensitive data like PII, financial data, or possibly even national security information.

As a CSP it is important to implement standards required by any government customers your organization serves. Some private organizations may also choose to use these higher-security services due to additional controls offered. Some standards used by governments around the world include the following:

  • FedRAMP is required by the U.S. federal government and provides a set of risk-based security controls that CSPs must implement. In addition, there is an audit process known as assessment and authorization (A&A) designed to validate the proper implementation and function of those controls.
  • The UK G-Cloud is a marketplace of cloud services that have implemented controls in line with the G-Cloud framework, which can be consumed by UK governmental agencies. They are divided into categories of service, including cloud hosting, cloud software, and cloud support.

CSA Security, Trust, Assurance, and Risk

CSPs can enroll in the CSA Security, Trust, Assurance, and Risk (STAR) registry as a way to demonstrate the security and privacy controls they offer. This allows customers with specific security needs to select a CSP or specific services that meet their needs.

CSA STAR is a voluntary scheme in which CSPs can provide evidence of their security controls, privacy controls, or both; the registry offers two levels of assurance. Level 1 is a self-assessment that allows the CSP to detail their controls in a standardized format so customers can easily compare across CSPs. The CSP completes either a CSA Cloud Controls Matrix (CCM) or Consensus Assessments Initiative Questionnaire (CAIQ) and submits it to the STAR registry.

Level 2 of CSA STAR provides an additional layer of assurance for customers by requiring the CSP to undergo a third-party audit. This audit assesses the security control implementations at the CSP and their effectiveness, so customers have additional evidence that the services of the CSP provide adequate security. The audit may be done as a stand-alone activity or may be incorporated into an existing SOC 2 or ISO 27001 audit.

System/Subsystem Product Certifications

The following are system/subsystem product certifications. Unlike certifications that attest to overall capabilities in a cloud environment, like FedRAMP or ISO 27001, these certifications apply to smaller parts of a system, such as a cryptographic module. For organizations with very specific security needs, they provide a way to choose products that can meet those requirements.

Common Criteria

Common Criteria (CC) is an international set of guidelines and specifications to evaluate information security products. These evaluations are typically done on systems configured according to specific standards and provide customers with assurance that the system can meet requirements when configured appropriately. There are two parts to CC.

  • Protection profile: This defines a standard set of security requirements for a specific product type, such as a network firewall. This creates a consistent set of standards for comparing like products.
  • Evaluation assurance level: Known as EAL, these are scores from 1 to 7, with 7 being the highest. The EAL measures the level of rigor and amount of testing conducted on a product. A level 7 product is not automatically more secure than a level 5 product; it has simply undergone more testing, and the customer must still decide what level of testing is sufficient. The manufacturer chooses the level of testing to pursue; the costs can be significant, so products are not assessed at EAL 7 unless a definite market for such a solution exists.

The testing is performed by an independent lab from an approved list. Successful completion of this certification allows sale of the product to government agencies and may improve competitiveness outside the government market as CC becomes better known. Customers must define what component or system capabilities they require and pick systems matching that protection profile. Furthermore, the customer must determine the level of assurance they need and choose the corresponding EAL. Using those criteria, the customer can then select a system that meets their needs. The goal of the CC is to enable validated, reliable comparison of systems and to provide a framework for product vendors to improve their products through testing.

FIPS 140-2

To secure data processed by government agencies, the United States publishes FIPS for various information processing use cases. The use of cryptography to protect data is one such use case, and FIPS 140-2 provides a scheme for validating the strength of cryptographic modules.

Organizations that want to do business with the U.S. government must meet the FIPS criteria. Because the U.S. federal government is such a large technology purchaser, FIPS 140-2 is widely implemented in hardware and software products and can be used by any organization. Specifications for data security like the Advanced Encryption Standard (AES) arose from FIPS requirements for data protection and have become widespread. In many cryptography implementations, using FIPS-validated encryption is simply a configuration option that can be enabled, giving organizations access to strong data security controls.

FIPS-validated cryptographic modules undergo testing to check the implementation of the cryptographic functions. These modules are considered adequate for government use if they are configured in the same manner as the validated test environment. This usually requires enabling only FIPS-validated encryption algorithms like AES, and disabling algorithms like 3DES that do not provide adequate security.
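As a sketch of that configuration choice, the following uses the Python cryptography library to encrypt with AES, a FIPS-approved algorithm; key and IV handling are simplified for illustration.

  import os
  from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

  key, iv = os.urandom(32), os.urandom(16)       # AES-256 key and a fresh IV
  encryptor = Cipher(algorithms.AES(key), modes.CTR(iv)).encryptor()
  ciphertext = encryptor.update(b"payload") + encryptor.finalize()
  # 3DES, by contrast, is deprecated both in FIPS guidance and in recent
  # releases of this library, mirroring the advice to disable it.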

There are many cryptographic modules and algorithms that are not FIPS validated. The lack of validation does not mean these are automatically insecure, but an organization must determine what level of assurance is required before selecting cryptographic functions. The public nature of FIPS and its wide acceptance simply make it an easy standard to adopt for any organization.

FIPS 140-2 is scheduled to be retired by 2026, and the process of transition to FIPS 140-3 began in 2019. The basic functions of FIPS validation will remain the same, but the new standard addresses emerging technologies and security challenges that encryption faces.

Summary

Cloud security practitioners must be familiar with the basic terminology surrounding this technology. This understanding includes characteristics of cloud computing, as well as the service models and deployment models of cloud computing. It also includes the role of the CSP in cloud computing, the shared security model that exists between the CSP and the customer, and the implications this has for ensuring that cloud services and data are adequately protected. Finally, the technologies that make cloud computing possible were discussed in this chapter alongside the emerging technologies that will support and transform cloud computing in the future. Understanding this chapter will make it easier to access the discussion in each of the following domains and allows a security practitioner to evaluate cloud computing technologies based on their organization's security requirements.
