Hybrid multi-cloud support is a foundational pillar of Cloud Pak for Data, and the ability to deploy and run anywhere (on a private cloud or any public cloud) has been one of its key differentiators. In this chapter, we will learn more about the supported public and private clouds, the evolution of Cloud Pak for Data as a Service, and IBM's strategy to support managed services on third-party clouds. We will explore both the business and technical concepts behind IBM's multi-cloud support, including a brief overview of IBM Cloud Satellite. This will help customers define an effective multi-cloud strategy using IBM technology.
We're going to cover the following main topics:
IBM is one of the few technology vendors to have embraced a hybrid multi-cloud strategy from day one, and this is evident from the deployment options that are supported by IBM. Being able to deploy anywhere is a key differentiator for Cloud Pak for Data. While software and system deployment options enjoy customer adoption the most, Cloud Pak for Data as a Service is IBM's strategic direction in the long run. The as-a-Service managed edition of Cloud Pak for Data was launched in 2020 on IBM Cloud and is now supported on third-party clouds through Cloud Satellite, which we will cover in detail later in the chapter.
The value proposition of as-a-Service is that it allows customers to modernize how they collect, organize, analyze, and infuse AI with no installation, management, or updating required. In essence, you can derive all the benefits of an integrated data and AI platform, namely Cloud Pak for Data, without the overhead of managing the infrastructure or software.
Cloud Pak for Data as a Service is a strategic priority for IBM because demand for Software as a Service (SaaS) is growing. Furthermore, IBM launched Cloud Satellite to deliver SaaS on-premises and on third-party clouds. The objective is to give customers the flexibility to consume software and SaaS on the infrastructure of their choice, such as an on-premises private cloud (x86, Power, Z) or a public cloud such as IBM Cloud, AWS, Azure, and so on. Next, we will cover the supported deployment options.
Cloud Pak for Data comes in three deployment options: software, system, and as-a-Service, offering significant choice to customers as to where they deploy and run their software and how they manage Cloud Pak for Data:
For enterprise customers who cannot yet embrace the public cloud or would prefer a non-IBM cloud, the options are as follows:
Red Hat has launched managed OpenShift offerings on all major cloud providers, including IBM Cloud, AWS, Azure, and GCP. Customers can run their workloads, including Cloud Pak for Data, without having to maintain, administer, or upgrade OpenShift themselves, tasks that can be difficult for teams without deep platform expertise.
IBM Cloud Pak for Data is currently available and supported on three major managed OpenShift services: Red Hat OpenShift on IBM Cloud (ROKS), Red Hat OpenShift Service on AWS (ROSA), and Azure Red Hat OpenShift (ARO). Managed OpenShift's main value proposition includes four key points:
Here are all the items managed by the provider as part of Managed OpenShift:
"Quick Start" on AWS refers to templates (scripts) that automate the deployment of workloads – in this case, Cloud Pak for Data. A Quick Start launches, configures, and runs compute, network, storage, and all other related AWS infrastructure required to deploy Cloud Pak for Data on AWS in 3 hours or less. This is a significant value proposition for customers interested in running Cloud Pak for Data on AWS. It saves time through automation (eliminates many of the steps required for manual installation configuration) and helps implement AWS best practices by design.
Other benefits of AWS Quick Start include a free trial of Cloud Pak for Data for up to 60 days and a detailed deployment guide. However, it requires an AWS account, and the customer is responsible for infrastructure costs. For storage, the customer has two options: OpenShift Container Storage (OCS), which was recently renamed OpenShift Data Foundation (ODF), or Portworx.
Also available on AWS, Azure, and IBM Cloud is Terraform automation, whose benefits include the following:
Azure Marketplace is an online applications and services marketplace aimed at IT professionals and cloud developers interested in Cloud Pak for Data software and its services. It enables customers to discover, try, buy, and deploy a solution in just a few clicks. As on AWS, the customer needs an account with the cloud provider (here, an Azure account) and one of the two supported storage options, OCS or Portworx. A no-cost trial of up to 60 days is also available.
Let's move on to cover Cloud Pak for Data as a Service.
Most of IBM's data and AI products are currently available as a service and packaged under a single Cloud Pak for Data subscription that is consumption-based, allowing customers to only pay for what they use. Furthermore, to accelerate the modernization of existing workloads and help customers with moving to the cloud/managed services, IBM has launched an initiative called Hybrid Subscription Advantage (HSA) that offers existing software customers discounts to use SaaS instead of software. In this section, we will cover a detailed overview of Cloud Pak for Data as a Service, including the capabilities available, how it's priced and packaged, and how IBM is enabling its customers to modernize through HSA.
As mentioned before, Cloud Pak for Data as a Service allows customers to modernize how they collect, organize, analyze, and infuse AI with no installation, management, or updating required. In other words, it helps deliver all the benefits of an integrated data and AI platform without the overhead of managing the infrastructure or software. The different services constituting Cloud Pak for Data as a Service are the same as for Cloud Pak for Data software, with the existing gaps addressed as part of the roadmap. Here is a short list of available services along with their descriptions and value propositions:
a) IBM Watson Machine Learning: IBM Watson Machine Learning is a full-service IBM Cloud offering that makes it easy for developers and data scientists to work together to integrate predictive capabilities with their applications. The Machine Learning service is a set of REST APIs that you can call from any programming language to develop applications that make smarter decisions, solve tough problems, and improve user outcomes.
b) IBM Watson OpenScale: IBM Watson OpenScale tracks and measures outcomes from AI throughout its life cycle and adapts and governs AI in changing business situations. It also helps with drift detection, explainability, and bias detection.
a) IBM Db2 Warehouse: A fully managed elastic cloud data warehouse that delivers independent scaling of storage and compute. It delivers a highly optimized columnar data store, actionable compression, and in-memory processing to supercharge your analytics and machine learning workloads.
a) The Speech to Text service converts human voice input into text. The service uses deep learning AI to apply knowledge of grammar, language structure, and the composition of audio and voice signals to accurately transcribe human speech. It can be used in applications such as voice-automated chatbots, analytic tools for customer-service call centers, and multi-media transcription, among many others.
b) The Text to Speech service converts written text into natural-sounding speech. The service streams the synthesized audio back with minimal delay. The audio uses the appropriate cadence and intonation for its language and dialect to provide voices that are smooth and natural. The service can be used in applications such as voice-automated chatbots, as well as a variety of voice-driven and screenless applications, such as tools for the disabled or visually impaired, video narration and voice-over, and educational and home-automation solutions.
c) Natural Language Understanding helps analyze text and extract metadata from content such as concepts, entities, keywords, categories, sentiment, emotion, relations, and semantic roles. Apply custom annotation models developed using Watson Knowledge Studio to identify industry-/domain-specific entities and relations in unstructured text with Watson NLU.
a) Cloudant is a fully managed JSON document database that offers independent serverless scaling of provisioned throughput capacity and storage. Cloudant is compatible with Apache CouchDB and accessible through a simple-to-use HTTPS API for web, mobile, and IoT applications.
b) Elasticsearch combines the power of a full text search engine with the indexing strengths of a JSON document database to create a powerful tool for the rich data analysis of large volumes of data. IBM Cloud Databases for Elasticsearch makes Elasticsearch even better by managing everything for you.
c) EDB is a PostgreSQL-based database engine optimized for performance, developer productivity, and compatibility with Oracle. IBM Cloud Databases for EDB is a fully managed offering with 24/7 operations and support.
d) MongoDB is a JSON document store with a rich query and aggregation framework. IBM Cloud Databases for MongoDB makes MongoDB even better by managing everything for you.
e) PostgreSQL is a powerful, open source object-relational database that is highly customizable. It's a feature-rich enterprise database with JSON support, giving you the best of both the SQL and NoSQL worlds. IBM Cloud Databases for PostgreSQL makes PostgreSQL even better by managing everything for you.
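To make the "HTTPS API" mentioned for Cloudant a little more concrete, here is a minimal sketch of how a client might prepare a request that stores one JSON document. It relies on the CouchDB-style convention of `PUT /{database}/{document_id}` with a JSON body, which Cloudant is compatible with. The host name, database name, document, and token below are hypothetical placeholders, and the helper `build_put_document_request` is our own illustration rather than part of any IBM SDK. The request is only built, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_put_document_request(base_url, database, doc_id, document, iam_token):
    """Build (but do not send) a PUT request that stores one JSON document.

    Assumes the CouchDB-style path /{database}/{doc_id} and bearer-token
    authentication; all argument values are supplied by the caller.
    """
    # Percent-encode the path segments so unusual IDs cannot break the URL.
    path = "/".join(urllib.parse.quote(part, safe="") for part in (database, doc_id))
    url = f"{base_url.rstrip('/')}/{path}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {iam_token}",
        },
    )


# Hypothetical values for illustration only.
request = build_put_document_request(
    base_url="https://cloudant.example.com",
    database="orders",
    doc_id="order-1001",
    document={"customer": "acme", "total": 42.5},
    iam_token="hypothetical-iam-token",
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it. A real application would more likely use an official client library and would obtain the token from the provider's identity service.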
Cloud Pak for Data as a Service is offered as a subscription wherein customers pay for what they use and no more. Also, subscription credits can be used for any of the data and AI services that make up Cloud Pak for Data. This affords a lot of flexibility to customers as they can provision and use all the in-scope services. Each of these individual services is priced competitively to reflect the value they offer and include infrastructure costs such as compute, memory, storage, and networking.
Also, IBM has an attractive initiative called HSA to incentivize existing on-premises customers to modernize to Cloud Pak for Data as a Service. The value in this initiative is that existing software customers will receive a significant discount on Cloud Pak for Data as a Service to account for the value of the software that they have already paid for. This, of course, assumes that customers will stop using other software and instead move to Cloud Pak for Data as a Service.
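The consumption-based model described above can be sketched as simple arithmetic: each service reports usage, usage is multiplied by a per-unit rate, and an optional discount (such as one granted under HSA) is applied to the total. The service names come from this chapter, but the units, rates, and discount in the example are invented purely for illustration and do not reflect IBM's actual price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageLine:
    service: str
    units_used: float     # hypothetical metering unit, e.g. capacity unit-hours
    rate_per_unit: float  # hypothetical price per unit


def monthly_charge(usage, discount=0.0):
    """Return the charge for the metered usage, after an optional discount.

    `discount` is a fraction in [0, 1); 0.2 means 20% off the gross amount.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be a fraction in [0, 1)")
    gross = sum(line.units_used * line.rate_per_unit for line in usage)
    return round(gross * (1.0 - discount), 2)


# Made-up usage figures for two services drawn from one subscription.
usage = [
    UsageLine("Watson Machine Learning", units_used=100, rate_per_unit=0.50),
    UsageLine("Db2 Warehouse", units_used=200, rate_per_unit=1.00),
]

print(monthly_charge(usage))                # no discount
print(monthly_charge(usage, discount=0.2))  # hypothetical 20% discount
```

The point of the sketch is that a single pool of subscription credits covers every in-scope service, so adding another service only adds another usage line.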
IBM Cloud Satellite is an extension of IBM Cloud that can run within a client's data center, on an edge server, or on any cloud infrastructure. IBM has been porting all its cloud services to Kubernetes, enabling the different services to function consistently. IBM Cloud Satellite extends and leverages the same underlying concept, running on Red Hat OpenShift as its Kubernetes management environment.
To be more precise, every Cloud Satellite location is an instance of IBM Cloud running on local hardware or any third-party public cloud (such as AWS or Azure). Furthermore, each Cloud Satellite location is connected to the IBM Cloud control plane. This connection back to the IBM Cloud control plane provides audit, packet capture, and visibility to the security team, and a global view of applications and services across all satellite locations. IBM Cloud Satellite Link connects IBM Cloud to its satellite location and offers visibility into all the traffic going back and forth.
IBM's strategy is to bring Cloud Pak for Data managed services to third-party clouds and dedicated on-premises infrastructure using Cloud Satellite. As of 2021, a subset of Cloud Pak for Data managed services, including Jupyter Notebook and DataStage, is available on Cloud Satellite to be deployed on third-party clouds. The packaging and pricing for these services are no different from those on IBM Cloud, and the price includes the third-party cloud infrastructure needed to run the services. The ultimate objective here is to deliver a managed service on the infrastructure of your choice.
So, how does this all work? Once a Satellite location is established, all workloads in these Satellite locations run on Red Hat OpenShift – specifically IBM Cloud's managed OpenShift service. Services integrated into Cloud Pak for Data as a Service – such as DataStage – can deploy runtimes to these Satellite locations so that these runtimes can be closer to the data or apps these services need to integrate with. A secure connection, called Satellite Link, provides communication between Cloud Pak for Data as a Service and its remote runtimes.
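The placement idea in the preceding paragraph can be illustrated with a toy model: if a Satellite location exists where the data lives, the runtime is placed there; otherwise it runs in IBM Cloud. The location labels, the `DataSource` type, and the `choose_runtime_location` function are hypothetical names introduced for this sketch. They are not part of any IBM API, and real placement is configured through the service itself.

```python
from dataclasses import dataclass

# Label used in this sketch for the default, non-Satellite runtime.
IBM_CLOUD = "ibm-cloud"


@dataclass(frozen=True)
class DataSource:
    name: str
    location: str  # where the data lives, e.g. "aws-us-east-1" or "on-prem-dc1"


def choose_runtime_location(source, satellite_locations):
    """Run next to the data when a Satellite location exists there.

    Falls back to IBM Cloud when no Satellite location matches.
    """
    if source.location in satellite_locations:
        return source.location
    return IBM_CLOUD


# Hypothetical Satellite locations already established by the customer.
satellite_locations = {"aws-us-east-1", "on-prem-dc1"}

sales = DataSource("sales-db", location="aws-us-east-1")
archive = DataSource("archive-store", location="azure-westeurope")

print(choose_runtime_location(sales, satellite_locations))
print(choose_runtime_location(archive, satellite_locations))
```

In this model, the first source is processed in place while the second falls back to IBM Cloud, which mirrors the goal of keeping runtimes close to the data wherever a Satellite location is available.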
The value proposition of IBM Cloud Satellite specifically for data and AI workloads is threefold:
Easy to provision and scale up and down.
Seamless upgrades with negligible downtime.
Realize the benefits of a managed service.
Share insights without moving data: handle challenges with data sovereignty.
Comply with GDPR, CCPA, and so on.
Now that we have learned the basics of IBM Cloud Satellite and IBM's approach to multi-cloud, let's look at the data fabric.
The cloud is transforming every business, and multi-cloud is the future. To be successful, enterprises will have to access, govern, secure, transform, and manage data across private and public clouds. This is what IBM is addressing with the data fabric, a significant effort that will likely span multiple years. A solid foundation for a data fabric starts with centralized metadata management, data governance, and data privacy, whose benefits are amplified when augmented by automation.
The following image showcases data fabric for a multi-cloud future:
Managed services and SaaS are increasingly a must, and Cloud Pak for Data has a comprehensive and evolving set of capabilities that addresses end-to-end customer requirements. With simplified packaging and pricing and incentives to modernize, IBM makes it easy for existing and new clients to embrace Cloud Pak for Data as a Service. In this chapter, you learned about the different deployment options for Cloud Pak for Data, the supported third-party clouds, and its managed service (the Cloud Pak for Data SaaS offering). You have also learned that IBM's vision is to deliver the Cloud Pak for Data managed service on any infrastructure of your choosing, including third-party cloud providers. This is made possible using Cloud Satellite, IBM's answer to AWS Outposts and Azure Stack/Arc.
Finally, IBM is making significant investments in its data fabric to access, govern, manage, and secure data and AI workloads across clouds to address the evolving requirements of a multi-cloud future.
In the next chapter, we will learn about the Cloud Pak for Data ecosystem, which complements and extends the capabilities included in the base Cloud Pak for Data.