Understanding Cloud Storage Fundamentals

The design of cloud storage is very similar to other cloud architectures in terms of self-service, elasticity, and scalability. Cloud storage is a technique of abstracting storage behind a well-defined interface so it can be managed in a self-service manner. In addition, cloud storage needs to support a multi-tenant architecture so that each consumer’s cloud data is managed in isolation from other consumers’ cloud data. One of the most important characteristics of cloud storage is how it can dynamically interface with other cloud services such as SaaS, PaaS, IaaS, and BPaaS.

Attaching storage to systems is not a new idea; it has been done since the first systems rolled off the assembly line. Today, most storage environments are connected with systems through a standard interface called SCSI (Small Computer System Interface). SCSI is a very mature protocol that is widely adopted because of its reliability and performance.

Four key fundamentals of cloud storage (access protocols, delivery options, functions, and benefits) are addressed below:

Cloud storage access protocols

One important issue in cloud storage is the speed and ease of accessing the data when it’s needed. In order for cloud storage to be a viable alternative to on-premises data storage, you need to be able to access your data at a competitive cost and at a time that is appropriate for the situation. Today, there are four types of cloud storage access methods:

• Web services application programming interfaces (APIs): These use RESTful APIs (following the principles of Representational State Transfer) to integrate with applications.

• File-based protocols: These protocols are used to transfer files and provide integration independent of the application being connected. They also provide faster integration than web service APIs. The main types are:

    • Network File System (NFS)

    • Common Internet File System (CIFS)

    • File Transfer Protocol (FTP)

• Block-based APIs: These use Internet SCSI (iSCSI) to connect a front end to storage middleware that supports services such as data replication and data reduction.

• Web Distributed Authoring and Versioning (WebDAV): This is an extension of Hypertext Transfer Protocol (HTTP).

The most common method for accessing cloud storage is by using web service APIs such as REST (Representational State Transfer). Cloud storage vendors implement this technology because it’s dynamic and simple to use in the cloud. In addition, because of the use of virtualization in cloud environments, there’s a requirement for a stateless access protocol: one in which each request carries everything needed to process it, so that no session is tied to a particular server. Web service APIs support this requirement for statelessness. This access method is used by Amazon Simple Storage Service (Amazon S3), Windows Azure (Microsoft’s Cloud Platform), and others. However, web service APIs need to be integrated with a specific application when used for cloud storage, which can create some challenges. If you want to avoid the need to integrate with an application, file-based protocols and block-based APIs can be used as alternative access methods. Another connection protocol is WebDAV, an HTTP-based specification that some providers use as a cloud storage interface.
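As a rough sketch of the stateless, REST-style access described above, the following Python builds (but does not send) the requests that would store and fetch an object. The host name storage.example.com, the bucket/key URL layout, and the bearer-token header are assumptions made for illustration; real services such as Amazon S3 define their own URL schemes and authentication.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://storage.example.com"


def object_url(bucket, key):
    """Build the URL that identifies one stored object."""
    return f"{BASE_URL}/{urllib.parse.quote(bucket)}/{urllib.parse.quote(key)}"


def build_put_request(bucket, key, data, token):
    """Describe an upload. Everything the server needs is in the request."""
    req = urllib.request.Request(object_url(bucket, key), data=data, method="PUT")
    req.add_header("Authorization", f"Bearer {token}")
    req.add_header("Content-Type", "application/octet-stream")
    return req


def build_get_request(bucket, key, token):
    """Describe a download. No earlier request or session is assumed."""
    req = urllib.request.Request(object_url(bucket, key), method="GET")
    req.add_header("Authorization", f"Bearer {token}")
    return req
```

Because each request names the object and carries its own credentials, any server behind the endpoint could handle it. Passing a request to urllib.request.urlopen would send it, which is left out here because the endpoint is fictional.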

Delivery options for cloud storage

How will your cloud provider deliver your storage capability? You can use an appliance or connect to a public or remote storage service.

Although latency is a big issue for primary (tier 1) cloud storage, particularly for data used frequently, vendors are currently offering a different class of products called hybrid cloud storage solutions that may ultimately address primary storage. (Because we talk about hybrid clouds in general throughout this book, some of the terminology may be confusing, but bear with us.) The idea is to use local and cloud-based resources to address performance issues associated with storage in the cloud. Generally, these offerings consist of two things:

• An appliance that is a physical or virtual server where the hardware and software are preconfigured so the user doesn’t have to understand the details

• A connection to a remote storage service

The appliance intelligently handles the movement of data between local storage and the cloud; to the end user, all of the data seems to be in one place.

A cache is a block of memory for temporary storage on the appliance that provides a high-speed buffer between your client and the cloud service. The cache uses a host of algorithms to keep the most frequently used data on the local, expensive hardware. For read requests, these algorithms weigh attributes such as the age of the data, the time since it was last accessed, and the time since it was last updated to decide what stays local. For write requests, the appliance may write the data locally on the machine and then burst it out to the cloud storage provider.
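The caching behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor’s implementation: it assumes a least-recently-used eviction rule (only one of the attributes a real appliance might weigh), and the ApplianceCache class and the dictionary standing in for the cloud service are hypothetical.

```python
from collections import OrderedDict


class ApplianceCache:
    """Toy appliance cache: keeps recently used data local, writes back later."""

    def __init__(self, cloud, capacity=2):
        self.cloud = cloud            # dict standing in for the remote service
        self.capacity = capacity      # how many items fit on local hardware
        self.local = OrderedDict()    # key -> data, least recently used first
        self.dirty = set()            # keys written locally but not yet uploaded

    def read(self, key):
        if key in self.local:         # cache hit: serve from local storage
            self.local.move_to_end(key)
            return self.local[key]
        data = self.cloud[key]        # cache miss: fetch from the cloud
        self._store(key, data)
        return data

    def write(self, key, data):
        self._store(key, data)        # write locally first
        self.dirty.add(key)           # remember to upload it later

    def flush(self):
        """Burst all pending writes out to the cloud."""
        for key in sorted(self.dirty):
            self.cloud[key] = self.local[key]
        self.dirty.clear()

    def _store(self, key, data):
        self.local[key] = data
        self.local.move_to_end(key)
        while len(self.local) > self.capacity:
            old_key, old_data = self.local.popitem(last=False)
            if old_key in self.dirty:  # never drop data that was not uploaded
                self.cloud[old_key] = old_data
                self.dirty.discard(old_key)
```

Reads that hit the cache never touch the cloud, and writes return as soon as the local copy is stored; the upload happens on flush or when an unsaved item is evicted.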

The data is generally encrypted when it’s transported. It is also typically deduplicated before it’s sent to the provider, which reduces the amount of data that has to be stored and transferred and makes it faster to retrieve. Vendors such as Nasuni, StorSimple, CTERA Networks, and others are providing solutions in this space.
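Deduplication can be pictured as storing each distinct piece of data once and keeping a list of references to rebuild the original. The sketch below assumes fixed-size chunks identified by their SHA-256 digest; real products may use variable-size chunks and other techniques, and the function names here are hypothetical.

```python
import hashlib


def deduplicate(data, chunk_size=4):
    """Split data into chunks and keep only one copy of each distinct chunk."""
    store = {}    # digest -> chunk bytes (each unique chunk stored once)
    recipe = []   # ordered digests needed to rebuild the original data
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe):
    """Reassemble the original data from the chunk store and the recipe."""
    return b"".join(store[digest] for digest in recipe)
```

Only the chunks in the store need to cross the network, so repeated content costs one transfer instead of many.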

Functions of cloud storage

The type of information you need to store and how quickly you need to access data both have an impact on the type of storage you will use. You can use policy-based replication to enable more granular control over how and where data is stored.

Cloud storage can serve multiple purposes:

• General-purpose storage for day-to-day or periodic use

• Data protection and continuity, which can include data replication and backup and restore functionality

• Archive and records management, meaning recoverable long-term data retention to support compliance and regulatory requirements
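Policy-based replication can be thought of as a lookup from the purpose of the data to a placement rule. The sketch below is purely illustrative: the ReplicationPolicy fields, the location names, and every number in it are hypothetical values chosen for the example, not recommendations or any provider’s defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationPolicy:
    copies: int           # how many replicas to keep
    locations: tuple      # where the replicas should live
    retention_days: int   # how long the data must stay recoverable


# Hypothetical policies, one per purpose listed above.
POLICIES = {
    "general": ReplicationPolicy(2, ("local", "cloud-a"), 30),
    "protection": ReplicationPolicy(3, ("local", "cloud-a", "cloud-b"), 90),
    "archive": ReplicationPolicy(2, ("cloud-a", "cloud-b"), 2555),
}


def policy_for(purpose):
    """Return the policy for a purpose, falling back to general storage."""
    return POLICIES.get(purpose, POLICIES["general"])
```

Keeping the rules in one table means a change in requirements, such as a longer retention period for archives, is a one-line edit rather than a change to every application that stores data.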

Benefits of cloud storage

Some of the benefits of cloud storage include:

• Agility: The elastic nature of the cloud enables you to gain potentially unlimited storage in an on-demand model.

• Fewer physical devices to purchase and maintain: When you’re storing data in a data center, you have to plan for the servers that will be part of this storage solution. This means you need to plan for purchasing the machines and maintaining them during their lifecycle. Additionally, you must make sure that you have enough space and can meet power requirements. In the cloud, you don’t have to purchase physical devices or deal with environmental issues. The cloud provider should do this for you (but it pays to do your homework on the services that your provider offers).

• Disaster recovery: The cloud can serve as a good replacement for tape or other backups and can minimize concerns about your own data center capacity to support your backups. Instead of continuing to expand your on-premises storage, your information can be backed up to the cloud. If your systems go down, you can retrieve your data from the cloud.

• Cost: Although direct-attached storage (DAS) is relatively inexpensive, network-attached storage (NAS) and storage area network (SAN) devices require significant capital expenditures. The cloud storage model is based on usage, so you only pay for what you use. This is similar to how you use your telephone: generally speaking, you pay for what you use.

Of course, no solution comes without drawbacks. With off-premises storage, performance depends on the connectivity and latency between your LAN/WAN and your cloud provider (see more on this topic in the next section). Additionally, you need to consider the security measures that your cloud provider puts in place and the availability of your cloud provider.
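To make the connectivity point concrete, a back-of-the-envelope estimate of retrieval time is the round-trip delay plus the size of the data divided by the available bandwidth. The function below is a simplified sketch under that assumption; it ignores protocol overhead, congestion, and provider-side delays, and the figures in the usage note are illustrative inputs rather than measurements.

```python
def estimate_transfer_seconds(size_bytes, bandwidth_bits_per_s, round_trip_s=0.0):
    """Rough time to move data over a link: latency plus size over bandwidth."""
    if bandwidth_bits_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    size_bits = size_bytes * 8            # links are rated in bits per second
    return round_trip_s + size_bits / bandwidth_bits_per_s
```

For example, a 1 GB file (1,000,000,000 bytes) over a 100 Mbps link with a 50 ms round trip comes out at about 80 seconds, almost all of it bandwidth rather than latency. For many small requests the balance shifts, and the round-trip delay dominates.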

Note: Chapter 11 discusses managing your data; Chapter 15 covers cloud security; and Chapter 17 has more information about service level agreements and managing your cloud provider.
