Chapter 3

Storage Networking

Images

CERTIFICATION OBJECTIVES

3.01     Storage Types and Technologies

3.02     Storage Access Protocols

3.03     Storage Provisioning

3.04     Storage Protection

Images     Two-Minute Drill

Q&A   Self Test


Storage is the foundation of a successful infrastructure. The traditional method of storing data is changing with the emergence of cloud storage. Storage is the instrument that is used to record and play back the bits and bytes that the compute resources process to provide their functions for delivering cloud services and applications.

Cloud storage is being leveraged for a wide range of enterprise functions from end user computing to enterprise storage and backup. Furthermore, cloud storage is a platform for explosive growth in organizational data because it is highly available and almost infinitely scalable. Understanding the advantages and disadvantages of storage types and technologies is a key concept for IT and cloud professionals because it will be your responsibility to help the organization understand the risks and the benefits of moving to cloud storage.

CERTIFICATION OBJECTIVE 3.01

Storage Types and Technologies

Just as there are many different environments in which computers are used, there are many types of storage to accommodate the needs of each of those environments. Some storage types are designed to meet primary organizational storage concerns such as cost, performance, reliability, and data security. Figure 3-1 displays a graphical comparison of three storage types, DAS, SAN, and NAS, which we explore in more detail shortly. A fourth type, object storage, is covered as well.

FIGURE 3-1   Three major storage types: DAS, SAN, and NAS

Images

In addition to the four storage types, this section also covers two storage technologies. These technologies, deduplication and compression, improve storage efficiency by removing redundant data.

Direct Attached Storage (DAS)

Direct attached storage (DAS) is one or more drives that are connected to a machine as additional block-level storage. Some storage protocols that are used to access these storage devices are eSATA, USB, FC, SCSI, and SAS. USB and eSATA are most frequently utilized by desktops and laptops to connect to DAS, while companies typically connect DAS to servers using FC, SCSI, or SAS.

DAS is typically the least expensive storage option available for online storage (as opposed to offline storage such as a tape). As its name suggests, this type of storage is directly attached to the host computer that utilizes it and does not have to traverse a network to be accessed by that host. Direct attached storage is made available only to that local computer and cannot be used as shared storage. Shared storage, in this context, refers to storage that is made available to multiple machines at the block level. It is possible for a machine to share out storage that was provided to it by DAS.

Images

Direct attached storage (DAS) cannot provide shared storage to multiple hosts.

Storage Area Network (SAN)

A storage area network (SAN) is a high-performance option that is employed by many data centers as a high-end storage solution with data security capabilities and a very high price tag to go along with it. A SAN is a storage device that resides on its own network and provides block-level access to computers that are attached to it.

The disks that are part of a SAN are combined into RAID groups for redundancy and higher performance. These RAID groups are then carved up into subdivisions called logical unit numbers (LUNs) that provide the block-level access to specified computers. LUNs can be interacted with just like a logical drive.

SANs are capable of very complex configurations, allowing administrators to divide storage resources and access permissions very granularly and with very high-performance capabilities. However, SAN maintenance and operations can be complicated, often requiring specialized skill sets and knowledge of proprietary technology (because each SAN solution is vendor specific). The role of SANs is mission critical, so there is little if any margin for error. Many storage administrators go to specialized training for their specific SAN solution, and they spend much of their time in the workplace giving SANs constant monitoring and attention. These administrative burdens add to the cost of deploying a SAN solution.

SANs are also able to provide shared storage or access to the same data at the same time by multiple computers. This is critical for enabling high availability (HA) in data center environments that employ virtualization solutions that require access to the same virtual machine files from multiple hosts. Shared storage allows hosts to perform migrations of virtual machines without any downtime, as discussed in more detail in Chapter 5.

Computers require a special adapter to communicate with a SAN, much like they need a network interface card (NIC) to access their data networks. The network that a SAN utilizes is referred to as a fabric and can be composed of fiber-optic cables, Ethernet adapters, or specialized SCSI cables.

A host bus adapter (HBA) is the most common device used to connect a machine to a SAN. An HBA is usually a PCI add-on card that can be inserted into a free slot in a host and then connected either to the SAN disk array directly or, as is more often the case, to a SAN switch. Virtual machines can use a virtual HBA, which emulates a physical HBA and allocates portions of the physical HBA’s bandwidth to virtual machines. Storage data is transferred from the disk array over the SAN to the host via the HBA, which prepares it for processing by the host’s compute resources.

There are two other adapters that may be used to connect to a storage network. A converged network adapter (CNA) can be used in lieu of an HBA. CNAs are computer expansion cards that can function as an HBA or a NIC. NetApp has a proprietary adapter called the universal target adapter (UTA). A UTA has ports for one or more Ethernet or Fibre Channel transceivers and can support Ethernet transceivers up to 10 Gbps and Fibre Channel transceivers at native Fibre Channel speeds.

In addition to SANs, organizations can use a virtual SAN (VSAN), which can consolidate separate physical SAN fabrics into a single larger fabric, allowing for easier management while maintaining security. A VSAN allows identical Fibre Channel IDs to be used at the same time within different VSANs. VSANs allow for user-specified IDs that are used to identify the VSAN. VSANs can also span data centers with the use of VXLANs, discussed more in Chapter 4, or with encapsulation over routable network protocols.

HBAs can usually increase performance significantly by offloading the processing required to consume storage data, sparing the host's processor cycles. This means that an HBA enables greater efficiency for its host by allowing its processor to focus on running the functions of its operating system (OS) and applications instead of on storage I/O.

Network Attached Storage (NAS)

Network attached storage (NAS) offers an alternative to storage area networks for providing network-based shared storage options. NAS devices utilize TCP/IP networks for sending and receiving storage traffic in addition to data traffic. NAS provides file-level data storage that can be connected to and accessed from a TCP/IP network. Because NAS utilizes TCP/IP networks instead of a separate SAN fabric, many IT organizations can utilize existing infrastructure components to support both their data and storage networks. This use of common infrastructure can greatly cut costs while providing similar shared storage capabilities. Expenses are reduced for a couple of reasons:

Images   Data networking infrastructure costs significantly less than storage networking infrastructure.

Images   Shared configurations between data and storage networking infrastructure enable administrators to support both with no additional training or specialized skill sets.

NAS uses file sharing protocols to make shares available to users across a network. NAS systems typically support both the Common Internet File System (CIFS)/Server Message Block (SMB) protocol for Windows and the Network File System (NFS) protocol for Linux. NAS may also support uploading and downloading files via FTP or its secure variants, such as FTPS (FTP over SSL/TLS) and SFTP (file transfer over SSH).

One way to differentiate NAS from a SAN is that NAS appears to the client operating system as a file server, whereas a SAN appears to the client operating system as a disk (typically a LUN) that is visible in disk management utilities. This allows NAS to use Universal Naming Convention addressable storage. Network attached storage leverages protocols such as TCP/IP and iSCSI, both of which are discussed later in this chapter in more detail.

Images

A storage area network (SAN) provides much better performance than network attached storage (NAS).

NAS also differs from SAN in that NAS natively allows for concurrent access to shares. However, there are some functions that can only be performed on block storage such as booting from SAN storage or loading applications from SAN storage. SAN connections usually offer much higher throughput to storage for high-performance needs.

Object Storage

Traditional file systems tend to become more complicated as they scale. Take, for example, a system that organizes pictures. Thousands of users may be requesting pictures from the site at the same time, and those pictures must be retrieved quickly. The system must track the location of each picture and, in order to retrieve them quickly, maintain multiple file systems and possibly multiple NAS devices for that data. As the user base grows further from the data center, latency issues can come up, so the data must be replicated to multiple sites and users directed to the location that has the lowest latency. The application now tracks the NAS where each picture resides, the location on that NAS, and which country the user is directed to, and it must keep the data synchronized between each site by tracking changes to the pictures. This application complexity makes applications harder to maintain and results in more processing to perform normal application functions. The solution is object storage.

Object storage is a storage system that abstracts the location and replication of data, allowing the application to become more simplified and efficient. Traditional file systems store data in blocks that are assembled into files, but the system does not know what each file actually contains. That is the responsibility of the application. However, object storage knows where the data is located and what the data is by utilizing metadata as a file organization method. For example, an application residing on top of object storage can ask the storage for the picture of John Doe at Middleton Beach on August 4, 2017, and the object storage will retrieve it for the application.

Object storage is often used for media files such as music, pictures, and video in cloud storage. Object storage has lower storage overhead because of the way data is organized. Objects have a unique identifier, but the application requesting the data does not need to know where it is stored. Object storage further avoids overhead by using a flat organization method rather than a hierarchical storage system. Hierarchical storage systems can only expand so far until suffering from latency in traversing directory trees, but object storage does not utilize such directory trees.

Object storage is heavily reliant upon metadata. Data and metadata are stored separately, and metadata can be expanded as needed in order to track additional details about the data. Object storage indexes metadata so that the data can be located using multiple criteria. Object stores are also application agnostic, supporting multiple applications for the same data set. In this way, the same data can be utilized for multiple applications, avoiding redundancy in storage and complexity in managing the data.
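As a rough illustration of these ideas, the following Python sketch models an object store with a flat namespace and a metadata index that allows data to be located by its attributes rather than by a directory path. The class and method names here are hypothetical, invented for this example, and do not come from any real object storage API:

```python
import uuid

class ObjectStore:
    """Toy object store: flat namespace keyed by unique IDs,
    with a separate metadata index for lookup by attribute."""

    def __init__(self):
        self._objects = {}   # object_id -> data bytes
        self._metadata = {}  # object_id -> dict of attributes

    def put(self, data, **metadata):
        object_id = str(uuid.uuid4())  # unique identifier, no directory path
        self._objects[object_id] = data
        self._metadata[object_id] = metadata
        return object_id

    def get(self, object_id):
        return self._objects[object_id]

    def find(self, **criteria):
        """Locate objects by metadata rather than by location."""
        return [oid for oid, meta in self._metadata.items()
                if all(meta.get(k) == v for k, v in criteria.items())]

store = ObjectStore()
oid = store.put(b"<jpeg bytes>", subject="John Doe",
                place="Middleton Beach", date="2017-08-04")
matches = store.find(place="Middleton Beach")
assert store.get(matches[0]) == b"<jpeg bytes>"
```

Note that the application never deals with a file path; it asks for "the picture of John Doe at Middleton Beach" via metadata, as in the example above.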

With object storage, capacity planning is done at the infrastructure level rather than the application level. This means that the application owners and the application itself do not need to monitor its capacity utilization, and it allows the system to be much more scalable.

Object storage is scalable because it uses a scale-out method that creates new object stores to handle additional data. These stores can exist at multiple sites, and redundancy can be defined per data type or per node so that the object store replicates data accordingly both to ensure that it is available if a single copy is lost and to avoid latency issues with satisfying application requests.

However, you should be aware that object storage requires changes to the application. You can usually move between different storage and database models by changing just a few parameters in the application, but moving to object storage requires the application to interface with the storage differently, so developers will need to modify their code accordingly. The change from traditional storage to object storage, therefore, is a dev change rather than an ops change.

Images

Object storage provides better scalability than hierarchical file systems.

Deduplication Technologies

Deduplication technologies remove redundant data from a storage system to free up space. There are two forms of deduplication technologies: file-level deduplication and block-level deduplication.

File-level deduplication hashes each file on a file system and stores those hashes in a table. If the system encounters a file with a hash that is already in its table, it places a pointer to the existing file rather than storing the data twice. Imagine a document that is e-mailed to 100 people at an office. Because each recipient who saves that file would be storing a duplicate on the system, file-level deduplication would keep only one copy of the data and replace the remaining copies with pointers. File-level deduplication can remove many duplicates, but it is not nearly as efficient as block-level deduplication.

Block-level deduplication hashes each block that makes up a file. This allows deduplication to take place on the pieces of a file, so deduplication does not require that a file be 100 percent identical to perform deduplication. For example, a user may store nine versions of a spreadsheet. Each version is slightly different from the others as new information was added to it. File-level deduplication would see each file hash as different, so no deduplication would be performed. However, block-level deduplication would see that many of the blocks are the same between these different versions and would store only one block for each duplicate. In this example, block-level deduplication could save up to 90 percent of the space that otherwise would be used.
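The block-level approach can be sketched in Python using a hash table of blocks. The function name and the 4KB block size are illustrative assumptions, not taken from any particular deduplication product:

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size; real systems vary

def dedupe_blocks(files):
    """Store each unique block once; each file becomes a list of
    block hashes (pointers) into the shared block store."""
    block_store = {}   # hash -> block bytes, stored only once
    file_table = {}    # filename -> ordered list of block hashes
    for name, data in files.items():
        hashes = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            block_store.setdefault(digest, block)  # duplicates not stored again
            hashes.append(digest)
        file_table[name] = hashes
    return block_store, file_table

# Nine "versions" of a spreadsheet that share most of their blocks:
base = b"".join(bytes([b]) * BLOCK_SIZE for b in range(9))
files = {f"v{i}.xls": base + bytes([100 + i]) * BLOCK_SIZE for i in range(9)}
store, table = dedupe_blocks(files)
raw_blocks = sum(len(hashes) for hashes in table.values())
print(f"{raw_blocks} block references, {len(store)} unique blocks stored")
```

In this run, 90 block references collapse to 18 stored blocks, because the nine versions differ from each other only in their final block, much like the spreadsheet example above.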

Compression Technologies

Compression is another method used to reduce space. Some forms of compression result in a loss of information. These forms are known as lossy compression. Other types, known as lossless compression, do not result in a loss of information.

In most cases, you will want to employ lossless compression, but there are cases, such as in the transmission of audio or video data, where lossy compression may be utilized to transmit data at a lower quality level when sufficient bandwidth for higher quality is not available or when it would interrupt more time-sensitive data streams. Lossy compression might also be used on a website in order to increase the speed at which site objects load.

Lossless compression uses mathematical formulas to identify areas of files that can be represented in a more efficient format. For example, an image might have a large section that is all one color. Rather than storing the same color value repeatedly for that section, the lossless compression algorithm would note the range of pixels that contain that color and the color code.
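Run-length encoding is one of the simplest lossless schemes and matches the single-color example above: a run of identical values is stored as a count and the value, rather than value by value. A minimal Python sketch (the function names are invented for illustration):

```python
def rle_compress(pixels):
    """Run-length encoding: represent repeated values as
    (count, value) pairs -- a simple lossless compression."""
    runs = []
    for value in pixels:
        if runs and runs[-1][1] == value:
            runs[-1][0] += 1  # extend the current run
        else:
            runs.append([1, value])  # start a new run
    return runs

def rle_decompress(runs):
    out = []
    for count, value in runs:
        out.extend([value] * count)
    return out

# A scanline that is mostly one color compresses dramatically:
scanline = ["blue"] * 500 + ["white"] * 3 + ["blue"] * 497
runs = rle_compress(scanline)
assert rle_decompress(runs) == scanline  # lossless: nothing is lost
print(f"{len(scanline)} pixels stored as {len(runs)} runs")
```

Here 1,000 pixel values compress to three runs, and decompression reproduces the original exactly, which is what distinguishes lossless from lossy compression.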

CERTIFICATION OBJECTIVE 3.02

Storage Access Protocols

Now that you have learned about the various storage technologies that are available, we can turn our attention to the access protocols and applications that utilize these technologies to transmit, shape, and prioritize storage information between hosts and their storage devices.

Fibre Channel (FC)

Fibre Channel is a technology for transmitting data between computers at data rates of up to 128 Gbps. IT organizations have made Fibre Channel the technology of choice for interconnecting storage controllers and drives when architecting infrastructures that have high-performance requirements. Fibre Channel architecture is composed of many interconnected individual units, which are called nodes. Each of these nodes has multiple ports, and these ports connect the nodes in a storage unit architecture using one of three different interconnection topologies: point-to-point, arbitrated loop, and switched fabric. Fibre Channel also can transmit over long distances. When deployed using optical fiber, it can transmit between devices up to about six miles apart. While Fibre Channel is the transmission medium, it still utilizes SCSI riding on top of it for its commands.

Images

Fibre Channel is deployed when the highest levels of performance are required.

Fibre Channel Protocol

The SCSI commands that ride atop the Fibre Channel transport are sent via the Fibre Channel Protocol (FCP). In order to increase performance, this protocol takes advantage of hardware that can utilize protocol offload engines (POEs). This assists the host by offloading processing cycles from the CPU, thereby improving system performance.

FCP uses addresses to reference nodes, ports, and other entities on the SAN. Each HBA has a unique world wide name (WWN), which is an 8-byte identifier similar to an Ethernet MAC address on a network card. There are two types of WWNs on an HBA: a world wide node name (WWNN), which can be shared by either some or all of the ports of a device, and a world wide port name (WWPN), which is unique to each port. Fibre Channel switches also have a WWPN for each switch port. Other devices can be issued a world wide unique identifier (WWUI) so that they can communicate on the SAN.

The frames in Fibre Channel Protocol consist of three components: an encapsulating header called the start-of-frame (SOF) marker, the data frame itself, and the end-of-frame (EOF) marker. This encapsulated structure enables the FC frames to be transported across other protocols, such as TCP, if desired.

Fibre Channel over Ethernet (FCoE)

Fibre Channel over Ethernet (FCoE) enables the transport of Fibre Channel traffic over Ethernet networks by encapsulating Fibre Channel frames within Ethernet frames. Fibre Channel over Ethernet can utilize Ethernet technologies up to 10 Gigabit Ethernet (10GigE) networks and higher speeds as they are developed, while still preserving the FC protocol.

Ethernet

Ethernet is an established standard for connecting computers to a local area network (LAN). Ethernet is a relatively inexpensive and reasonably fast LAN technology, with speeds ranging from 10 Mbps to 10 Gbps. Because it enables high-speed data transmission and is relatively inexpensive, Ethernet has become ubiquitous in IT organizations and the Internet. Ethernet technology operates at the physical and data link layers of the OSI model (layers 1 and 2). Although it is capable of high speeds, it is limited by both the length and the type of cables over which it travels. The Ethernet standard divides its data traffic into groupings called frames. These frames are utilized by storage protocols to deliver their data from one point to another, such as from a NAS device to a server.

TCP/IP

Internet Protocol (IP) is a protocol that operates at the network layer of the OSI model (layer 3) and provides unique addresses and traffic-routing capabilities. Computers utilizing the IPv4 protocol are addressed using dotted decimal notation with four octets divided by dots. As the name suggests, IP is the protocol that enables the Internet. Like Ethernet networks, it is ubiquitous in IT departments and provides a proven and relatively inexpensive and well-understood technology on which to build storage networks.

Transmission Control Protocol (TCP) is a protocol that provides reliable transport of network data through error checking. TCP uses ports that are associated with certain services and other ports that can be dynamically allocated to running processes and services. TCP is most often combined with IP and it operates at the transport layer of the OSI model (layer 4).

Internet Fibre Channel Protocol

Internet Fibre Channel Protocol (iFCP) enables the transport of Fibre Channel traffic over IP networks by translating FC addresses to IP addresses and FC frames to IP packets. iFCP reduces overhead as compared with other protocols that transport FC over IP because it does not use tunneling to connect FC devices.

Internet Small Computer System Interface (iSCSI)

iSCSI is a protocol that utilizes serialized IP packets to transmit SCSI commands across IP networks and enables servers to access remote disks as if they were locally attached. iSCSI “initiator” software running on the requesting entity converts disk block-level I/O into SCSI commands that are then serialized into IP packets that traverse any IP network to their targets. At the destination storage device, the iSCSI packets are interpreted by the storage device array into the appropriate commands for the disks it contains. Figure 3-2 shows an example of how multiple servers can leverage iSCSI to connect to shared storage over an IP network.

FIGURE 3-2   Using iSCSI over an IP network

Images

iSCSI is limited by the transmission speeds of the Ethernet network it travels over, so when administrators design iSCSI networks, they should isolate the storage traffic from the data network traffic. Although its performance is not as high as that of a Fibre Channel SAN, iSCSI can be an inexpensive entry into shared storage for IT departments or a training ground, using repurposed equipment, for administrators who want to get hands-on experience with storage networking. iSCSI can be implemented on a NAS device or on a general-purpose machine. iSCSI's flexibility and ease of implementation make it a popular and versatile storage protocol.

The iSCSI address given to an initiator is known as an initiator qualified name (IQN). Initiators reside on clients in an iSCSI network and initiators connect to targets such as storage resources over the iSCSI network. An IQN uses the following naming convention: iqn.yyyy-mm.naming-authority:unique name
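A small Python helper can make the IQN convention concrete. The function names and the regular expression here are illustrative assumptions, and the check is only a loose shape test, not a full validation of the iSCSI naming rules:

```python
import re

def make_iqn(year, month, naming_authority, unique_name):
    """Build an iSCSI qualified name following the convention
    iqn.yyyy-mm.naming-authority:unique-name, where the naming
    authority is a reversed domain name."""
    return f"iqn.{year:04d}-{month:02d}.{naming_authority}:{unique_name}"

# Loose shape check: iqn. + year-month + authority + :name
IQN_PATTERN = re.compile(r"^iqn\.\d{4}-\d{2}\.[a-z0-9][a-z0-9.-]*:\S+$")

def looks_like_iqn(name):
    return bool(IQN_PATTERN.match(name))

iqn = make_iqn(2004, 1, "com.example", "storage.disk1")
print(iqn)  # iqn.2004-01.com.example:storage.disk1
assert looks_like_iqn(iqn)
```

The year and month in the name record when the naming authority's domain was registered, which keeps IQNs globally unique even if a domain later changes hands.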

Images

While working at a small IT shop that wanted to explore the use of virtualization, our focus was to create some solutions for our customers that promised higher availability. We needed to get some shared storage that we could use to enable some of the automatic migration and performance tuning capabilities of our virtualization platform. We did not, however, have much of a budget to spend on research and development. We wound up repurposing a Gigabit Ethernet switch, some category 6 Ethernet cable, and a couple of retired servers to set up our test environment with very little cost. Using the built-in capabilities of the operating system and some open-source software, we had everything we needed to build out an entire lab and evaluate the capabilities of our proposed solutions.

CERTIFICATION OBJECTIVE 3.03

Storage Provisioning

Now that you understand the technologies, protocols, and applications for moving storage data around networks, we will explore how that data is presented to computers. Data can be made available to computers in a number of ways, with varying degrees of availability and security.

Performance

Everyone has an expectation of performance for a system, and these expectations tend to increase as computing power increases. Storage systems also cannot stay the same. They must keep up with application and end-user demand. In order to do this, storage systems need a way to measure performance in a meaningful way. The most common method is input/output operations per second (IOPS). Storage must be provisioned to provide the required IOPS to the systems that utilize that storage. This requires having an understanding of the read and write throughput that different RAID sets and drive types can produce as well as the performance enhancements that can be gained from storage tiering.

IOPS

IOPS measures the number of input/output operations a storage system can perform each second. (Throughput, by contrast, measures how much data moves over a period of time and is expressed in units such as megabytes per second, or MBps.) Drives are typically rated regarding the IOPS they can support. Hard disk drives may provide values for average latency and average seek time, or those values can be estimated from their spindle speed. The formula for IOPS is as follows:

IOPS = 1 / (average latency + average seek time)

For example, if a SATA drive running has an average latency of 3.2 ms and an average seek time of 4.7 ms, we would take 1 / (.0032 + .0047), which gives us 126.58, or 127 IOPS rounded to the nearest integer.

When drives are combined into a RAID array, the RAID technology will utilize a combination of these IOPS. For example, a RAID 5 array of six drives, each with 127 IOPS, would provide 635 read IOPS. The equivalent of one drive is consumed by parity (distributed across the array in RAID 5), leaving five drives' worth of striping, so five times 127 produces the 635 read IOPS. There is a difference between read and write IOPS, as will be explained next.
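The arithmetic above can be sketched in a few lines of Python. The helper names are hypothetical, and the RAID calculation covers only the simple read case described here:

```python
def drive_iops(avg_latency_ms, avg_seek_ms):
    """IOPS = 1 / (average latency + average seek time),
    with the times converted from milliseconds to seconds."""
    return 1 / ((avg_latency_ms + avg_seek_ms) / 1000)

def raid_read_iops(per_drive_iops, striped_drives):
    """Read IOPS scale with the drives available for striping.
    (Write IOPS are lower because parity updates add extra I/O.)"""
    return per_drive_iops * striped_drives

# The SATA drive from the example: 3.2 ms latency, 4.7 ms seek time
sata = drive_iops(3.2, 4.7)
print(round(sata))                     # 127 IOPS
print(raid_read_iops(round(sata), 5))  # 635 read IOPS for the RAID 5 array
```

Running this reproduces the worked example: roughly 127 IOPS per drive and 635 read IOPS for the six-drive RAID 5 array with five drives' worth of striping.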

Read/Write Throughput

The RAID types chosen, as well as the caching settings, determine how many IOPS a logical volume will produce. Read and write throughput are also expressed as either sequential reads or writes or random reads or writes. Sequential reads occur when data is read from contiguous portions of the disk, while random reads occur when data is read from various locations on the disk. It is more efficient to pull data from contiguous portions of the disk because the drives do not need to spend as much time seeking the data.

Caching can also have an impact on read and write IOPS. Caching settings can be optimized for reading or writing or a little of both. A cache can hold files that were recently requested in case they are requested again, improving read speeds when those files are fetched from cache instead of disk. Similarly, files can be placed in cache and then written to the disk when it is most efficient to store the data so that the application does not need to wait for the data to actually be written to the disk if it exists in the write cache. The four read/write throughput values are thus as follows:

Images   Random read IOPS   The average number of read I/O operations that can be performed per second when the data is scattered around the disk

Images   Random write IOPS   The average number of write I/O operations that can be performed per second when the data must be written to scattered portions of the disk

Images   Sequential read IOPS   The average number of read I/O operations that can be performed per second when the data is located in contiguous sections of the disk

Images   Sequential write IOPS   The average number of write I/O operations that can be performed per second when the data must be written to contiguous sections of the disk

Storage Tiers

Storage tiering is an essential part of storage optimization. Not all data will be requested all the time, and so it does not have to be treated the same way. Tiering combines multiple classes of storage into a single storage pool to intelligently satisfy storage demands. Higher-speed storage is used for the data that is most often needed or that the system predicts will be needed, while data that is requested less often is moved down to lower tiers.

For example, the highest-speed storage tier could be made up of 5TB of high-speed SLC SSD storage, while the second tier would be 10TB of lower-speed MLC SSD storage, the third tier 20TB of TLC SSD storage, and the fourth tier 50TB of 15k SAS storage, drives that spin at 15,000 rotations per minute. This is shown in Figure 3-3. The application would see an 85TB pool of storage available to it that is made up of these different types of storage. The storage system intelligently moves the data around on the different tiers so that data is most often served from the highest speed storage.
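The behavior described above can be approximated with a toy simulation. This sketch is a deliberate simplification (real arrays move data in blocks or extents, not whole files, and use more sophisticated heuristics), and all names are invented for illustration:

```python
class TieredPool:
    """Toy tiered storage pool: the most frequently accessed items
    are placed on the fastest tier."""

    def __init__(self, tier_capacities):
        # tier 0 is fastest; capacities counted in items for simplicity
        self.tier_capacities = tier_capacities
        self.access_counts = {}

    def access(self, item):
        """Record an access; real systems track I/O heat per extent."""
        self.access_counts[item] = self.access_counts.get(item, 0) + 1

    def placement(self):
        """Assign the hottest data to the highest tiers."""
        hot_first = sorted(self.access_counts,
                           key=self.access_counts.get, reverse=True)
        tiers, start = [], 0
        for cap in self.tier_capacities:
            tiers.append(hot_first[start:start + cap])
            start += cap
        return tiers

pool = TieredPool([1, 2, 4])  # e.g., SLC, MLC, and TLC tiers
for item, hits in [("hot.db", 50), ("warm.log", 10), ("cold.bak", 1)]:
    for _ in range(hits):
        pool.access(item)
tiers = pool.placement()
assert tiers[0] == ["hot.db"]  # hottest data lands on the fastest tier
```

The application sees one pool; the placement logic quietly keeps the frequently requested data on the fastest storage, which is the essence of tiering.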

FIGURE 3-3   Tiered storage pool

Images

Images

Solid state storage was discussed in Chapter 2. There are three common types of SSDs in use today. Single-level cell (SLC) is the fastest but can store only one binary value in a cell, making it the SSD storage type with the smallest capacity. Looking at it another way, SLC has the highest cost per gigabyte. Multi-level cell (MLC) can store two binary values in each cell but is slower than SLC. Triple-level cell (TLC) is the slowest of the three, but it has the highest capacity because it can store three binary values per cell. This makes TLC the lowest cost per gigabyte among SSDs.

Logical Unit Numbers (LUNs)

Logical unit numbers, or LUNs (introduced earlier), have been around for a long time and were originally used to identify SCSI devices as part of a DAS solution for higher-end servers. Devices along the SCSI bus were assigned a number from 0 to 7, and SCSI 2 utilized 0 to 15, which designated the unique address for the computer to find that device. In storage networking, LUNs operate as unique identifiers, but now they are much more likely to represent a virtual hard disk from a block of allocated storage within a NAS device or a SAN. Devices that request I/O operations are called initiators, and the devices that perform the operations requested by the initiators are called targets. Each target can hold up to eight other devices, and each of those devices is assigned a LUN.

Network Shares

Network shares are storage resources that are made available across the network and appear as if they are resources on the local machine. Traditionally, network shares are implemented using the Server Message Block (SMB) protocol when using Microsoft products and the network file system (NFS) protocol in Linux. It is also possible to share the same folder over NFS and SMB so that both Linux and Windows clients can access it. Access to these shares happens within an addressable file system as opposed to using block storage.

Zoning and LUN Masking

SANs are designed with high availability and performance in mind. In order to provide the flexibility that system administrators demand for designing solutions that utilize those capabilities, servers need to be able to mount and access any drive on the SAN. This flexible access can create several problems, including disk resource contention and data corruption. To mitigate these problems, storage devices can be isolated and protected on a SAN by utilizing zoning and LUN masking, which allow for dedicating storage on the SAN to individual servers.

Zoning controls access from one node to another. It enables the isolation of a single server to a group of storage devices or a single storage device, or it associates a set of multiple servers with one or more storage devices. Zoning is implemented at the hardware level on Fibre Channel switches and is configured either with what is referred to as "hard zoning," on a port basis, or with "soft zoning," using a WWN. In Figure 3-4, the Fibre Channel switch controls how the red server and the blue server connect to storage controllers 0–3. The blue server is granted access to the LUNs on controllers 0 and 3, while the red server is granted access to all LUNs on all storage controllers.

FIGURE 3-4   Zoning using a Fibre Channel switch

Images

LUN masking is executed at the storage controller level instead of at the switch level. By providing LUN-level access control at the storage controller, the controller itself enforces access policies to the devices. LUN masking provides more granular security than zoning because it restricts access to individual LUNs rather than to entire ports. In Figure 3-5, LUN masking is demonstrated as the blue server is granted access from the storage controller to LUNs 0 and 3, while the red server is granted access to all LUNs.

FIGURE 3-5   LUN masking using the storage controller

Images
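The two access-control layers described above can be modeled with a small sketch. This is purely illustrative: the class names, WWN strings, and method names are hypothetical, not vendor commands or a real fabric API. It shows how zoning filters at the controller port, while LUN masking then filters individual LUNs behind a port the zone already permits, mirroring the access granted in Figure 3-5.

```python
# Hypothetical sketch: zoning (switch level) and LUN masking
# (controller level) as two stacked layers of access control.
# All names are illustrative, not vendor CLI commands.

class FabricZone:
    """Soft zoning on a Fibre Channel switch: maps server WWNs to
    the storage controller ports they may reach."""
    def __init__(self):
        self.zones = {}  # server WWN -> set of reachable controller ports

    def add_zone(self, wwn, ports):
        self.zones.setdefault(wwn, set()).update(ports)

    def can_reach(self, wwn, port):
        return port in self.zones.get(wwn, set())


class StorageController:
    """LUN masking: finer-grained control over individual LUNs
    behind a controller port that the zone already permits."""
    def __init__(self, port, luns):
        self.port = port
        self.luns = set(luns)
        self.masks = {}  # server WWN -> set of visible LUN ids

    def unmask(self, wwn, lun_ids):
        self.masks.setdefault(wwn, set()).update(self.luns & set(lun_ids))

    def visible_luns(self, fabric, wwn):
        # Both layers must allow access: the zone first, then the mask.
        if not fabric.can_reach(wwn, self.port):
            return set()
        return self.masks.get(wwn, set())


# Mirror Figure 3-5: blue server sees LUNs 0 and 3; red sees all four.
fabric = FabricZone()
fabric.add_zone("blue-wwn", {"ctrl0"})
fabric.add_zone("red-wwn", {"ctrl0"})

ctrl = StorageController("ctrl0", luns=[0, 1, 2, 3])
ctrl.unmask("blue-wwn", [0, 3])
ctrl.unmask("red-wwn", [0, 1, 2, 3])

print(ctrl.visible_luns(fabric, "blue-wwn"))  # {0, 3}
print(ctrl.visible_luns(fabric, "red-wwn"))   # {0, 1, 2, 3}
```

A server with no zone entry gets an empty set before masking is even consulted, which is the layering the text describes: the switch gates the port, and the controller gates the LUN.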

Multipathing

Whereas zoning and LUN masking are configuration options that limit access to storage resources, multipathing is a way of making data more available or fault tolerant to the computers that need to access it. Multipathing does exactly what its name suggests, in that it creates multiple paths for the machine to reach the storage resources it is attempting to contact.

The redundant paths in multipathing are created by a combination of hardware and software resources. The hardware resources are multiple NICs, CNAs, or HBAs deployed to a single computer. These multiple adapters allow the software to run in multipath mode, routing traffic over any available adapter so that the failure of a single adapter does not interrupt access to storage.

Setting up multipathing on the computer, however, is not enough to ensure high availability of the applications designed to run on it. The entire network infrastructure that the data traffic travels upon should be redundant so that a failure of any one component will not interrupt the storage data traffic. This means that in order to implement an effective multipath solution, redundant cabling, switches, routers, and ports on the storage devices must be considered as well. Enabling this kind of availability may be necessary to meet the business requirements of the applications being hosted, but such a configuration can be very expensive.
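The failover behavior just described can be sketched in a few lines. This is a minimal model, not a real multipath driver (such as Linux dm-multipath); the path names and classes are illustrative assumptions. It shows I/O continuing over a second path when the first fails.

```python
# Hypothetical sketch of multipath failover logic: I/O is issued over
# any healthy path to the storage target, so losing one HBA, cable, or
# switch does not interrupt access. All names are illustrative.

class Path:
    def __init__(self, name):
        self.name = name
        self.healthy = True


class MultipathDevice:
    def __init__(self, paths):
        self.paths = paths

    def send_io(self, data):
        # Pick the first healthy path; a real driver would also
        # load-balance (e.g., round robin) across healthy paths.
        for path in self.paths:
            if path.healthy:
                return f"sent {len(data)} bytes via {path.name}"
        raise IOError("all paths down: storage unreachable")


hba1 = Path("hba1->switchA->ctrl0")
hba2 = Path("hba2->switchB->ctrl0")
dev = MultipathDevice([hba1, hba2])

print(dev.send_io(b"block"))   # travels over hba1
hba1.healthy = False           # simulate an HBA or switch failure
print(dev.send_io(b"block"))   # fails over to hba2 transparently
```

Note that the sketch only fails over when every component along a path is duplicated; this is exactly why the next paragraph stresses redundant cabling, switches, and storage ports.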

Provisioning Model

Cloud storage administrators will need to determine how to best provision storage depending on how the storage will be utilized and the available storage at hand. Some storage needs increase slowly, while others increase quickly. There are two options for provisioning storage. One is known as thick provisioning and the other as thin provisioning. Each has its own set of benefits and drawbacks.

Virtual hard disks on hypervisors can be provisioned as a thick disk or a thin disk. The size of a thick disk (termed a fixed disk in Microsoft Hyper-V) is specified and allocated during the creation of the virtual disk. A thin disk (termed a dynamically expanding disk in Microsoft Hyper-V) starts out small and adds space as required by the virtual machine.

While the different virtualization manufacturers use different terms to define their virtual disks, the concepts are similar. Whether you are using Hyper-V, VMware ESXi, or XenServer, you still need to decide which type of disk to use for which application. If you are concerned about disk space, then using a thin disk or dynamically expanding disk would be the best option. If size is not a concern, then you could use a fixed-size or thick disk.

Now that you understand the basics, let’s look at thin and thick provisioning in more detail.

Thick Provisioning

Thick provisioning allocates the entire size of the logical drive upon creation. This means that the virtual disk is guaranteed the full amount of disk space specified during the creation of that virtual disk. Thick provisioning ensures that the space cannot be claimed by some other application and keeps the provisioned storage in contiguous space on the disk. Thick provisioning provides better performance because the drive size is not being built as the application requires more drive space. Thick provisioning is best suited for volumes that are expected to grow rapidly or for those that require dedicated performance.

For example, a thick-provisioned volume of 400GB will consume 400GB of space on the storage system. This storage will be allocated entirely upon creation and made available to the system.

Thin Provisioning

Thin provisioning allocates only the space that is actually consumed by the volume. For example, a 400GB thin-provisioned volume will start off consuming zero bytes of storage. As data is written to the volume, the storage system will continue to allocate more storage out of a storage pool until the volume reaches its max of 400GB. This results in storage space that is allocated from wherever there is free space on the drive at the time it is needed, so not all space assigned to the thin-provisioned volume will be in contiguous space.

Thin provisioning does not have the same performance level as a thick disk and needs to be monitored closely to prevent running out of available disk space since storage space is by definition overcommitted.

When comparing thin and thick provisioning and considering which one works best in the organization’s environment, it is important to keep a few things in mind. First, determine the performance requirements for the system, including the amount of data reads and writes you expect the system to perform. Each time new data is added to a thin-provisioned disk, space from the pool on which the thin-provisioned disk resides is allocated to the disk. This can lead to extensive fragmentation of the thin-provisioned volume if it grows frequently and rapidly. For example, an application that writes a lot of data to the drive, such as a database application, would not perform as well on a thin-provisioned disk. On the other hand, if space is a concern and a web server is not writing to the virtual disk that often, a thin-provisioned disk would be more appropriate.
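The allocation difference between the two models can be made concrete with a small sketch. This is not any vendor's allocator; the classes and sizes below are illustrative assumptions. It shows a thick volume claiming its full size from the pool at creation, while a thin volume of the same declared size draws from the pool only as data is written.

```python
# A minimal sketch (not a real storage system) contrasting the two
# provisioning models: thick reserves the full size up front, thin
# draws from the shared pool only as data is written.

class StoragePool:
    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.allocated_gb = 0

    def reserve(self, gb):
        if self.allocated_gb + gb > self.capacity_gb:
            raise RuntimeError("pool exhausted")
        self.allocated_gb += gb


class Volume:
    def __init__(self, pool, size_gb, thick=False):
        self.pool, self.size_gb, self.thick = pool, size_gb, thick
        self.used_gb = 0
        if thick:
            pool.reserve(size_gb)  # entire size claimed at creation

    def write(self, gb):
        if self.used_gb + gb > self.size_gb:
            raise RuntimeError("volume full")
        if not self.thick:
            self.pool.reserve(gb)  # space allocated only on write
        self.used_gb += gb


pool = StoragePool(capacity_gb=1000)
thick = Volume(pool, 400, thick=True)   # pool now shows 400 GB allocated
thin = Volume(pool, 400)                # pool still shows 400 GB
thin.write(50)                          # pool now shows 450 GB
print(pool.allocated_gb)  # 450
```

The on-demand `reserve` call in the thin path is also where the fragmentation and performance cost discussed above comes from: each growth step lands wherever the pool has free space at that moment.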

Images

The application workload is often the determining factor in choosing the type of virtual disk.

Second, determine how often the data will grow. Excessive growth of thin-provisioned disks can fill up the storage pool on which the disks reside if overprovisioning, discussed next, is not properly controlled.

Storage Overprovisioning

Storage overprovisioning, also known as overcommitting or oversubscribing, is the process of creating multiple volumes using thin provisioning with a total maximum size that exceeds available storage. Overprovisioning is often done because some volumes will never utilize the maximum available, yet applications perform better when there is some space available for temporary data. However, storage administrators must monitor overprovisioned storage closely to ensure that it does not fill up and cause downtime to the systems that are provisioned from it.
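The monitoring obligation described above can be expressed as two simple calculations. The function names and the 80 percent alert threshold are illustrative assumptions, not a standard; real storage systems expose their own alerting mechanisms.

```python
# Hedged sketch: computing an overcommit ratio and a utilization alarm
# for a pool backing thin-provisioned volumes. Thresholds are illustrative.

def overcommit_ratio(pool_capacity_gb, provisioned_sizes_gb):
    """Total provisioned size divided by physical capacity; a value
    greater than 1.0 means the pool is oversubscribed."""
    return sum(provisioned_sizes_gb) / pool_capacity_gb


def needs_attention(pool_capacity_gb, used_gb, threshold=0.8):
    """Alert well before the physical pool fills up and causes
    downtime for every system provisioned from it."""
    return used_gb / pool_capacity_gb >= threshold


# Three 400 GB thin volumes on a 1000 GB pool: 1.2x oversubscribed.
print(overcommit_ratio(1000, [400, 400, 400]))  # 1.2
print(needs_attention(1000, 850))               # True
```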

Each of the major virtualization manufacturers have different terms when describing virtual disk configurations. For example, if you are using Microsoft Hyper-V, you would have the options of making a dynamically expanding virtual disk, a fixed virtual disk, or a differencing virtual disk. If you are creating a fixed-size disk, you would specify the size of the disk when it is created. If you are creating a dynamically expanding virtual disk, the disk starts at a small size and adds storage as needed.

Encryption Requirements

Disk encryption is quickly becoming a minimum requirement for regulated industries and for protecting the data of cloud consumers. Some customers require their volumes to be encrypted so that other tenants or the cloud provider cannot read their data.

Disk encryption takes an entire drive and converts it to a form that is unreadable unless the decryption key is provided. Disk encryption can be performed on local drives or removable media. The process is mostly transparent to the user. Users provide their decryption key when they log onto the computer and from that point on, files are encrypted when stored and decrypted when opened without additional interaction. Disk encryption is also referred to as full disk encryption (FDE). Some software-based disk encryption methods encrypt all contents but not the master boot record (MBR), while hardware disk encryption methods are able to encrypt both the contents and the MBR. Hardware disk encryption does not store the decryption key in memory. Drives encrypted with hardware disk encryption are also known as self-encrypting drives (SEDs). Many disk encryption systems support trusted platform module (TPM), a processor on the system mainboard that can authenticate the encrypted hard drive to the system to prevent an encrypted drive from being used on another system.

Some limitations of disk encryption include the fact that once a user is logged in, the entire disk is available to them. Malicious code or a lost password could allow access to the entire drive even if it is encrypted. Additionally, some disk encryption systems, including those with TPM, have been circumvented by stealing the keys that remain in memory shortly after an abrupt power-off, before the memory contents fully degrade; this is known as a cold boot attack. Still, disk encryption is overall an effective way to prevent unauthorized access to data stored on local drives and removable disks.

Tokenization

Tokenization can be used to separate sensitive data from storage media that does not have a high enough security classification. Tokens are identifiers that can be mapped to sensitive data. The token is just an identifier and cannot be used to create the data without interfacing with the tokenization system.

A system storing data on a cloud might store public data and then store a token in place of each sensitive data element, such as personally identifiable information (PII) or protected health information (PHI). The PII or PHI would be stored in the tokenization system, and the public cloud storage would retain the token for that information. When retrieving the data, the system would retrieve public data directly from the cloud storage, but would need to query the tokenization system to pull out the sensitive data, a process known as de-tokenization.
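The tokenize/de-tokenize round trip described above can be sketched with a simple in-memory vault. This is an illustration of the concept only, not a production tokenization design (which would involve a hardened, access-controlled service); the class and token format are assumptions.

```python
# Illustrative tokenization vault (not a production design): sensitive
# values live only inside the vault, while other systems, such as
# public cloud storage, hold only opaque tokens.
import secrets

class TokenVault:
    def __init__(self):
        self._vault = {}  # token -> sensitive value

    def tokenize(self, sensitive_value):
        # The token is a random identifier; it carries no information
        # about the data and cannot be reversed without the vault.
        token = "tok_" + secrets.token_hex(8)
        self._vault[token] = sensitive_value
        return token

    def detokenize(self, token):
        # Only the tokenization system can map a token back.
        return self._vault[token]


vault = TokenVault()
token = vault.tokenize("123-45-6789")        # e.g., a Social Security number (PII)
record = {"name": "public data", "ssn": token}  # safe to store in the cloud
print(vault.detokenize(record["ssn"]))       # 123-45-6789
```

The cloud-stored `record` never contains the PII itself, which is the separation of sensitive data from lower-classification storage that the section describes.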

CERTIFICATION OBJECTIVE 3.04

Storage Protection

Storage protection guards against data loss, corruption, or unavailability. Users expect their data to be present when they request it, and loss of data is almost always considered unacceptable to cloud consumers. Storage protection must guard against equipment failures, site failures, user error, data corruption, malware, and other threats that could damage data integrity or availability.

High Availability

High availability (HA) refers to systems that are available almost 100 percent of the time. These systems are usually measured in terms of how many “nines” of availability they offer. For example, a system that offers 99.999 percent availability is offering five nines of availability. This equates to roughly 5.26 minutes of downtime in a year.
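The downtime figure for any number of nines follows directly from the availability percentage. A quick check of the math, assuming a 365.25-day year (the function name is ours):

```python
# Downtime per year implied by a given number of "nines" of
# availability, assuming a 365.25-day year.

def downtime_minutes_per_year(nines):
    unavailability = 10 ** -nines        # e.g., five nines -> 0.00001
    return 365.25 * 24 * 60 * unavailability

for n in (3, 4, 5):
    print(f"{n} nines: {downtime_minutes_per_year(n):.2f} min/year")
# Five nines works out to about 5.26 minutes of downtime per year.
```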

HA systems achieve such availability through redundancy of components and sites. HA systems might also replicate data to multiple sites, co-locations (COLOs), or cloud services to protect against site failure or unavailability. Storage replication is discussed after failover zones.

Failover Zones

HA systems utilize clusters to divide operations across several systems. Some systems are active-active, where all systems can service application requests, whereas others are active-passive, where one or more systems services requests while one or more remain in a standby state until needed. Active-active systems must retain enough available resources to handle the remaining load if a system in the cluster becomes unavailable. This is known as N+1 redundancy because they can suffer the loss of one system.

HA systems require regular maintenance and yet, in the five-nines example, roughly 5.26 minutes of downtime per year is hardly enough time to perform regular maintenance. HA systems accomplish this by performing upgrades to redundant equipment independently. In a cluster, the services on one cluster node are failed over to other cluster nodes. That node is upgraded, and then services are failed back to it. Maintenance or upgrades continue on the other nodes in the same fashion until all are upgraded. Throughout the process, the user does not experience any downtime. Clustering typically requires some level of shared storage where each node can access the same storage. When shared storage is not available, systems will use some form of replication to keep each system consistent with other systems. For example, when failover is performed across sites, replication is usually required in order to keep both sites consistent.

Storage Replication

Storage replication transfers data between two systems so that any changes to the data are made on each node in the replica set. A replica set consists of the systems that will all retain the same data. Multiple sites are used to protect data when a single site is unavailable and also to ensure low-latency availability by serving data from sources that are close to the end user or application.

Replication can be implemented as regional replication with redundant locations chosen in such a way that a disaster impacting one site would not impact the redundant site. Multiregional replication expands replication to many different sites in multiple regions.

Replication is performed synchronously or asynchronously. In brief, synchronous replication does not acknowledge a write to the application until every replica has stored it, while asynchronous replication acknowledges the write locally and updates replication partners at its next opportunity. The tradeoffs between the two approaches are examined in detail later in this section.

Regional Replication

Regional replication uses replication to store data at a primary site and a secondary site. In regional replication, the secondary site is located in a different region from the primary site so that conditions impacting the primary site are less likely to impact the secondary site. Site unavailability is usually the result of a natural disaster such as a flood, fire, tornado, or hurricane. Many data centers are placed in regions where natural disasters are less common. For example, you will not find many data centers on the Florida coast. Not only is this land very expensive, but it is also prone to floods and hurricanes, which could render the site unavailable. Redundant sites are usually chosen in different regions that are far enough apart from one another that a single disaster will not impact both sites.

When implementing sites in different regions, also consider the power distribution method. Choose regions that are serviced by different power suppliers so that a disruption in the power network will not impact both sites.

Data will need to be replicated to these regions. This requires a connection between the data centers. This can be a leased line such as an MPLS network or dark fiber (fiber optic cable leased from a carrier but lit and operated by the customer rather than the telco), or it could be a VPN tunnel over a high-speed Internet link. Ensure that the link between the sites will support the amount of replication data plus some overhead and room for spikes. Some Internet service providers will allow for a consistent data rate with bursting for the occasional large transfer. Bursting allows the connection to exceed the normal data transmission limits, but it comes at a charge from the ISP.
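The link-sizing rule of thumb above (replication data plus overhead plus room for spikes) can be written as a back-of-the-envelope calculation. The function name and the 10 percent overhead and 25 percent headroom figures are illustrative assumptions, not standards; substitute values measured in your own environment.

```python
# Back-of-the-envelope sizing for an inter-site replication link:
# steady change rate, plus protocol overhead, plus burst headroom.
# The percentages below are illustrative assumptions only.

def required_link_mbps(change_rate_mbps, overhead=0.10, spike_headroom=0.25):
    """Provision for the steady replication stream, protocol
    overhead, and room for occasional spikes."""
    return change_rate_mbps * (1 + overhead + spike_headroom)

# A site generating ~200 Mbps of changed data needs roughly 270 Mbps.
print(required_link_mbps(200))
```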

Multiregional Replication

Multiregional replication replicates data between many different sites in multiple regions. Replication schemes should be planned so that the entire replica set can be consistent with a minimum of effort and yet still provide redundancy in case of site link failures.

Each site typically has one or more replication partners, but they will not replicate with all sites. This is to save on bandwidth costs and latency since longer distance links will incur additional latency and cost more to operate. A hub-and-spoke model is often utilized with redundant links added in to protect against site link failure. This is depicted in Figure 3-6.

FIGURE 3-6   Multiregional replication

Images

Synchronous and Asynchronous Replication

There are two forms of replication that can be used to keep replica sets consistent, synchronous and asynchronous. Synchronous replication writes data to the local store and then immediately replicates it to the replica set or sets. The application is not informed that the data has been written until all replica sets have acknowledged receipt and storage of the data. Asynchronous replication stores the data locally and then reports back to the application that the data has been stored. It then sends the data to replication partners at its next opportunity.

Synchronous replication requires high-speed, low-latency links in between sites in order to ensure adequate application performance. Synchronous replication ensures greater consistency between replication partners than asynchronous replication.

Asynchronous replication can tolerate fluctuations that are more significant in latency and bandwidth, but not all members of the replica set may be fully consistent in a timely manner if latency is high or bandwidth is low. This can lead to issues with multiple concurrent access from different sites that are dependent upon transactions being current. Figure 3-7 shows the multiregional replication scheme with a combination of asynchronous and synchronous replication. The sites that are farther away are using asynchronous replication, while the closer sites with lower latency are using synchronous replication.

FIGURE 3-7   Synchronous and asynchronous multiregional replication

Images
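The acknowledgment semantics that distinguish the two forms can be sketched in a simplified model. This is not a real storage stack; the classes and function names are illustrative. The key difference is where the "ack" to the application happens relative to the remote writes.

```python
# Simplified model (not a real storage stack) of the acknowledgment
# semantics: synchronous writes wait for every replica before
# acknowledging; asynchronous writes acknowledge locally and
# replicate at the next opportunity.

class Replica:
    def __init__(self):
        self.data = []

    def store(self, item):
        self.data.append(item)


def write_sync(local, replicas, item):
    local.store(item)
    for r in replicas:          # the application is not acknowledged
        r.store(item)           # until every replica has the data
    return "ack"


def write_async(local, replicas, item, pending):
    local.store(item)
    pending.append((replicas, item))  # replicate later
    return "ack"                      # acknowledged immediately


def drain(pending):
    """Deliver queued asynchronous writes to their replicas."""
    while pending:
        replicas, item = pending.pop(0)
        for r in replicas:
            r.store(item)


local, remote = Replica(), Replica()
pending = []
write_async(local, [remote], "row1", pending)
print(remote.data)  # [] - the remote replica is not yet consistent
drain(pending)
print(remote.data)  # ['row1'] - consistent after the queue drains
```

The window in which `remote.data` is empty is exactly the consistency lag that makes asynchronous replication tolerant of slow links but risky for transaction-dependent concurrent access across sites.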

CERTIFICATION SUMMARY

Storage networking is an essential component of the CompTIA Cloud+ exam and it is the foundation of a successful cloud infrastructure. This chapter discussed storage types and technologies, how to connect storage to devices, how to provision storage and make it available to devices, and how to protect storage availability through replication and redundancy.

The chapter began with a discussion on storage types and technologies. Understanding when to use the different storage types is important for optimizing a cloud deployment. These include direct attached storage (DAS), consisting of one or more drives that are connected to a single machine to provide block-level storage; SAN storage that is made available to one or more machines at the block level; NAS shares that make data available to multiple machines at the file level; and object storage, a system that stores and retrieves data based on its metadata, not on its location within a hierarchy.

In addition to the four storage types, this section also covers two storage technologies. These technologies are deduplication and compression and they are designed to improve storage efficiency. Deduplication improves storage efficiency by removing unnecessarily redundant data while compression improves efficiency by decreasing the amount of storage required to store the data. Lossy compression results in some reduction in data quality, while lossless does not change the data when it is decompressed.

Storage needs to be connected to devices for it to be useful. The second section of this chapter provided details on storage connectivity. Connecting to storage can be simple, as in the case of DAS since it is connected to only one machine, but NAS and a SAN can involve complex networking to ensure adequate storage performance and reliability needed in today’s cloud environments. This includes how devices are connected to storage networks or how NAS is connected to traditional networks as well as the benefits of each connection type. Connection types include FC, FCP, FCoE, Ethernet, IP, iFCP, and iSCSI.

The next section covered how storage is provisioned. The first step is to create storage that meets the performance requirements of the applications that will use it. SAN storage may be created from many disks and the portions that are carved out from those disks are called LUNs. Next, storage is made available only to the devices that need it through the use of zoning and LUN masking. There are some options when provisioning storage on how much space is allocated when new storage is created. Thin and thick provisioning offer two different methods to provision storage. Thick provisioning consumes all the allocated storage immediately, while thin provisioning allocates only what is actually used. Thin provisioning can help companies maximize capacity and utilization, but it can impact performance. Thick provisioning results in underutilized resources in order to offer more reliable performance.

The chapter closed with a discussion on some methods used to protect storage against data loss, corruption, or unavailability. The concept of high availability (HA) was presented first. HA systems are systems that are available almost 100 percent of the time. Next, storage replication was discussed. Storage replication transfers data between two systems so that any changes to the data are made on each node in the replica set. A replica set consists of the systems that will all retain the same data. Multiple sites are used to protect data when a single site is unavailable and also to ensure low-latency availability by serving data from sources that are close to the end user or application.

KEY TERMS

Use the following list to review the key terms discussed in this chapter. The definitions also can be found in the glossary.

asynchronous replication   The process of copying data between replica sets where applications are notified of successful writes when the data has been written to the local replica set. Other replicas are made consistent at the earliest convenience.

co-location (COLO)   A facility owned and operated by a third party that houses technology assets such as servers, storage, backup systems, and networking equipment.

converged network adapter (CNA)   A computer expansion card that can be used as a host bus adapter or a network interface card.

direct attached storage (DAS)   Storage system that is directly attached to a server or workstation and cannot be used as shared storage at the block level because it is directly connected to a single machine.

failover   Switching from one service node to another without an interruption in service.

Fibre Channel (FC)   Technology used to transmit data between computers at data rates of up to 128 Gbps.

Fibre Channel over Ethernet (FCoE)   Enables the transport of Fibre Channel traffic over Ethernet networks by encapsulating Fibre Channel frames over Ethernet networks.

Fibre Channel Protocol (FCP)   Transport protocol that transports SCSI commands over a Fibre Channel network.

host bus adapter (HBA)   A network card that allows a device to communicate directly with a storage area network (SAN) or a SAN switch.

initiator qualified name (IQN)   The iSCSI address given to an initiator. Initiators reside on clients in an iSCSI network and initiators connect to targets such as storage resources over the iSCSI network. An IQN uses the following naming convention: iqn.yyyy-mm.naming-authority:unique name

input/output operations per second (IOPS)   A measurement of the number of read and write operations a storage system can perform in one second.

Internet Fibre Channel Protocol (iFCP)   A communication protocol that enables the transport of Fibre Channel traffic over IP networks by translating FC addresses to IP addresses and FC frames to IP packets.

Internet Small Computer System Interface (iSCSI)   The communication protocol that leverages standard IP packets to transmit typical SCSI commands across an IP network; it then translates them back to standard SCSI commands, which enables servers to access remote disks as if they were locally attached.

logical unit number (LUN)   Unique identifier used to identify a logical unit or collection of hard disks in a storage device.

LUN masking   Makes a LUN available to some hosts and unavailable to others.

multipathing   Creates multiple paths for a computer to reach a storage resource.

network attached storage (NAS)   Provides file-level data storage to a network over TCP/IP.

network shares   Storage resources that are made available across a network and appear as if they are a resource on the local machine.

overprovisioning   The process of creating multiple volumes using thin provisioning with a total maximum size that exceeds available storage.

Server Message Block (SMB)   Network protocol used to provide shared access to files and printers.

Session Control Protocol (SCP)   A protocol that manages multiple connections over TCP. SCP operates at layer 4 of the OSI model.

storage area network (SAN)   Storage device that resides on its own network and provides block-level access to computers that are attached to it.

synchronous replication   The process of copying data between replica sets where applications are notified of successful writes only when the data has been written to all synchronous replica sets.

thick provisioning   Allocates the entire size of the logical drive upon creation.

thin provisioning   Allocates only the space that is actually consumed by the volume.

tokenization   Replaces sensitive data with identifiers called tokens. De-tokenization returns the value associated with the token ID.

Transmission Control Protocol (TCP)   A protocol that provides reliable transport of network data through error checking. TCP uses ports that are associated with certain services and other ports that can be dynamically allocated to running processes and services. TCP is most often combined with IP.

trusted platform module (TPM)   A microprocessor that is dedicated to performing cryptographic functions. TPM are integrated into supporting systems and include features such as generation of cryptographic keys, random number generation, encryption, and decryption.

universal target adapter (UTA)   A proprietary network adapter from NetApp that is extremely versatile due to its use of transceivers. UTA has ports for one or more Ethernet or Fibre transceivers and can support Ethernet transceivers up to 10 Gbps and Fibre transceivers at native Fibre Channel speeds.

virtual SAN (VSAN)   Consolidating separate physical SAN fabrics into a single larger fabric, allowing for easier management while maintaining security.

world wide name (WWN)   Unique identifier used in storage technologies similar to Ethernet MAC addresses on a network card.

world wide node name (WWNN)   A unique identifier for a device on a Fibre Channel network.

world wide port name (WWPN)   A unique identifier for a port on a Fibre Channel network. A single device with a WWNN will have multiple WWPN if it has multiple Fibre Channel adapters or adapters with multiple ports.

world wide unique identifier (WWUI)   An address that is not used by other entities on a network and can represent only one entity.

zoning   Controls access from one node to another in a storage network and enables isolation of a single server to a group of storage devices or a single storage device.

Images TWO-MINUTE DRILL

Storage Types and Technologies

Images  A direct attached storage (DAS) system is a storage system that is directly attached to a server or workstation and does not have a storage network between the two devices.

Images  A storage area network (SAN) is a storage device that resides on its own network and provides block-level access to computers that are attached to the SAN.

Images  Network attached storage (NAS) is a file-level data storage device that is connected to a computer network and provides data access to a group of clients.

Images  Object storage is a storage system that abstracts the location and replication of data, allowing the application to become more simplified and efficient. Object storage stores and retrieves data based on its metadata.

Images  Deduplication and compression storage technologies improve storage efficiencies. Deduplication does this by removing unnecessarily redundant data, while compression improves efficiency by decreasing the amount of storage required to store the data.

Storage Access Protocols

Images  Fibre Channel (FC) can be used to connect servers to shared storage devices at speeds of up to 128 Gbps.

Images  Fibre Channel frames can be encapsulated over Ethernet networks by utilizing Fibre Channel over Ethernet (FCoE) or over IP using iFCP.

Images  Ethernet is an established standard for connecting computers to a LAN. It is relatively inexpensive and can provide data speeds ranging from 10 Mbps to 10 Gbps.

Images  Internet Small Computer System Interface (iSCSI) utilizes standard IP packets to transmit SCSI commands across IP networks and enables servers to access remote disks as if they were locally attached.

Storage Provisioning

Images  A logical unit number (LUN) is a unique identifier assigned to an individual hard disk device or collection of devices (a “logical unit”) as addressed by the SCSI, iSCSI, or FC protocol.

Images  A LUN identifies a specific logical unit, which can be a portion of a hard disk drive, an entire hard disk, or several hard disks in a storage device like a SAN.

Images  A network share provides storage resources that are accessible over the network.

Images  Multipathing creates multiple paths for a computer to reach storage resources, providing a level of redundancy for accessing a storage device.

Storage Protection

Images  HA systems are systems that are available almost 100 percent of the time.

Images  HA systems utilize clusters to divide operations across several systems.

Images  Storage replication transfers data between two systems so that any changes to the data are made on each node in the replica set.

Images  Replication can be implemented as regional replication with redundant locations chosen in such a way that a disaster impacting one site would not impact the redundant site. Multiregional replication expands replication to many different sites in multiple regions.

Images  Replication is performed synchronously or asynchronously. Synchronous replication writes data to the local store and then immediately replicates it to the replica set or sets. Conversely, asynchronous replication stores the data locally and then reports back to the application that the data has been stored. It then sends the data to replication partners at its next opportunity.

Images SELF TEST

The following questions will help you measure your understanding of the material presented in this chapter. As indicated, some questions may have more than one correct answer, so be sure to read all the answer choices carefully.

Storage Types and Technologies

1.   Which type of storage system is directly attached to a computer and does not use a storage network between the computer and the storage system?

A.   NAS

B.   SAN

C.   DAS

D.   Network share

2.   Which of the following characteristics describe a network attached storage (NAS) deployment?

A.   Requires expensive equipment to support

B.   Requires specialized skill sets for administrators to support

C.   Delivers the best performance of any networked storage technologies

D.   Provides great value by utilizing existing infrastructure

3.   Which statement would identify the primary difference between NAS and DAS?

A.   NAS cannot be shared and accessed by multiple computers.

B.   DAS provides fault tolerance.

C.   DAS does not connect to networked storage devices.

D.   NAS uses an HBA and DAS does not.

4.   Which storage type can take advantage of Universal Naming Convention addressable storage?

A.   SAN

B.   NAS

C.   DAS

D.   SATA

5.   Which storage type provides block-level storage?

A.   SAN

B.   NAS

C.   DAS

D.   SATA

6.   Which of the following connects a server and a SAN and improves performance?

A.   Network interface card

B.   Host bus adapter

C.   Ethernet

D.   SCSI

Storage Access Protocols

7.   Which of the following protocols allows Fibre Channel to be transmitted over Ethernet?

A.   HBA

B.   FCoE

C.   iSCSI

D.   SAN

8.   Which of the following is considered a SAN protocol?

A.   FCP

B.   IDE

C.   SSD

D.   DTE

9.   Which of the following allows you to connect a server to storage devices with speeds of 128 Gbps?

A.   Ethernet

B.   iSCSI

C.   Fibre Channel

D.   SAS

10.   Which of the following uses IP networks that enable servers to access remote disks as if they were locally attached?

A.   SAS

B.   SATA

C.   iSCSI

D.   Fibre Channel

Storage Provisioning

11.   Warren is a systems administrator working in a corporate data center, and he has been tasked with hiding storage resources from a server that does not need access to the storage device hosting the storage resources. What can Warren configure on the storage controller to accomplish this task?

A.   Zoning

B.   LUN masking

C.   Port masking

D.   VLANs

12.   Which of the following would increase availability from a virtualization host to a storage device?

A.   Trunking

B.   Multipathing

C.   Link aggregation

D.   VLANs

13.   Which of the following allows you to provide security to the data contained in a storage array?

A.   Trunking

B.   LUN masking

C.   LUN provisioning

D.   Multipathing

14.   Which provisioning model would you use if data is added quickly and often? The solution must ensure consistent performance.

A.   Thin provisioning

B.   Thick provisioning

C.   Overprovisioning

D.   Encryption

Storage Protection

15.   Which HA solution involves multiple servers that each service requests concurrently but can assume the load of another member if that member fails?

A.   Active-passive

B.   Active-active

C.   Passive-passive

D.   Passive-active

16.   Which of the following are requirements for adequate application performance when using synchronous replication? (Choose two.)

A.   Object storage

B.   Low latency

C.   Multipathing

D.   High-speed links

Images SELF TEST ANSWERS

Storage Types and Technologies

1.   Images   C. DAS is a storage system that directly attaches to a server or workstation without a storage network in between the devices.

Images   A, B, and D are incorrect. NAS provides file-level storage that is connected to a network and supplies data access to a group of devices. A SAN is a dedicated network and provides access to block-level storage. A network share is a storage resource on a computer that can be accessed remotely from another computer.

2.   Images   D. Network attached storage can utilize existing Ethernet infrastructures to deliver a low-cost solution with good performance.

Images   A, B, and C are incorrect. Storage area networks require expensive, often proprietary hardware and software, as well as systems administrators with specialized skill sets. Although SANs are more expensive to build and support, they provide the best possible performance for storage networking.

3.   Images   C. DAS is a storage system that directly attaches to a server or workstation without a storage network in between the devices.

Images   A, B, and D are incorrect. NAS can be shared and accessed by multiple computers over a network. DAS would not provide fault tolerance since it is connected to a single server, and neither NAS nor DAS technologies utilize HBAs as a part of their solution.

4.   Images   B. NAS appears to the client operating system as a file server, which allows it to use Universal Naming Convention addressable storage.

Images   A, C, and D are incorrect. A SAN only provides storage at a block level. DAS is directly attached to a server and is accessed directly from an indexed file system. SATA is an interface technology, not a storage type.

5.   Images   A. A SAN is a storage device that resides on its own network and provides block-level access to computers that are attached to it.

Images   B, C, and D are incorrect. NAS provides file-level storage. DAS is not accessible over a storage network. SATA is an interface technology, not a storage type.

6.   Images   B. An HBA card connects a server to a storage device and improves performance by offloading the processing required for the host to consume the storage data without having to utilize its own processor cycles.

Images   A, C, and D are incorrect. A network interface card connects a computer to an Ethernet network or an iSCSI network but does not improve performance. Ethernet and SCSI would not improve performance, because they cannot offload the processing for the host computer to connect to the storage device.

Storage Access Protocols

7.   Images   B. Fibre Channel over Ethernet (FCoE) enables the transport of Fibre Channel traffic over Ethernet networks by encapsulating Fibre Channel frames over Ethernet networks.

Images   A, C, and D are incorrect. iSCSI is a protocol that utilizes serialized IP packets to transmit SCSI commands across IP networks and enables servers to access remote disks as if they were locally attached. A SAN is a storage technology and an HBA is an adapter used to improve the performance of a SAN. They are not protocols.

8.   Images   A. The Fibre Channel Protocol is a transport protocol that transports SCSI commands over a Fibre Channel network. These networks are used exclusively to transport data in FC frames between storage area networks and the HBAs attached to servers.

Images   B, C, and D are incorrect. IDE is used to connect devices to a computer. SSD is a type of hard drive. DTE stands for “data terminal equipment.” A computer is an example of DTE.

9.   Images   C. You can use Fibre Channel (FC) to connect servers to shared storage devices with speeds of up to 128 Gbps. FC also comes in 64, 32, 16, 8, 4, and 2 Gbps versions.

Images   A, B, and D are incorrect. Ethernet and iSCSI top out at 10 Gbps when running over 10 Gigabit Ethernet (10GigE). SAS has a maximum speed of 12 Gbps, with a 22.5 Gbps version in the works.

10.   Images   C. iSCSI utilizes serialized IP packets to transmit SCSI commands across IP networks and enables servers to access remote disks as if they were locally attached.

Images   A, B, and D are incorrect. SAS and SATA do not allow you to connect to remote disks as if they were locally attached to the system. Fibre Channel utilizes the Fibre Channel Protocol to transmit data packets to SANs across a fabric of fiber-optic cables, switches, and HBAs.

Storage Provisioning

11.   Images   B. LUN masking is executed at the storage controller level instead of at the switch level. Because the controller itself enforces access policies to the devices, LUNs remain hidden from servers that are not authorized to access them, making this approach more secure.

Images   A, C, and D are incorrect. Zoning is less granular than LUN masking because it controls access at the port level rather than the LUN level. Port masking occurs at the switch level instead of the controller, and VLANs are also not configured at the controller. VLANs are discussed in Chapter 4.

12.   Images   B. Multipathing creates multiple paths for the computer to reach the storage resources it is attempting to contact, improving fault tolerance and possibly speed.

Images   A, C, and D are incorrect. Trunking provides network access to multiple clients by sharing a set of network lines instead of providing them individually. Link aggregation combines multiple network connections in parallel to increase throughput. VLANs do not have any effect on increasing availability to storage resources.

13.   Images   B. LUN masking enforces access policies to storage resources, and these storage policies make sure that the data on those devices is protected from unauthorized access.

Images   A, C, and D are incorrect. Trunking provides network access to multiple clients by sharing a set of network lines instead of providing them individually. LUN provisioning does the opposite of LUN masking by making LUNs available for data access, and multipathing creates multiple paths for the computer to reach the storage resources that it is attempting to contact.

14.   Images   B. Thick provisioning consumes all of the allocated space when the LUN is created, but performance remains consistent for a LUN to which data is added quickly and often, because storage does not need to be continually allocated to the LUN and the storage does not become fragmented.

Images   A, C, and D are incorrect. Thin provisioning saves space but can result in lower performance when there are frequent writes. Overprovisioning is the allocation of more space than is available in the storage pool; it requires thin provisioning, so it is also incorrect. Encryption would increase security, but at a cost to performance, and this question asks about performance, not security.
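The thin-versus-thick distinction can be illustrated at the file level with sparse versus preallocated disk image files. The following is a minimal sketch on a Linux system, assuming the standard `truncate`, `fallocate`, and `stat` utilities are available; it is not tied to any particular storage array:

```shell
# Thin analogue: a sparse file reports a 1 GiB logical size but
# consumes almost no blocks on disk until data is actually written.
truncate -s 1G thin.img

# Thick analogue: fallocate reserves all 1 GiB of blocks up front,
# so later writes never wait on new allocations.
fallocate -l 1G thick.img

# Compare logical size (%s, bytes) with blocks actually allocated (%b).
stat -c '%n: size=%s blocks=%b' thin.img thick.img
```

Both files report the same logical size, but only the preallocated file has all of its blocks reserved, which is why thick provisioning delivers consistent write performance at the cost of consuming the space immediately.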

Storage Protection

15.   Images   B. Active-active solutions allow for all systems to service application requests.

Images   A, C, and D are incorrect. Active-passive solutions involve one or more systems that service requests while one or more remain in a standby state until needed. Passive-passive and passive-active are not HA types.

16.   Images   B and C. Synchronous replication requires high-speed, low-latency links in between sites in order to ensure adequate application performance.

Images   A and D are incorrect. Object storage allows data to be retrieved based on its metadata, and multipathing provides more than one connection to a node. Neither of these would be required for synchronous replication.
