CHAPTER 2

Disk Storage Systems

In this chapter, you will learn about

•   Disk types and configurations

•   Tiering

•   File system types

Storage devices are the foundation of a storage network and are the building blocks of storage in a disk subsystem, stand-alone server, or cloud data center. Disk system performance is a key factor in the overall health of the cloud environment, and you need to understand the different types of disks that are available and the benefits of each. Once an organization chooses the type of disk to use in its cloud environment, it needs to protect the data that is stored on the disk. Along with describing the different types of disks and how to connect those disks to the system, this chapter illustrates how data can be tiered to provide better utilization of disk resources and better application performance. Those who have passed the Network+, Server+, or Storage+ exam may be able to skip this chapter.

Disk Types and Configurations

Disk drive technology has advanced at an astonishing rate over the past few years, making terabytes of storage available at a relatively low cost to consumers. Choosing which type of disk to buy requires careful planning and a clear understanding of the disk’s purpose.

Two main factors are used in determining the appropriate disk type for a use case: speed and capacity. Speed is typically measured in input/output operations per second (IOPS). You want to use a disk type that can offer the IOPS needed by the application or system. For example, a drive that supports a database environment needs high IOPS to handle queries across large data sets and many random reads and writes. In this case, you would be interested in a disk type with high IOPS, such as flash. However, if you were selecting a drive to support a file share on a test network, a disk type with medium IOPS would be acceptable, whereas archival storage such as backup might be able to use a drive type with low IOPS.

The second factor is capacity. You want to choose a drive type that will provide enough storage for the application or system data, not just for now, but also factoring in future growth.
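As a rough illustration of factoring in future growth, the following Python sketch projects how much capacity a system might need after a number of years. It assumes a constant annual growth rate, which is a simplification; the function name and the sample figures are hypothetical and are not taken from any sizing standard.

```python
def required_capacity_tb(current_tb: float, annual_growth: float, years: int) -> float:
    """Project storage needs, assuming data grows by a fixed fraction each year.

    annual_growth is a fraction, so 0.5 means 50 percent growth per year.
    """
    if current_tb < 0 or annual_growth < 0 or years < 0:
        raise ValueError("inputs must not be negative")
    return current_tb * (1 + annual_growth) ** years


# Hypothetical example: 10 TB today, growing 50 percent per year for 2 years.
print(f"{required_capacity_tb(10, 0.5, 2):.1f} TB")
```

A drive type that only just holds today's data would be undersized under these assumptions, which is why growth belongs in the capacity estimate.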

Companies have limited resources, so they must balance speed and capacity when choosing the drive type. Storage with high IOPS such as flash costs significantly more than storage with low IOPS, like spinning disk. In the following sections, we examine each of the different disk types and clarify these distinctions.

Rotational Media

Disk storage is a generic term used to describe storage mechanisms where data is digitally recorded by various electronic, magnetic, optical, or mechanical methods on a rotating disk or media. A disk drive is a device that uses this storage mechanism with fixed or removable media. Removable media refers to an optical disc, memory card, flash media, or USB drive, and fixed or nonremovable media refers to a hard disk drive.

A hard disk drive (HDD) uses rapidly rotating disks called platters, coated with a magnetic material such as iron oxide, to store and retrieve digital information. An HDD retains the data on the drive even when the drive is powered off. The data on an HDD is read in a random-access manner. This means that an individual block of data can be stored or retrieved in any order rather than only being accessible sequentially, as in the case of data that might exist on tape.

An HDD contains one or more platters with read/write heads arranged on a moving arm that floats above the magnetic surface to read and write data to the drive. HDDs have been the primary storage device for computers since the 1960s. Today the most common sizes for HDDs are the 3.5 inch, which is used primarily in desktop computers, and the 2.5 inch, which is mainly used in laptop computers. The primary competitors of the HDD are the solid-state drive (SSD) and flash memory cards. HDDs should remain the dominant medium for secondary storage, but SSDs have replaced rotating hard drives for primary storage.

EXAM TIP   Hard disk drives are used when speed is less important than total storage space.

Solid State Drive

An SSD is a high-performance storage device that contains no moving parts. It includes either dynamic random-access memory (DRAM) or flash memory boards, a memory bus board, a central processing unit (CPU), and sometimes a battery or separate power source. The majority of SSDs use “not and” (NAND)–based flash memory, which is a nonvolatile memory type, meaning the drive can retain data without power. SSDs produce very high I/O rates because they have no mechanical parts to position before data can be read, and they contain their own CPUs to manage data storage. SSDs are less susceptible to damage from shock or being dropped, are much quieter, and have a faster access time and lower latency than HDDs. Many SSDs use the same I/O interface as traditional hard disks, allowing an SSD to replace a traditional HDD without changing the computer hardware.

While SSDs can be used in all types of scenarios, they are especially valuable in a system where I/O response time is critical, such as a database server, a server hosting a file share, or any application that has a disk I/O bottleneck. Another example of where an SSD is a good candidate is in a laptop. SSDs are shock-resistant; they also use less power and provide a faster startup time than HDDs. Since an SSD has no moving parts, both sleep response time and system response time are improved. SSDs are currently more expensive than traditional HDDs but carry less risk of mechanical failure and the data loss that comes with it. Table 2-1 shows you some of the differences between SSDs and traditional HDDs.

Table 2-1  SSD vs. HDD

TIP   SSDs have faster response times than HDDs and are used in high-performance servers where speed is more important than total storage space.

SSDs are made up of flash memory cells that can hold one or more bits. A bit is a binary 0 or 1, and it is the foundation of information storage. The cells are organized into rows and columns. Current is sent to the cell to give it a charge. The charge is stored in the cell, and the data is read by evaluating the cell’s voltage level. Writing new data requires setting a new voltage level in the cell. Cells have a limited lifespan. While you can read from the cell as many times as you wish, each time data is written to the cell, the charge wears down the thin insulating layer that holds the charge in place. Eventually, that layer will break down, and the cell will no longer hold a charge reliably.

SSDs come in four different types based on how many bits can be stored in each cell. The types are named accordingly, as follows:

•   Single-level cell (SLC)

•   Multi-level cell (MLC)

•   Triple-level cell (TLC)

•   Quad-level cell (QLC)

SLC flash stores only one bit in each cell. A binary one is represented by a charge, and a binary zero by no charge. MLC flash can store two bits per cell. This equates to four binary possibilities: 00, 01, 10, and 11. It does this by setting four different voltage levels in a cell. A charge of 25 percent equates to a binary value of 11, 50 percent is 01, 75 percent is 00, and a full charge is 10. Similarly, TLC flash stores three bits, which equates to eight binary possibilities, and QLC flash stores four bits, with sixteen binary possibilities. Each of these possible binary values in the cell must have a corresponding voltage level that can be set in the cell and read by the flash media. This requires more precise charging and voltage evaluation methods.
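The relationship between bits per cell and the number of voltage levels can be sketched in a few lines of Python. This is only an illustration of the arithmetic: a cell that stores n bits must distinguish 2 to the power of n levels. The function names are hypothetical, and the sketch does not model how a real flash controller maps bit patterns to charge levels.

```python
# Bits stored per cell for each flash type named in the text.
BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def voltage_levels(bits: int) -> int:
    """Number of distinct voltage levels a cell must be able to distinguish."""
    return 2 ** bits


def bit_patterns(bits: int) -> list[str]:
    """Every binary value that a cell holding `bits` bits can represent."""
    return [format(value, f"0{bits}b") for value in range(2 ** bits)]


for name, bits in BITS_PER_CELL.items():
    print(f"{name}: {bits} bit(s) per cell, {voltage_levels(bits)} voltage levels")

# The four values an MLC cell can hold.
print(bit_patterns(2))
```

Each extra bit doubles the number of levels that must fit into the same voltage range, which is why the charging and reading methods need to become more precise.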

Adding more bits to the memory cell increases capacity, but it reduces the access speed and longevity of the cell because a change to any of the bits in that cell requires changing the voltage level of the cell. This equates to more frequent cell writes as compared to cells containing fewer bits.

Each of the SSD types has an average number of program/erase (PE) cycles that can be performed per cell. Enterprise variants of some types will allow for more PE cycles. These drives also cost more but are better suited to commercial purposes where drives are utilized more intensely than consumer drives. SSDs perform an action called wear leveling to balance writes across the cells to avoid premature deterioration of some cells over others.
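The idea of balancing writes across cells can be shown with a toy model. The sketch below assumes a drive that tracks a PE count per block and always writes to the least-worn block; real SSD controllers use considerably more sophisticated algorithms, and the function names here are hypothetical.

```python
def pick_block(pe_counts: list[int]) -> int:
    """Return the index of the block with the fewest program/erase cycles."""
    return min(range(len(pe_counts)), key=lambda index: pe_counts[index])


def write_with_leveling(pe_counts: list[int], writes: int) -> list[int]:
    """Simulate `writes` block writes, each one sent to the least-worn block."""
    counts = list(pe_counts)  # copy so the caller's list is left unchanged
    for _ in range(writes):
        counts[pick_block(counts)] += 1
    return counts


# Four blocks with uneven wear; six new writes go to the least-worn blocks.
print(write_with_leveling([5, 0, 2, 0], writes=6))
```

In this model, the heavily worn first block receives no new writes until the others catch up, so no single block reaches its PE limit far ahead of the rest.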

Following the release of TLC flash, a change occurred in the industry. The organization of cells based on rows and columns was augmented to include a third axis by stacking the cells on top of one another. This version of flash is referred to as 3D NAND, and the former is known as 2D NAND. 3D NAND cells are larger, which improves reliability through higher PE cycles per cell. Some TLC flash uses 2D NAND, and some uses 3D NAND. QLC comes only in a 3D NAND version. QLC is the slowest of the flash types, and it has the lowest PE cycle rating. However, it is also the cheapest. Table 2-2 compares the different SSD types.

Table 2-2  SSD Types

NOTE   There is a Penta-Level Cell (PLC) currently in development that is planned to house five bits per cell.

USB Drive

A universal serial bus (USB) drive is an external plug-and-play storage device that plugs into a computer’s USB port, where the computer recognizes it as a removable drive and assigns it a drive letter. Unlike an HDD or SSD, a USB drive does not require a special connection cable and power cable to connect to the system because it is powered via the USB port of the computer. Since a USB drive is portable and retains the data stored on it as it is moved between computer systems, it is a great device for transferring files quickly between computers or servers. Many external storage devices use USB, such as hard drives, flash drives, and DVD drives.

Tape

A tape drive is a storage device that reads and writes data to magnetic tape. Tape has been used as a form of storage for a long time. The role of tape has changed tremendously over the years and is still changing. Tape is now finding a niche in the market for longer-term storage and archiving of data, and it is the medium of choice for storage at an off-site location.

Tape drives provide sequential access to the data, whereas an HDD provides random access to the data. A tape drive has to physically wind the tape between reels to read any one particular piece of data. As a result, it has a slow seek time, having to wait for the tape to be in the correct position to access the data. Tape drives have a wide range of capacity and allow for data to be compressed to a size smaller than that of the files stored on the disk.

TIP   Tape storage is predominantly used for off-site storage and archiving of data.

Interface Types

HDDs interface with a computer in a variety of ways, including via ATA, SATA, Fibre Channel, SCSI, SAS, and IDE. Here we look at each of these interface technologies in greater detail. HDDs connect to a host bus interface adapter with a single data cable. Each HDD has its own power cable that is connected to the computer’s power supply.

•   Advanced Technology Attachment (ATA) is an interface standard for connecting storage devices within computers. ATA is often referred to as Parallel ATA (or PATA).

•   Integrated Drive Electronics (IDE) is the integration of the controller and the hard drive itself, which allows the drive to connect directly to the motherboard or controller. IDE is also known as ATA.

•   Serial ATA (SATA) is used to connect host bus adapters to mass storage devices. Designed to replace PATA, it offers several advantages over its predecessor, including reduced cable size, lower cost, native hot swapping, faster throughput, and more efficient data transfer.

•   Small Computer System Interface (SCSI) is a set of standard electronic interfaces accredited by the American National Standards Institute (ANSI) for connecting and transferring data between computers and storage devices. SCSI is faster and more flexible than earlier transfer interfaces. It uses a bus interface type, and every device in the chain requires a unique ID.

•   Serial Attached SCSI (SAS) is a data transfer technology that was designed to replace SCSI and to transfer data to and from storage devices. SAS is backward compatible with SATA drives.

•   Fibre Channel (FC) is a high-speed network technology used in storage networking. Fibre Channel is well suited to connect servers to shared storage in a storage area network (SAN) due to its high-speed transfer rate of 16 gigabits per second, with newer generations running faster still. Fibre Channel is often referred to as FC in the industry and on the Cloud+ exam.

Table 2-3 explains the different connection types and some of the advantages and disadvantages of each interface.

Table 2-3  HDD Interface Types

EXAM TIP   Understanding the differences in the interface types is key for the test. You need to know when to use each connector and the benefits of that connector.

Access Speed

Just knowing the types of hard disks and the interface is not enough to determine which drive type is best for a particular application. Understanding the speed at which a drive can access the data that is stored on that drive is critical to the performance of the application. A hard drive’s speed is measured by the amount of time it takes to access the data that is stored on the drive. Access time is the response time of the drive and is the sum of seek time and latency. The actuator arm and read/write head of the drive must move for data to be located. First, the actuator arm must move the head to the correct location on the platter. The time it takes for the arm to move to the correct location is known as seek time. At the same time, the platter must rotate to the desired sector. The time it takes for the platter to spin to the desired location is known as rotational latency or just latency for short.

The access time of an HDD can be improved by either increasing the rotational speed of the drive or reducing the time the drive has to spend seeking the data. Seek time generally falls in the range of 3 to 15 milliseconds (ms). The faster the disk can spin, the faster it can find the data and the lower the latency for that drive will be. Table 2-4 lists the average latency based on some common hard disk speeds.

Table 2-4  Hard Disk Speed and Latency
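The link between rotational speed and latency can be worked out directly. On average, the platter has to turn half a rotation before the desired sector reaches the head, so average rotational latency is half the time of one full rotation. The Python sketch below applies that formula to several common drive speeds; the function names and the 4 ms seek time in the example are illustrative assumptions.

```python
def average_rotational_latency_ms(rpm: int) -> float:
    """Average latency in milliseconds: the time for half of one rotation."""
    ms_per_rotation = 60_000 / rpm  # 60,000 ms in a minute
    return ms_per_rotation / 2


def access_time_ms(seek_ms: float, rpm: int) -> float:
    """Access time as seek time plus average rotational latency."""
    return seek_ms + average_rotational_latency_ms(rpm)


for rpm in (5_400, 7_200, 10_000, 15_000):
    latency = average_rotational_latency_ms(rpm)
    print(f"{rpm:>6} RPM: {latency:.2f} ms average latency")

# Hypothetical drive: 4 ms seek time at 15,000 RPM.
print(f"Access time: {access_time_ms(4.0, 15_000):.2f} ms")
```

Doubling the rotational speed halves the average latency, but it does nothing for seek time, which is why both figures matter when comparing drives.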

Redundant Array of Independent Disks (RAID)

So far in this chapter, you have learned about the different disk types and how those disk types connect to a computer system. The next thing you need to understand is how to make the data that is stored on those disk drives as redundant as possible while maintaining a high-performance system. RAID is a storage technology that combines multiple hard disk drives into a single logical unit so that the data can be distributed across the drives for improved performance, improved redundancy, or both, depending on the RAID level.

There are four primary RAID levels in use and several additional RAID levels, called nested RAID levels, that are built on top of the four primary types. RAID 0 takes two or more disks and stripes the data across them. It has the highest speed, but a failure of any disk results in data loss for the entire RAID set. RAID 1, also known as a mirror, stores identical copies of data on two drives for reliability. However, write speeds are limited to the capabilities of a single drive, and twice as much storage is required for data. RAID 5 stripes data across disks in the set and uses parity to reconstruct a drive if it fails in the set. RAID 5 requires at least three drives. It has good read performance, but the computation of parity can reduce write speeds in what is known as the write penalty. RAID 6 is like RAID 5, except it stores double parity and can recover from a loss of two drives. It requires at least four drives and has a higher write penalty than RAID 5.
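To see how parity lets a RAID 5 set rebuild a failed drive, consider the following Python sketch. It assumes XOR parity over a single stripe of three equal-size data blocks, which is the usual textbook model; it ignores how real controllers rotate parity across the drives, and the names and sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks, each stored on a different drive.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)  # the parity block is stored on a fourth drive

# Simulate losing the drive that held data[1], then rebuild its block
# from the surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

Every write must also update the parity block, which is where the write penalty mentioned above comes from.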

Nested RAID includes RAID 10 (RAID 1+0), which takes a number of mirror sets and stripes data over them. It has high read and write performance but requires double the drives for storage. RAID 50 is another type of nested RAID in which two or more RAID 5 sets are striped together. It can offer higher performance than the RAID 5 arrays could individually while still retaining the parity on the underlying RAID 5.

Table 2-5 compares the different RAID configurations to give you a better understanding of the advantages and requirements of each RAID level.

Table 2-5  RAID Level Benefits and Requirements
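The capacity cost of each RAID level can be estimated with simple arithmetic. The Python sketch below assumes identical drives and uses the commonly cited minimum drive counts (the minimums of four for RAID 6 and RAID 10 are general knowledge rather than figures from this chapter). The function name and sample sizes are hypothetical.

```python
# Commonly cited minimum number of drives per RAID level.
MIN_DRIVES = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}


def usable_capacity_tb(level: str, drives: int, drive_tb: float) -> float:
    """Estimate usable capacity for a RAID set built from identical drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == "0":
        data_drives = drives          # striping only, no redundancy
    elif level == "1":
        data_drives = 1               # every drive holds the same copy
    elif level == "5":
        data_drives = drives - 1      # one drive's worth of parity
    elif level == "6":
        data_drives = drives - 2      # two drives' worth of parity
    else:                             # RAID 10: striped mirror pairs
        if drives % 2:
            raise ValueError("RAID 10 needs an even number of drives")
        data_drives = drives // 2
    return data_drives * drive_tb


for level in ("0", "5", "6", "10"):
    print(f"RAID {level}: {usable_capacity_tb(level, 4, 2.0):.1f} TB usable")
```

With four 2 TB drives under these assumptions, RAID 0 exposes all of the raw capacity, RAID 5 gives up one drive to parity, and RAID 6 and RAID 10 each give up half.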

Two different options are available when implementing RAID: software RAID and hardware RAID using a RAID controller. Software RAID is implemented on a server by using software that groups multiple logical disks into a single virtual disk. Most modern operating systems have built-in software that allows for the configuration of a software-based RAID array. Hardware RAID controllers are physical cards that are added to a server to offload the overhead of RAID so that it does not consume the server’s CPU resources; they also allow an administrator to boot straight to the RAID controller to configure the RAID levels. Hardware RAID is the most common form of RAID due to its tighter integration with the device and better error handling.

EXAM TIP   You need to understand the difference between each RAID level and when each particular level is appropriate to use.

Write Once, Read Many (WORM)

WORM is storage that cannot be overwritten. WORM typically refers to optical storage such as CD-R or DVD-R devices that allow you to write data once to them and then never again. The data can be read from the CD or DVD as many times as you want, but you cannot overwrite the data with new data as you can with a hard drive. Writing data to a CD-R or DVD-R is called burning. WORM can also refer to storage that has write protection enabled. Floppy disks used to have a tab that could be moved to determine whether the data could be overwritten. Software disks would often lack this tab so that consumers could not switch them to writable mode. Similarly, some SD cards have firmware that prevents modifying or deleting data stored on them. This can be used for delivering content that you do not want the end user to change.

Another important reason for WORM is to protect the data on the storage from malicious alteration. Configuration data can be stored on WORM so that a malicious program cannot modify it. The disadvantage is that new storage must be created with the new configuration parameters, and then that storage is placed in the device. For example, a computer could be used as a firewall with a live version of Linux running off a DVD. The firewall configuration would be burned to the DVD and cannot be changed. To change this configuration, a new DVD would need to be burned with the updated configuration, and then the firewall would be shut down, the DVDs swapped, and the firewall started again to take on the new configuration.

CAUTION   CD-R and DVD-R are both WORM, but CD-RW and DVD-RW are not. The RW in CD-RW and DVD-RW stands for rewritable, meaning that data can be changed after the initial burn.

Tiering

In the previous section, we discussed the different types of disks and the benefits of each. Now that you understand the benefits of each disk, you know that storing data on the appropriate disk type can increase performance and decrease the cost of storing that data. Having flexibility in how and where to store an application’s data is key to the success of cloud computing.

Tiered storage permits an organization to adjust where its data is being stored based on the performance, availability, cost, and recovery requirements of an application. For example, data that is stored for restoration in the event of loss or corruption would be stored on the local drive so that it can be recovered quickly, whereas data that is stored for regulatory purposes would be archived to a lower-cost medium, such as tape storage.

Tiered storage can refer to an infrastructure that has a simple two-tier architecture, consisting of SCSI disks and a tape drive, or to a more complex scenario of three or four tiers. Tiered storage helps organizations plan their information life cycle management, reduce costs, and increase efficiency. Tiered storage requirements can also be determined by functional differences, for example, the need for replication and high-speed restoration.

With tiered storage, data can be moved from fast, expensive disks to slower, less expensive disks. Hierarchical storage management (HSM), which is discussed in the next section, allows for automatically moving data between four different tiers of storage. For example, data that is frequently used and stored on highly available, expensive disks can be automatically migrated to less expensive tape storage when it is no longer required on a day-to-day basis. One of the advantages of HSM is that the total amount of data that is stored can be higher than the capacity of the disk storage system currently in place.

Performance Levels of Each Tier

HSM operates transparently to users of a system. It organizes data into tiers based on the performance capabilities of the devices, with tier 1 containing the devices with the highest performance and each tier after that containing storage with lower performance than the tier before it. HSM tiers can include a wide variety of local and remote media such as solid state, spinning disk, tape, and cloud storage.

HSM places data on tiers based on the level of access required and the performance and reliability needed for that particular data or based on file size and available capacity. Organizations can save time and money by implementing a tiered storage infrastructure. Each tier has its own set of benefits and usage scenarios based on a variety of factors. HSM can automatically move data between tiers based on factors such as how often data is used, but organizations can also specify policies to further control where HSM stores data and what priority it gives to migration operations between tiers.

The first step in customizing HSM policies is to understand the data that will reside on HSM storage. Organizations and IT departments need to define each type of data and determine how to classify it so that they can configure HSM policies appropriately. Ask yourself some of the following questions:

•   Is the data critical to the day-to-day operation of the organization?

•   Is there an archiving requirement for the data after so many months or years?

•   Is there a legal or regulatory requirement to store the data for a period of time?

Once the data has been classified, the organization can create HSM policies so that data is moved to the appropriate tier and given the correct priority.

Tier 1

Tier 1 data is defined as mission-critical, recently accessed, or secure files and should be stored on expensive and highly available storage, such as enterprise flash drives in a RAID set with parity. Typically tier 1 drives would be SLC or MLC flash drives for the best speed. Tier 1 storage systems have better performance, capacity, reliability, and manageability than the lower tiers.

Tier 2

Tier 2 data is data that runs major business applications, for example, e-mail and Enterprise Resource Planning (ERP) software. Tier 2 is a balance between cost and performance. Tier 2 data does not require subsecond response time but still needs to be reasonably fast. This typically consists of lower-speed flash drives, such as MLC or TLC flash, or a hybrid of flash media and spinning disks, such as spinning disks with flash as cache.

Tier 3

Tier 3 data includes financial data that needs to be kept for tax purposes but is not accessed on a daily basis and so does not need to be stored on the expensive tier 1 or tier 2 storage systems. Tier 3 storage typically uses QLC flash or spinning disks.

Tier 4

Tier 4 data is retained to meet compliance requirements, such as keeping e-mails or other data for long periods of time. Tier 4 data can be a large amount of data but does not need to be instantly accessible. Tier 4 storage is long-term storage such as offline drives, tape, or other media.

Policies

A multitiered storage system provides an automated way to move data between more expensive and less expensive storage systems. Using a multitiered storage system, an organization can implement policies that define what data fits each tier and then manage how data migrates between the tiers. For example, when financial data is more than a year old, the policy could be to move that data to a tier 4 storage solution, much like the HSM defined earlier.
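A tiering policy of this kind can be expressed as a small set of rules. The Python sketch below is a simplified illustration, not how any particular HSM product works: the `DataSet` fields, the `assign_tier` function, and the age thresholds of 7, 90, and 365 days are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    mission_critical: bool   # needed for day-to-day operations
    compliance_only: bool    # kept only to satisfy a retention requirement
    days_since_access: int


def assign_tier(item: DataSet) -> int:
    """Pick a storage tier (1 = fastest, 4 = archive) from simple policy rules."""
    if item.mission_critical or item.days_since_access <= 7:
        return 1
    if item.compliance_only or item.days_since_access > 365:
        return 4
    if item.days_since_access <= 90:
        return 2
    return 3


samples = [
    DataSet("orders-db", True, False, 0),
    DataSet("mail-store", False, False, 30),
    DataSet("last-year-ledger", False, False, 200),
    DataSet("old-financials", False, False, 400),
]
for item in samples:
    print(f"{item.name}: tier {assign_tier(item)}")
```

The last rule mirrors the example in the text: data that has gone untouched for more than a year falls through to tier 4.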

Tiered storage provides IT departments with the best solution for managing the organization’s data while also saving time and money. Tiered storage helps IT departments meet their service level agreements at the lowest possible cost and the highest possible efficiency.

TIP   Tiered storage allows companies to achieve greater return on investment (ROI) on disk investments by utilizing a combination of storage types and capitalizing on their strengths.

File System Types

After choosing a disk type and configuration, an organization needs to be able to store data on those disks. The file system is responsible for storing, retrieving, and updating a set of files on a disk. It is the software that accepts the commands from the operating system to read and write data to the disk. It is responsible for how the files are named and stored on the disk.

The file system is also responsible for managing access to the file’s metadata (“the data about the data”) and the data itself and for overseeing the relationships to other files and file attributes. It also manages how much available space the disk has. The file system is responsible for the reliability of the data on the disk and for organizing that data in an efficient manner. It organizes the files and directories and tracks which areas of the drive belong to a particular file and which areas are not currently being utilized.

This section explains the different file system types. Each file system has its own set of benefits and scenarios under which its use is appropriate.

Unix File System

The Unix file system (UFS) is the primary file system for Unix and Unix-based operating systems. UFS uses a hierarchical file system structure where the highest level of the directory is called the root and all other directories branch from that root.

NOTE   The Unix root directory is depicted with the / character. This character is pronounced “slash.”

Under the root directory, files are organized into subdirectories and can have any name the user wishes to assign. All files on a Unix system are related to one another in a parent/child relationship, and they all share a common parental link to the top of the hierarchy.

Figure 2-1 shows an example of the UFS structure. The root directory has three subdirectories called bin, tmp, and users. The users directory has two subdirectories of its own called Nate and Scott.

Figure 2-1  Unix file system (UFS) structure
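The parent/child structure in Figure 2-1 can be modeled with the `PurePosixPath` class from Python's standard `pathlib` module, which manipulates Unix-style paths without touching a real disk. The directory names come from the figure; the `children` helper is a hypothetical function written for this sketch.

```python
from pathlib import PurePosixPath

root = PurePosixPath("/")

# The directories described for Figure 2-1.
directories = [
    PurePosixPath(path)
    for path in ("/bin", "/tmp", "/users", "/users/Nate", "/users/Scott")
]


def children(parent: PurePosixPath, paths: list[PurePosixPath]) -> list[str]:
    """Names of the directories whose immediate parent is `parent`."""
    return sorted(path.name for path in paths if path.parent == parent)


print(children(root, directories))
print(children(PurePosixPath("/users"), directories))

# Every directory traces back to the root of the hierarchy.
print(all(root in path.parents for path in directories))
```

Because each path records its own chain of parents, the common link to the root falls out of the path itself rather than needing to be stored separately.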

Extended File System

The extended file system (EXT) was the first file system created specifically for Linux. The metadata and file structure are based on the Unix file system. EXT is the default file system for most Linux distributions. EXT is currently on version 4, or EXT4, which was introduced in 2008 and supports larger file and file system sizes. EXT4 is backward compatible with EXT3 and EXT2, which allows for mounting an EXT3 or EXT2 partition as an EXT4 partition.

File Allocation Table File System

The file allocation table (FAT) file system is a legacy file system that provides good performance but does not deliver the same reliability and scalability as some of the newer file systems. The FAT file system is still supported by most operating systems for backward-compatibility reasons, but it has mostly been replaced by NTFS (more on this in a moment) as the preferred file system for Microsoft operating systems. If a user has a drive with a FAT32 partition, however, they can connect it to a computer running Windows 7 and retrieve the data from that drive because all modern versions of Windows, including 7, 8, and 10, still support the FAT32 file system.

FAT originally came in two flavors, FAT16 and FAT32, the difference being that FAT16 supported fewer files in the root directory, smaller maximum file sizes, and a smaller maximum partition size. The most recent iteration, exFAT, further increases the maximum file and partition sizes.

The FAT file system is used by a variety of removable media, including solid-state memory cards, flash memory cards, and portable devices. The FAT file system does not support the advanced features of NTFS like encryption, VSS, and compression.

New Technology File System

The New Technology File System (NTFS) is a proprietary file system developed by Microsoft to support the Windows operating systems. It first became available with Windows NT 3.1 and has been used on Microsoft’s NT-based operating systems since then. NTFS was Microsoft’s replacement for the FAT file system. NTFS has many advantages over FAT, including improved performance and reliability, larger partition sizes, and enhanced security. NTFS secures files and folders with access control lists (ACLs); authenticating the users named in those lists is handled by Windows itself, for example through the NT LAN Manager (NTLM) protocol.

Starting with version 1.2, NTFS added support for file compression, which is ideal for files that are written on an infrequent basis. However, compression can lead to slower performance when accessing the compressed files; therefore, it is not recommended for .exe or .dll files or for network shares that contain roaming profiles due to the extra processing required to load roaming profiles.

NTFS version 3.0 added support for volume shadow copy service (VSS), which keeps a historical version of files and folders on an NTFS volume. Shadow copies allow you to restore a file to a previous state without the need for backup software. The VSS creates a copy of the old file as it is writing the new file, so the user has access to the previous version of that file. It is best practice to create a shadow copy volume on a separate disk to store the files.

Exercise 2-1: Formatting a Disk with the NTFS Partition in Windows

In this exercise, we will create an NTFS partition on a USB drive. Please note that this operation removes all data from the drive, so please move any data you want to save off the drive before creating the partition.

1.   Insert the USB drive into your Windows computer.

2.   The contents of the drive will appear. Please note the drive letter of the device if it already has a partition. If you have never used the drive before, it will likely prompt you to format the drive to begin working with it.

3.   Right-click on the Start button and select Disk Management.

4.   Scroll down to the disk that you want to create the partition on.

5.   For this example, I inserted a 4GB flash drive that has a FAT32 volume on it called ELEMENTARY.

6.   Right-click on the partition you want to reformat and then select Format.

7.   You will be warned that all data will be lost. Select Yes.

8.   A format box will appear. Give the partition a name and then select the NTFS file system from the second drop-down box. We will leave the rest of the fields at their defaults.

9.   Click OK. The disk will now be formatted using the NTFS file system. The quick format option usually only takes a few seconds, and then the operation will complete.

Encrypting File System

The Encrypting File System (EFS) provides an encryption method for any file or folder on an NTFS partition and is transparent to the user. EFS encrypts a file by using a file encryption key (FEK), which is in turn encrypted with a public key that is tied to the user who encrypted the file. The encrypted FEK is stored alongside the encrypted data, in the file header. To decrypt the file, EFS uses the private key of the user to decrypt the FEK that is stored in the file header and then uses the FEK to decrypt the data. If the user loses access to their key, a recovery agent can still access the files. NTFS does not support encrypting and compressing the same file.

NTFS also supports disk quotas, which allow an administrator to set disk space thresholds for users. This gives an administrator the ability to track the amount of disk space each user is consuming and limit how much disk space each user has access to. The administrator can set a warning threshold and a deny threshold and deny additional disk space to users once they reach the deny threshold.
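The two thresholds can be sketched as a simple check performed before each write. This is a minimal illustration rather than how Windows enforces quotas; the class and function names, and the choice to deny a write that would exceed the deny threshold, are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    warning_bytes: int   # usage at which the user is warned
    deny_bytes: int      # usage beyond which further writes are refused


def check_write(quota: Quota, used_bytes: int, write_bytes: int) -> str:
    """Classify a pending write as 'allow', 'warn', or 'deny'."""
    projected = used_bytes + write_bytes
    if projected > quota.deny_bytes:
        return "deny"
    if projected >= quota.warning_bytes:
        return "warn"
    return "allow"


GB = 1000**3
quota = Quota(warning_bytes=8 * GB, deny_bytes=10 * GB)   # hypothetical limits
for used in (2 * GB, 8 * GB, 10 * GB):
    print(used // GB, "GB used:", check_write(quota, used, 1 * GB))
```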

Resilient File System

The Resilient File System (ReFS) is a proprietary file system developed by Microsoft to support the Windows operating systems. It first became available with Windows Server 2012 and is supported on Windows Server 2012 and later server operating systems, as well as Windows 8.1 and later versions.

Rather than fully replacing NTFS, ReFS offers support for some new features by sacrificing some other features. ReFS’s new features include the following:

•   Integrity checking and data scrubbing  These features are a form of file integrity monitoring (FIM) that checks data for errors and automatically replaces corrupt data with known good data. ReFS also computes checksums for file data and metadata.

•   Storage virtualization  Remote mounts such as SAN storage can be formatted as local storage. Additionally, mirroring can be applied to disks in a logical volume to provide redundancy and striping to provide better performance.

•   Tiering  Multiple storage types with different capacity and performance ratings can be combined together to form tiered storage.

•   Disk pooling  A single ReFS logical volume can consist of multiple storage types. Unlike RAID, the drives do not need to be the same size and type to be in a pool.

•   Support for long file paths  Windows has traditionally limited file paths to 260 characters, but ReFS can support file paths up to 32,768 characters in length. Filenames remain limited to 255 characters, as with NTFS. This allows for deeper folder hierarchies.

•   Block cloning  A feature that decreases the time required for virtual machine copies and snapshots.

•   Sparse valid data length (VDL)  A feature that reduces the time required for the creation of thick-provisioned virtual hard disks.
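The integrity checking and data scrubbing feature in the list above can be sketched as follows. This is a simplified model, not the ReFS on-disk format: it assumes each block has a stored SHA-256 checksum and that a mirrored copy of the data is available as the source of known good data. All names are hypothetical.

```python
import hashlib


def checksum(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def scrub(primary: list, mirror: list, checksums: list) -> list:
    """Verify every block in primary; repair corrupt blocks from mirror.

    Returns the indexes of the blocks that were repaired.
    """
    repaired = []
    for i, expected in enumerate(checksums):
        if checksum(primary[i]) == expected:
            continue                          # block is intact
        if checksum(mirror[i]) != expected:
            raise IOError(f"block {i} is corrupt on both copies")
        primary[i] = mirror[i]                # replace with known good data
        repaired.append(i)
    return repaired


blocks = [b"alpha", b"bravo", b"charlie"]
sums = [checksum(b) for b in blocks]
copy_a, copy_b = list(blocks), list(blocks)
copy_a[1] = b"brXvo"                          # simulate silent corruption
print("repaired blocks:", scrub(copy_a, copy_b, sums))
```

Without the redundant copy, the checksum could still detect the corruption but could not repair it, which is why integrity checking is usually paired with mirroring.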

These new features, however, come at a cost. Microsoft has sacrificed some NTFS features that system administrators have become quite comfortable with, including support for EFS, compression, data deduplication, and disk quotas.

EXAM TIP   Take careful note of which features are needed when selecting the correct file system. The newest file system may not necessarily be the right answer. For example, if file-level encryption with EFS were required, NTFS would be a better choice than ReFS.
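Selecting a file system by required features can be expressed as a simple lookup. The feature sets below are a simplified reading of this section only (the NTFS features that ReFS gives up and a few of the ReFS features listed earlier), not a complete or authoritative matrix, and the identifiers are hypothetical.

```python
# Simplified feature matrix drawn from this section; not exhaustive.
FEATURES = {
    "NTFS": {"efs", "compression", "deduplication", "quotas"},
    "ReFS": {"integrity_checking", "tiering", "block_cloning", "sparse_vdl"},
}


def candidates(required) -> list:
    """Return the file systems that offer every required feature."""
    needed = set(required)
    return sorted(name for name, offered in FEATURES.items() if needed <= offered)


print(candidates({"efs"}))                    # encryption required
print(candidates({"block_cloning"}))          # fast virtual machine copies required
print(candidates({"efs", "block_cloning"}))   # no single file system offers both
```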

Virtual Machine File System

The Virtual Machine File System (VMFS) is VMware’s cluster file system. It is used with VMware ESXi server and vSphere and was created to store virtual machine disk images, including virtual machine snapshots. It allows for multiple servers to read and write to the file system simultaneously while keeping individual virtual machine files locked. VMFS volumes can be logically increased by spanning multiple VMFS volumes together.

Z File System

The Z File System (ZFS) is a combined file system and logical volume manager designed by Sun Microsystems. ZFS protects against data corruption and supports high storage capacities. It also provides volume management, snapshots, and continuous integrity checking with automatic repair.

ZFS was created with data integrity as its primary focus. It is designed to protect the user’s data against corruption. ZFS is currently the only 128-bit file system. It uses a pooled storage method, which allows space to be used only as it is needed for data storage.

Table 2-6 compares a few of the different file system types, lists their maximum file and volume sizes, and describes some of the benefits of each system.

Images

Table 2-6  File System Characteristics

NOTE   Some of the data sizes shown here or referenced elsewhere in this text use rather large numbers. For reference, 1000 bytes (B) = 1 kilobyte (KB), 1000KB = 1 megabyte (MB), 1000MB = 1 gigabyte (GB), 1000GB = 1 terabyte (TB), 1000TB = 1 petabyte (PB), 1000PB = 1 exabyte (EB), 1000EB = 1 zettabyte (ZB), 1000ZB = 1 yottabyte (YB).
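The conversions in the note can be checked with a short script. This sketch uses the decimal (powers of 1000) definitions given above; the function names are hypothetical.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def to_bytes(value: float, unit: str) -> int:
    """Convert a value in the given unit to bytes (1 KB = 1000 B, and so on)."""
    return int(value * 1000 ** UNITS.index(unit))


def format_size(num_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value at 1 or more."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(to_bytes(3, "TB"))            # bytes in a 3TB partition
print(format_size(to_bytes(3, "TB")))
print(format_size(1500))
```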

EXAM TIP   You should know the maximum volume size of each file system type for the exam. For example, if the requirement is a 3TB partition for a virtual machine drive, you would not be able to use the FAT file system; you would need to use NTFS.
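The reasoning in the exam tip amounts to comparing the required size against each file system's maximum volume size. In the sketch below, the limits of 2TB for FAT32 and 256TB for NTFS are commonly cited figures and are assumptions of this example; confirm the values against Table 2-6 before relying on them.

```python
TB = 1000**4

# Assumed maximum volume sizes; verify against Table 2-6.
MAX_VOLUME_BYTES = {
    "FAT32": 2 * TB,
    "NTFS": 256 * TB,
}


def usable_file_systems(required_bytes: int) -> list:
    """Return the file systems whose maximum volume size covers the requirement."""
    return [name for name, limit in MAX_VOLUME_BYTES.items() if required_bytes <= limit]


print(usable_file_systems(3 * TB))   # the 3TB virtual machine drive from the tip
print(usable_file_systems(1 * TB))
```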

Chapter Review

Understanding how different storage technologies affect the cloud is a key part of the CompTIA Cloud+ exam. This chapter discussed the various physical types of disk drives and how those drives are connected to systems and each other. It also covered the concept of tiered storage with HSM. We closed the chapter by giving an overview of the different file system types and the role proper selection of these systems plays in achieving scalability and reliability. It is critical to have a thorough understanding of all these issues as you prepare for the exam.

Questions

The following questions will help you gauge your understanding of the material in this chapter. Read all the answers carefully because there might be more than one correct answer. Choose the best response(s) for each question.

1.   Which type of storage device has no moving parts?

A.   HDD

B.   SSD

C.   Tape

D.   SCSI

2.   Which type of storage device would be used primarily for off-site storage and archiving?

A.   HDD

B.   SSD

C.   Tape

D.   SCSI

3.   You have been given a drive space requirement of 2TB for a production file server. Which type of disk would you recommend for this project if cost is a primary concern?

A.   SSD

B.   Tape

C.   HDD

D.   VLAN

4.   Which of the following storage device interface types is the most difficult to configure?

A.   IDE

B.   SAS

C.   SATA

D.   SCSI

5.   If price is not a factor, which type of storage device interface would you recommend for connecting to a corporate SAN?

A.   IDE

B.   SCSI

C.   SATA

D.   FC

6.   You need to archive some log files, and you want to make sure that they can never be changed once they have been copied to the storage. Which type of storage would be best for the task?

A.   SSD

B.   Rotational media

C.   WORM

D.   USB drive

7.   What RAID level would be used for a database file that receives few write requests and a large number of read requests and that requires fault tolerance?

A.   RAID 10

B.   RAID 1

C.   RAID 5

D.   RAID 0

8.   Which of the following statements can be considered a benefit of using RAID for storage solutions?

A.   It is more expensive than other storage solutions that do not include RAID.

B.   It provides degraded performance, scalability, and reliability.

C.   It provides superior performance, improved resiliency, and lower costs.

D.   It is complex to set up and maintain.

9.   Which data tier would you recommend for a mission-critical database that needs to be highly available all the time?

A.   Tier 1

B.   Tier 2

C.   Tier 3

D.   Tier 4

10.   Which term describes the ability of an organization to store data based on performance, cost, and availability?

A.   RAID

B.   Tiered storage

C.   SSD

D.   Tape drive

11.   Which data tier would you recommend for data that is financial in nature, is not accessed on a daily basis, and is archived for tax purposes?

A.   Tier 1

B.   Tier 2

C.   Tier 3

D.   Tier 4

12.   Which of the following file systems is used primarily for Unix-based operating systems?

A.   NTFS

B.   FAT

C.   VMFS

D.   UFS

13.   Which of the following file systems was designed to protect against data corruption and is a 128-bit file system?

A.   NTFS

B.   UFS

C.   ZFS

D.   FAT

14.   Which file system was designed to replace the FAT file system?

A.   NTFS

B.   ZFS

C.   EXT

D.   UFS

15.   Which of the following file systems was the first to be designed specifically for Linux?

A.   FAT

B.   NTFS

C.   UFS

D.   EXT

Answers

1.   B. A solid-state drive is a drive that has no moving parts.

2.   C. Tape storage is good for off-site storage and archiving because it is less expensive than other storage types.

3.   C. You should recommend using an HDD because of the large size requirement. An HDD would be considerably cheaper than an SSD. Also, since it is a file share, the faster boot time provided by an SSD is not a factor.

4.   D. SCSI is relatively difficult to configure, as the drives must be configured with a device ID and the bus has to be terminated.

5.   D. Fibre Channel is the fastest connectivity method listed, with speeds of up to 128 Gbps, but it is more expensive than the other interface types. If price is not a factor, FC should be the recommendation for connecting to a SAN.

6.   C. WORM is the correct answer because it cannot be overwritten once the data has been stored on it.

7.   C. RAID 5 is best suited for a database or system drive that has a lot of read requests and very few write requests.

8.   C. Using RAID can provide all these benefits over conventional hard disk storage devices.

9.   A. Tier 1 data is defined as data that is mission-critical, highly available, and secure.

10.   B. Tiered storage refers to the process of moving data between storage devices based on performance, cost, and availability.

11.   C. Tier 3 storage would be for financial data that you want to keep for tax purposes and is not needed on a day-to-day basis.

12.   D. UFS is the primary file system in a Unix-based computer.

13.   C. ZFS was developed by Sun Microsystems and is focused on protecting the user’s data against corruption. It is currently the only 128-bit file system.

14.   A. NTFS was designed by Microsoft as a replacement for FAT.

15.   D. EXT was the first file system designed specifically for Linux.
