Tape technology introduction
Tape systems traditionally were associated with the mainframe computer market. They represented an essential element in mainframe systems architectures since the early 1950s as a cost-effective way to store large amounts of data. By contrast, the midrange and client/server computer market made limited use of tape technology until recently.
Over the past few years, growth in the demand for data storage and reliable backup and archiving solutions greatly increased the need to provide manageable and cost-effective tape library products. The value of using tape for backup purposes has only gradually become obvious and important in these environments.
This chapter reviews the history of tape technology, including the technologies, formats, and standards that you see for tape products in today’s market. This chapter also includes information about several products from non-IBM vendors. For more information about these non-IBM vendors’ products, see their respective websites.
1.1 Introduction
Magnetic tape was first used in 1930 for sound recording and later for video recording in 1951. In the following year, IBM invented the concept of using magnetic tape for computer data storage with the introduction of the IBM Model 726: the world’s first reel tape system. It had a storage density of 100 characters per inch and speeds up to 70 inches per second.
Since these early days, tape has continued to figure significantly in IT infrastructures for high-capacity storage backup. Its unique attributes can help users manage their storage requirements and contribute to the ever-present value of tape in the storage hierarchy.
Tape includes the following features:
Removable: Store it securely to protect it from viruses, sabotage, and other corruption
Scalable: Simply add more low-cost cartridges, not drives
Portable: Easily move it to another site to avoid destruction if the first site suffers threat or damage
Fast: Provides up to 750 MBps (with 2.5:1 compression ratio) for the IBM Linear Tape-Open (LTO) generation 7 systems
Reliable: IBM servo technology, read-after-write verification, and advanced error correction systems help make tape more reliable than disk
Green: Has low power consumption
As new storage formats and devices are developed and refined, industry experts periodically forecast the demise of tape, pronouncing it slow and outmoded. However, tape continues to be the most cost-effective, flexible, and scalable medium for high-capacity storage backup.
Over the last 60 years, IBM has delivered many innovations in tape storage and that innovation continues today. This chapter provides a brief overview of the major changes that have taken place in tape technology over this time.
 
1.2 Timeline
Figure 1-1 shows a summary of significant events in the history of tapes in relation to data storage. These events are described in more detail in this chapter.
Figure 1-1 IBM tape timeline
1.3 Tape products and technologies
Table 1-1 shows a summary of important milestones in the evolution of tapes over the last 60 years. Each milestone is described in more detail in this chapter.
Table 1-1 Tape timeline
Year | Manufacturer | Model number | Density or capacity | Advancements
1952 |  |  | 100 characters/inch | First use of plastic tape on a reel
1958 |  |  | 200 characters/inch | First readback drive
1964 |  | 2400 | 800 bpi |
1970 |  | 3400 | 6250 bpi |
1972 |  |  | 20 MB |
1974 | IBM |  |  | Tape cartridge with a single reel. First robotic tape library.
1984 | IBM |  | 200 MB | Thin-film magnetoresistive (MR) head
1984 |  |  | 94 MB | (DLT)
1986 | IBM |  | 400 MB | Hardware data compression IDRC
1987 |  |  | 2.4 GB | First helical scan tape drive
1989 | HP | DDS1 | 2 GB | 4 mm tape
1991 | IBM | 3490E | 800 MB |
1992 | IBM | 3490E | 2.4 GB | IDRC compression
1995 | IBM |  | 10 GB |
1997 | IBM |  |  | Virtual tape
1999 | Exabyte |  | 60 GB |
1999 |  |  | 20 GB |
2000 | Quantum |  | 110 GB | More precise head positioning
2000 | IBM | LTO-1 | 100 GB | First LTO drive
2003 | IBM |  |  | Virtual backhitch
2003 | IBM | LTO-2 | 200 GB |
2005 | IBM | LTO-3 | 400 GB |
2006 | IBM |  |  | integrated into drive
2007 | IBM | LTO-4 | 800 GB |
2008 | IBM |  | 1 TB | GMR heads
2010 | IBM | LTO-5 | 1.5 TB |
2011 | IBM | TS1140 | 4 TB |
2012 | IBM | LTO-6 | 2.5 TB | 4 tape partitions
2014 | IBM | TS1150 | 10 TB |
2015 | IBM | LTO-7 | 6 TB | 32 tracks
1.3.1 Recording technology
The first computer tape systems used linear recording technology. This technology provides excellent data integrity, rapid access to data records, and reasonable storage density. Until the mid-1980s, all computer tape systems employed this linear recording technology, which uses a stationary head writing data in a longitudinal way. (Figure 1-5 on page 9 shows an example of longitudinal technology.)
However, in the mid-1980s, helical tape technology (developed for video applications) became available for computer data storage. This technology uses heads that rotate on a drum and write data at an angle. Helical tape systems found natural applications in backing up magnetic disk systems where their cost advantages substantially outweighed their operational disadvantages. (Figure 1-6 on page 10 shows an example of helical scan technology.)
The first implementation of linear recording technology used magnetic tapes on open reels. Later, the tape was protected inside cartridges with one or two reels. Linear technology drives write each data track along the entire length of the tape. When the end of the tape is reached, the heads are repositioned to record a new track, again along the entire length of the tape, which now travels in the opposite direction. This method continues back and forth until the tape is full. On linear drives, the tape is guided around a static head.
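The back-and-forth pattern described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the "forward"/"reverse" labels are assumptions chosen for this example and do not come from any drive specification.

```python
def serpentine_passes(num_passes):
    """Return (pass_index, direction) pairs for a serpentine linear drive.

    Even-numbered passes run from the beginning of the tape to the end;
    odd-numbered passes run back toward the beginning.
    """
    passes = []
    for index in range(num_passes):
        direction = "forward" if index % 2 == 0 else "reverse"
        passes.append((index, direction))
    return passes


def tape_at_start_after(num_passes):
    """True when the tape is back at its beginning after num_passes passes."""
    return num_passes % 2 == 0


for index, direction in serpentine_passes(4):
    print(index, direction)
```

An even number of passes leaves the tape positioned at its beginning, which is one reason why linear formats are often described in terms of round trips.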
By contrast, on helical scan systems, the tape is wrapped around a rotating drum that contains read/write heads. Because of the more complicated path, mechanical stress is placed on the tape. When contrasted with linear tape systems, helical tape systems have higher density (and, therefore, lower media cost). However, they have lower data transfer rates (because of the smaller number of active read/write heads), less effective access to random data records, increased maintenance requirements, and reduced data integrity.
Linear and helical tape systems advanced substantially over the past decade. Linear systems improved significantly in storage density (and, therefore, cost). They also improved in operational convenience, with various removable cartridge systems, such as 3590, quarter-inch cartridge (QIC), digital linear tape (DLT), and now Linear Tape-Open (LTO), replacing reel-to-reel systems. Helical systems improved in the areas of transfer rate and data integrity with the implementation of both channel and error correction coding technologies.
Over the past few decades, one of the most significant advances in tape technology for computer applications was the maturation of serpentine linear recording systems. For the first time, linear recording systems can provide recording density that is comparable with that of helical systems. The first commercially successful serpentine linear tape system for professional applications was DLT. Another important improvement is the use of servo tracks, which was first introduced by IBM on the Magstar® 3590 tape. Servo tracks are recorded at the time of manufacture. These tracks enable the tape drive to position the read/write head accurately relative to the media while the tape is in motion.
1.3.2 Tape reels
The first data backup device (and the ancestor of magnetic tape devices with a ½-inch-wide tape format) used magnetic tape reels, as shown in Figure 1-2. The reels were manufactured and sold in many different lengths and brands. The most common densities used were 1600 and 6250 bpi. The IBM 2400 was a 9-track 800 bpi tape drive. It was the first 9-track model from IBM that recorded EBCDIC (8-bit) or ASCII (7-bit) data.
Figure 1-2 Tape reels, ½-inch
In the 1970s, the IBM 3420 shown in Figure 1-3 was introduced and supported up to 6250 bpi.
Figure 1-3 IBM 3420
1.3.3 Quarter-inch cartridge
The QIC tape device was first introduced in 1972 by the 3M company as a means to store data from telecommunications and data acquisition applications. As time passed, the comparatively inexpensive QIC tape device became an accepted data storage system, especially for stand-alone PCs.
A QIC tape device (shown in Figure 1-4) looks similar to an audio tape cassette with two reels inside, one with tape and the other for take-up.
Figure 1-4 QIC tape
The QIC format employs a linear (or longitudinal) recording technique in which data is written to parallel tracks that run along the length of the tape. The number of tracks is the principal determinant of capacity.
The QIC uses a linear read/write head similar to the heads found in cassette recorders, as shown in Figure 1-5. The head contains a single write head that is flanked on either side by a read head so that the tape drive can verify data just written when the tape is running in either direction.
Figure 1-5 QIC head diagram
Tandberg Data manufactures QIC drives with its Scalable Linear Recording (SLR) technology. Its most recent drive, the SLR140, provides 70 GB (native) and 140 GB (with a 2:1 compression ratio) capacity on a single data cartridge. The maximum data transfer rates are 6 MBps uncompressed and 12 MBps (with a 2:1 compression ratio).
1.3.4 Digital Data Storage
The Digital Audio Tape (DAT) standard was created in 1987. As its name implies, it was originally conceived as a CD-quality audio format that offered three hours of digital sound on a single tape. The Digital Data Storage (DDS) format is based on DAT and uses a similar technology. The cartridge design is common to both, but different tape formulations have been developed. In 1988, Sony and Hewlett-Packard (HP) defined the DDS standard, which transformed the format into one that can be used for digital data storage.
DAT technology is a 4 mm tape that uses helical scan recording technology (as shown in Figure 1-6). Over the years, the DAT capacity has grown from DDS-1 at 1.3 GB to DAT-320 at 160 GB (native) and 320 GB (with a 2:1 compression ratio). This technology is the same type of recording that is used in videocassette recorders (VCRs) and is inherently slower than the linear type. The tape in a helical scan system is pulled from a two-reel cartridge and wrapped around a cylindrical drum that contains two read heads and two write heads, arranged alternately. The read heads verify the data that is written by the write heads. The cylinder head is tilted slightly in relation to the tape and spins at 2000 revolutions per minute (RPM). The tape moves in the opposite direction to the cylindrical spin, at less than one inch per second. However, because it is recording more than one line at a time, it has an effective speed of 150 inches per second.
Figure 1-6 Helical-scan recording diagram
A directory of files is stored in a partition at the front of the tape. Similar to linear recording, the performance can be greatly improved if more read/write heads are added. However, this change is difficult with helical scan devices because of the design of the rotating head. Because heads can be added only in pairs, it is challenging to fit the wiring inside a single cylinder, which limits the potential performance of helical scan devices. Because of the wide wrap angle of the tape and the consequent degree of physical contact, the head and the media are prone to wear and tear.
1.3.5 The 8 mm format
Designed for the video industry, 8 mm tape technology was created to transfer high-quality color images to tape for storage and retrieval and was adopted by the computer industry. Similar to DAT, but with greater capacities, 8 mm drives also are based on the helical scan technology. One of the earliest was the Exabyte EXB-8200. A drawback to the helical scan system is the complicated tape path. Because the tape must be pulled from a cartridge and wrapped tightly around the spinning read/write cylinder (as shown in Figure 1-7), a great deal of stress is placed on the tape.
Figure 1-7 An 8 mm tape path
Two major protocols use different compression algorithms and drive technologies, but the basic function is the same. Exabyte Corporation sponsors standard 8 mm and VXA formats, while Seagate and Sony represent the 8 mm technology known as Advanced Intelligent Tape (AIT).
Mammoth tape format
The Mammoth tape format is a Small Computer System Interface (SCSI)-based 8 mm tape technology that is designed for open systems applications. It is a proprietary implementation of the 8 mm original format that has been available since 1987 and uses Advanced Metal Evaporative (AME) media. This media has a coating over the recording surface that seals and protects the recording surface.
The Exabyte Mammoth drives have a 5¼-inch form factor. The first generation provided 20 GB (native) and 40 GB (with a 2:1 compression ratio) capacity on a single 8 mm data cartridge. The maximum data transfer rates were 3 MBps uncompressed and 6 MBps (with a 2:1 compression ratio).
With the Mammoth-2 technology, the capacity and data rate increased to 60 GB (150 GB with a 2.5:1 compression ratio) and 12 MBps uncompressed (30 MBps with a 2.5:1 compression ratio). Mammoth-2 drives are read compatible with the previous models.
VXA tape format
The VXA tape format is an 8 mm tape technology that is designed for open systems applications. It is a proprietary implementation of an 8 mm packet format that has been available since 2001 and uses AME media.
The Exabyte VXA drives have a 5¼-inch form factor. The first generation provided 32 GB (native) and 64 GB (with a 2:1 compression ratio) capacity on a single 8 mm data cartridge. The maximum data transfer rates were 3 MBps uncompressed and 6 MBps (with a 2:1 compression ratio).
With the VXA-2 technology, the capacity and data rate increased to 80 GB (160 GB with a 2:1 compression ratio) and 6 MBps uncompressed (12 MBps with a 2:1 compression ratio). VXA-2 drives are read compatible with the previous model.
With the VXA-3 technology, the capacity and data rate increased to 160 GB (320 GB with a 2:1 compression ratio) and 12 MBps uncompressed (24 MBps with a 2:1 compression ratio). VXA-3 drives are read compatible with the previous model.
Advanced Intelligent Tape format
The Advanced Intelligent Tape (AIT) format was developed by Sony. Available in a 3½-inch form factor, Sony AIT-1 drives and media provide 25 GB (native) and 50 GB (with a 2:1 compression ratio) capacity on a single 8 mm data cartridge. The maximum data transfer rate is 3 MBps native.
The AIT-5 format from Sony has a capacity of 400 GB (1040 GB with a 2.6:1 compression ratio) and a data transfer rate of 24 MBps.
AIT drives feature an Auto Tracking Following (ATF) system, which provides a closed-loop, self-adjusting path for tape tracking. This servo tracking system adjusts for tape flutter so that data tracks can be written much closer together for high-density recording. AIT uses the Adaptive Lossless Data Compression (ALDC) algorithm.
Digital Linear Tape
Digital Linear Tape (DLT) drives became available in 1985 when Digital Equipment Corporation needed a backup system for its MicroVAX systems. This system used a square cartridge that contained tape media but no take-up reel. The take-up reel was built into the drive itself. This design eliminated the cartridge space that a second reel typically occupies in cassette and cartridge formats, such as QIC or 8 mm. However, the drive had to be made larger than most drives to accommodate the internal take-up reel. The drive fit into a full-height, 5¼-inch drive bay. Called the TK50, the tape drive was capable of storing 94 MB per cartridge.
By using a ferrite read/write head, the TK50 recorded data in linear blocks along 22 tracks. Its read/write head contained two sets of read/write elements. One set was used when reading and writing forward, and the other set was used when reading and writing backward.
In 1987, Digital Equipment Corporation released the TK70. This tape drive offered 294 MB of storage on the same square tape cartridge, a threefold improvement over the TK50. Digital accomplished this capacity by increasing the number of tracks to 48 and by increasing density on the same ½-inch tape.
In 1989, Digital Equipment Corporation introduced the first true DLT system. The TF85 (later called the DLT 260) incorporated a new feature that enabled the system to pack 2.6 GB onto a 1200-foot tape (CompacTape III, later known as DLTtape III).
The DLT Tape Head Guide Assembly was incorporated for the first time in the TF85 drive. Six precision rollers improved tape life. The six-roller head guide assembly gave the TF85 a much shorter tape path than helical scan systems, as shown in Figure 1-8.
Figure 1-8 DLT tape path mechanism
The read/write head was equipped with another write element so that the elements were arranged in a write/read/write pattern. With this pattern, the TF85 reads after writing on two channels and in forward and reverse directions, as shown in Figure 1-9.
Figure 1-9 DLT 2000 recording head design
Two years later, Digital Equipment Corporation introduced the TZ87, later known as the DLT 2000 tape drive. This system offered 10 GB of native capacity on a single CompacTape III cartridge (Figure 1-10), later known as DLTtape III. It supported 2 MB of read/write data cache memory and offered a data transfer rate of 1.25 MBps. This cartridge was the first generation of DLT.
Figure 1-10 DLT cartridge
In 1994, Quantum acquired the Storage division of Digital Equipment Corporation. In late 1994, Quantum released the DLT 4000. By increasing linear density (bits per inch) from 62,500 to 82,000 and tape length by 600 feet (DLTtape IV), the capacity of the DLT 4000 grew to 20 GB (40 GB compressed) on a single ½-inch DLTtape IV cartridge. The new DLT tape system provided a data transfer rate of 1.5 MBps (3 MBps compressed) and was fully read/write compatible with previous generations of DLT tape drives.
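The capacity gain of the DLT 4000 can be checked with simple arithmetic. The following sketch is a rough estimate under two assumptions that the text does not state explicitly: that capacity scales linearly with both bit density and tape length, and that the number of tracks is unchanged between the two drives. The baseline values (10 GB on a 1200-foot tape at 62,500 bpi) are taken from the DLT 2000 description in this section; the function name is hypothetical.

```python
def scaled_capacity_gb(base_gb, base_bpi, new_bpi, base_length_ft, new_length_ft):
    """Estimate capacity after a change in bit density and tape length.

    Assumes capacity is proportional to density and to tape length,
    with the track count held constant (an assumption, not a specification).
    """
    density_gain = new_bpi / base_bpi
    length_gain = new_length_ft / base_length_ft
    return base_gb * density_gain * length_gain


# DLT 2000 baseline: 10 GB, 62,500 bpi, 1200-foot tape.
# DLT 4000: 82,000 bpi, tape lengthened by 600 feet.
estimate = scaled_capacity_gb(10, 62_500, 82_000, 1200, 1200 + 600)
print(round(estimate, 1))
```

The estimate is approximately 19.7 GB, which is consistent with the quoted 20 GB native capacity.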
DLT 2000 and DLT 4000 drives write data on two channels simultaneously in linear tracks that run the length of the tape, as shown in Figure 1-11.
Figure 1-11 DLT 2000/4000 linear recording format
The DLT 7000 became available in 1996. This drive offered a total storage capacity of 35 GB native and 70 GB compressed on the 1800-foot DLTtape IV cartridge. The DLT 7000 incorporated a 4-channel head that gives the drive a transfer rate of 5 MBps of data in native mode, as shown in Figure 1-12.
Figure 1-12 DLT 7000/8000 tape head
The latest DLT product from Quantum is the DLT 8000 drive. This tape drive features a native transfer rate of up to 6 MBps, with a native capacity of 40 GB. The DLT 7000/8000 drives incorporate the Symmetric Phase Recording technology that writes data in an angled pattern, as shown in Figure 1-13.
Figure 1-13 Symmetric Phase Recording technology
1.3.6 SuperDLT
SuperDLT (SDLT) is a format specification that was developed by Quantum Corporation as an evolution of the DLT standard. It uses Laser Guided Magnetic Recording (LGMR) technology, which includes the Pivoting Optical Servo (POS). This optically assisted servo system is implemented on the otherwise unused reverse side of the media and uses a laser to read the servo guide. Because the servo information is not on the recording surface, SDLT can use 100% of that surface for data recording. SDLT uses Advanced Metal Powder (AMP) media, which contains embedded information for the POS system.
The recording mechanism is made of Magneto-Resistive Cluster (MRC) heads, which are a cluster of small magneto-resistive tape heads.
The first SDLT drive (the SDLT 220) was introduced in late 2000. It provides a capacity of 110 GB (native) and 220 GB (with a 2:1 compression ratio). The native data transfer rate is 11 MBps. This first drive was not backward-read compatible with earlier models. In 2001, Quantum released a version of the SDLT 220 drive that was backward-read compatible with the DLTtape IV cartridge.
The second SDLT drive from Quantum, the SDLT 320, became available in 2002. It increased the native capacity to 160 GB (320 GB with a 2:1 compression ratio) and the native transfer rate to 16 MBps (32 MBps with a 2:1 compression ratio). The SDLT 320 is backward-read compatible with DLTtape IV cartridges and uses Super DLTtape I media.
The SDLT 600 is the third generation of the SDLT product range from Quantum. It provides a capacity of 300 GB (native) and 600 GB (with a 2:1 compression ratio), and the native transfer rate increased to 36 MBps (72 MBps with a 2:1 compression ratio). The SDLT 600 comes with an LVD 160 SCSI interface or with a 2 Gbps Fibre Channel (FC) interface. The SDLT 600 is backward compatible with the SDLT 320 and the DLT VS 160.
The DLT-S4 is the fourth generation of the SDLT product range from Quantum. It provides a capacity of 800 GB (native) and 1.6 TB (with a 2:1 compression ratio), and the native transfer rate increased to 60 MBps (120 MBps with a 2:1 compression ratio). The DLT-S4 comes with an LVD 320 SCSI interface or with a 4 Gbps Fibre Channel (FC) interface. The DLT-S4 can read all Super DLTtape II cartridges that were written by SDLT 600 drives and Super DLTtape I cartridges that were written by SDLT 320 drives.
The media format has the following capacity:
DLT-4 (read/write) 800 GB native capacity
Super DLTtape II (read only) 300 GB native capacity
Super DLTtape I (read only) 160 GB native capacity
1.3.7 IBM 3850
In the late 1960s, IBM engineers in Boulder, Colorado, began developing a low-cost mass storage system that was based on magnetic tape in cartridges. By 1970, the proposed device was code named “Comanche” and was described as an online tape library to provide computer-controlled access to stored information. Numerous marketing studies and design changes were made during the early 1970s, and finally Comanche was announced as the IBM 3850 Mass Storage System (MSS) in October 1974.
The central components of the 3850 were its new data cartridges. The data cartridges were cylinders, two inches in diameter and four inches long, each containing a spool with 770 inches of tape. Cartridges were stored in a two-dimensional array of bins, which were hexagonal rather than square to save space. For the first time, the cartridges were accessed automatically by a robot (accessor), as shown in Figure 1-14.
Figure 1-14 IBM 3850
1.3.8 IBM 3480
The IBM 3480 Magnetic Tape Subsystem (shown in Figure 1-15) was announced on 22 March 1984. It was the second generation of IBM magnetic media and the first to use an enclosed cartridge that contained ½-inch tape. The tape was stored in a now-familiar cartridge, which was smaller, more robust, and easier to handle than tape reels. The cartridge capacity was 200 MB, and the channel data rate was 3 MBps, writing 18 tracks in one direction.
Figure 1-15 IBM 3480
1.3.9 IBM 3490
The IBM 3490 replaced the IBM 3480 tape technology and used the same tape cartridge media. With a tape capacity of 800 MB uncompacted (2.4 GB compressed assuming 3:1 compression ratio) and a channel data rate of 3 MBps, the IBM 3490E increased the capacity of the 3480 four-fold. It used a double-length tape and wrote data in both directions: 18 tracks to the end of tape and 18 tracks back to the start of the tape.
During this second generation, automatic cartridge loaders and automated tape libraries, such as the IBM 3495 and 3494 libraries, were introduced to reduce or eliminate the need for tape operators. Software management applications, such as CA-1, TLMS, and the DFSMS Removable Media Manager (DFSMSrmm), were implemented to manage the tape volumes automatically.
The IBM 3490 and compatible drives were probably the first family of tape products that was mostly used with automatic tape libraries rather than being installed as stand-alone drives operated manually.
The Improved Data Recording Capability (IDRC), which compacts the data, reduced the number of tape volumes that were used.
Magnetic disks were widely used for online data. Therefore, these second-generation tape systems became primarily a medium for backup and were introduced as an archive medium. The process of archiving was also automated with products, such as Hierarchical Storage Manager (HSM) and DFSMShsm (a component of DFSMS), by using tape as the lowest level in a storage hierarchy. Tape was still used as an interchange medium, but networks were also used for that purpose.
1.3.10 IBM 3590
The IBM 3590 drive was originally called the IBM Magstar 3590. The IBM Magstar tape technology was first introduced in July 1995. The original cartridge maintained the external form factor of the IBM 3490 (as shown in Figure 1-16), had a capacity of 10 GB uncompacted (30 GB compressed), and a data rate of 9 MBps. Later drive models and newer media increased these figures. The data format was incompatible with the IBM 3490.
Figure 1-16 IBM 3590 tape cartridge
The IBM 3590 drive (as shown in Figure 1-17) used a longitudinal recording technology called Serpentine Interleaved Longitudinal Recording.
Figure 1-17 IBM 3590 tape drive
Data was written in each direction in turn. To increase capacity further, the concept of head indexing was introduced, which wrote multiple sets of tracks in parallel. The entire set of heads was slightly shifted after one pass, and all subsequent passes (for a total of eight) were used to write data tracks next to the existing ones. This method meant a significant improvement in the tape capacity and transfer rates without changing the tape speed (2 m/s) and media length (600 m). The IBM 3590 drive used a buffer and compressed the data before it wrote the data to tape. In addition, the drive completed a stop-start cycle in approximately 100 ms. The performance was improved for both start-stop and streaming applications.
With the IBM 3590 Model H and the Extended Length Cartridges, the capacity increased to 60 GB (180 GB assuming a 3:1 compression ratio), and the native data rate increased to 14 MBps. Both drives were made available in 2002 and can read cartridges that were written by the base models.
This design incorporated innovations such as servo tracks on the tape to guide the read/write heads along the data tracks and the implementation of an improved error correcting code (ECC). A portion of the tape within each cartridge was reserved for statistical information. This portion was continually updated after each read or write. It provided statistics that you can use to obtain drive and media information and identify problems with a particular tape or drive as early as possible.
Technology
The IBM 3590 provided high capacity, performance, reliability, and a wide range of host connectivity. This technology used a fourth-generation magnetoresistive (MR) head, a 16 MB buffer, predictive failure analysis, and state-of-the-art electronic packaging.
While reading or writing 16 tracks at a time, the IBM 3590 models used serpentine, interleaved, longitudinal recording technology for a total of four, eight, or twelve round trips from the physical beginning to the physical end of the tape and back again. The tape read/write head indexed, or moved vertically, when it completed each round trip so that the recorded tracks were interleaved across the width of the tape.
Figure 1-18 shows the recording element of the IBM Enterprise 3590 tape drives. It also shows how the read/write heads moved over the width of the tape medium.
Figure 1-18 IBM 3590 recording
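The round-trip scheme described above implies a total track count for each configuration. The following sketch computes it under the assumption that every pass in either direction records a new set of 16 tracks, so one round trip contributes two passes. The track totals are derived here for illustration and are not quoted from the text.

```python
TRACKS_PER_PASS = 16  # tracks read or written at a time, as stated in the text


def total_tracks(round_trips, tracks_per_pass=TRACKS_PER_PASS):
    """Total data tracks across the tape width for a number of round trips.

    Assumes one outbound pass and one return pass per round trip,
    each recording a new set of tracks.
    """
    passes = 2 * round_trips
    return passes * tracks_per_pass


for trips in (4, 8, 12):
    print(trips, "round trips:", total_tracks(trips), "tracks")
```

Under these assumptions, four, eight, and twelve round trips correspond to 128, 256, and 384 tracks.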
The 3590 tape drives used a metal particle medium in the tape cartridge that stored 10, 20, 30, 40, or 60 GB of uncompacted data, depending on the cartridge type and the drive model. The integrated control unit used a compaction algorithm that increased the storage capacity of these cartridges. Assuming a compression ratio of three to one (3:1), the cartridge capacity increased to 60 GB on E models and to 90 GB on H models.
The 3590E and 3590H models have a 14 MBps device data rate, and the 3590B models have a 9 MBps device data rate. With data compression, the 3590 tape drive can more effectively use the full capability of the Ultra-SCSI data rate, the IBM Enterprise Systems Connection (ESCON) data rate, or the IBM Fibre Connection (FICON®) data rate. The Ultra Wide SCSI data rate is up to 40 MB per second and the Fibre Channel data rate is up to 100 MB per second.
Metal particle media
A chromium dioxide medium was used in the IBM 3480 and 3490 cartridges. The IBM 3590 High Performance Tape Cartridge used a metal particle medium, which has a significantly increased coercivity. Therefore, it permits a much higher data recording density in comparison with chromium dioxide media as the linear density is proportional to the coercivity of the medium. The linear density of the IBM 3590 tape is approximately three times that of the IBM 3480 and 3490. The track density is also improved approximately four-fold. Advances in the metal particle coatings and media binders afford reliability and magnetic stability equal or superior to chrome media.
1.3.11 LTO Ultrium tape
The LTO standard was released as a joint initiative of IBM, Hewlett-Packard, and Seagate Technology. As a result of this initiative, two LTO formats (Ultrium and Accelis) were defined. However, for performance reasons, there was no demand for the Accelis format of the LTO tape, and neither drives nor media were commercially produced.
The consortium now consists of IBM, Hewlett-Packard, and Quantum, which are known as the technology provider companies. The technology specifications are available at this LTO website:
The LTO Ultrium 7 technology is the current generation of LTO Ultrium tape. It provides 6 terabytes (TB) of native physical capacity (15 TB compressed) per cartridge and native data transfer rate of up to 300 MBps (750 MBps by assuming a 2.5:1 compression ratio).
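The relationship between the native and compressed figures is a simple multiplication by the assumed compression ratio. The following sketch reproduces the LTO Ultrium 7 numbers and also estimates how long it takes to fill a cartridge at the native rate. The function names are hypothetical, and the fill-time calculation assumes decimal units (1 TB = 1,000,000 MB) and a sustained data rate, which real workloads rarely achieve.

```python
MB_PER_TB = 1_000_000  # assumption: decimal units


def with_compression(native_value, ratio):
    """Scale a native capacity or data rate by an assumed compression ratio."""
    return native_value * ratio


def hours_to_fill(capacity_tb, rate_mbps):
    """Hours needed to stream a full cartridge at a sustained data rate."""
    seconds = capacity_tb * MB_PER_TB / rate_mbps
    return seconds / 3600


print(with_compression(6, 2.5))         # compressed capacity in TB
print(with_compression(300, 2.5))       # compressed data rate in MBps
print(round(hours_to_fill(6, 300), 2))  # hours to fill at the native rate
```

With a 2.5:1 ratio, 6 TB becomes 15 TB and 300 MBps becomes 750 MBps. Because capacity and data rate scale by the same ratio, the time to fill a cartridge (about 5.56 hours under these assumptions) is the same for native and compressed data.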
The previous LTO Ultrium generations provided the following native capacities and transfer rates:
LTO Ultrium generation 6 provided a native capacity of 2.5 TB with a native transfer rate up to 160 MBps
LTO Ultrium generation 5 provided a native capacity of 1.5 TB with a native transfer rate up to 140 MBps
LTO Ultrium generation 4 provided a native capacity of 800 GB with a native transfer rate up to 120 MBps
LTO Ultrium generation 3 provided a native capacity of 400 GB with a native transfer rate up to 80 MBps
LTO Ultrium generation 2 provided a native capacity of 200 GB with a native transfer rate up to 40 MBps
LTO Ultrium generation 1 provided a native capacity of 100 GB with a native transfer rate of up to 20 MBps
Each LTO Ultrium generation has approximately doubled the compressed media storage capacity and increased the data transfer rate. Further, each LTO Ultrium drive generation can read and write media of the prior generation and can read media of the two prior generations.
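The compatibility rule for the generations that are described in this section can be expressed as two small predicates. This is a sketch of the rule as stated in the text for generations 1 through 7. The function names are hypothetical, and the rule is not assumed to hold for generations that are not covered here.

```python
def can_read(drive_gen, media_gen):
    """A drive reads its own generation and the two prior media generations."""
    return 1 <= media_gen <= drive_gen and drive_gen - media_gen <= 2


def can_write(drive_gen, media_gen):
    """A drive writes its own generation and the one prior media generation."""
    return 1 <= media_gen <= drive_gen and drive_gen - media_gen <= 1


# Example: an LTO Ultrium 7 drive with older cartridges.
for media in (7, 6, 5, 4):
    print(media, "read:", can_read(7, media), "write:", can_write(7, media))
```

For an LTO Ultrium 7 drive, the sketch reports read and write access for generation 7 and 6 media, read-only access for generation 5 media, and no access for generation 4 media.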
For more information about the LTO Ultrium tape format specification, see 2.1.2, “LTO standards” on page 47. For more information about the IBM LTO Ultrium tape drive, see 2.1.5, “IBM LTO Ultrium common subassembly drive” on page 57.
LTO WORM cartridges
The IBM Ultrium Write Once Read Many (WORM) cartridges (Machine Type 3589) were designed for applications, such as archiving and data retention, and for applications that require an audit trail. The IBM Ultrium WORM cartridges work with the IBM LTO Ultrium tape drive to prevent the alteration or deletion of user data. In addition, IBM has taken the following steps to reduce the possibility of tampering with the information:
The bottom of the cartridge is molded in a different color than rewritable cartridges
The special cartridge memory helps protect the WORM nature of the media
A unique format is factory-written on each WORM cartridge
The IBM LTO Ultrium 7 WORM format, based on LTO specifications, provides a tape cartridge capacity of up to 6 TB native physical capacity (15 TB with a 2.5:1 compression ratio). The 6 TB WORM cartridge can be used only in the IBM Ultrium 7 tape drive. Additionally, the IBM LTO Ultrium 7 tape drive can read and write data in the previous LTO Ultrium 6 2.5 TB WORM format and can read data from the LTO Ultrium 5 1.5 TB WORM format.
1.3.12 IBM TS1100 tape drive family
The IBM TS1100 tape drive family offers a design that is focused on high capacity, performance, and high reliability for storing mission-critical data. Introduced in October 2003, the 3592-J1A tape drive had 300 GB of native capacity in a ½-inch format tape cartridge. It was also a foundation for future generations of this new tape drive family based on the concept of media reuse. This design helps protect the client’s investment in tape cartridges.
In October 2005, the second generation of the 3592 drive, the IBM TS1120 tape drive Model E05, was introduced. The IBM 3592-E05 has the same physical measurements as the 3592-J1A tape drive, but the native capacity of one cartridge increased by a factor of about 1.67, from 300 GB to 500 GB. It has a 4 Gbps Fibre Channel attachment and a native data rate of up to 100 MBps.
The capacity characteristics of the third generation of 3592 tape drives increased again. The TS1130 Model E06 tape drive was the third generation of the 3592 family, achieving the unprecedented capacity of 1 TB of uncompressed data.
The fourth generation of 3592, the IBM TS1140 Model E07, again took tape capacity to a new level. The TS1140 can store 1.6 TB of uncompressed data on the JB cartridge type and 4 TB of uncompressed data on the advanced JC cartridge type.
With the fifth generation of 3592, the IBM TS1150 Model E08, IBM has taken tape capacity to a new level yet again. The TS1150 can store 7 TB of uncompressed data on the existing JC cartridge types, and an unprecedented 10 TB of uncompressed data on the new advanced JD cartridge types, with improved levels of performance.
Figure 1-19 shows the IBM TS1150 tape drive, 3592 Model E08.
Figure 1-19 IBM TS1150 tape drive 3592 Model E08
For more information about the IBM 3592 J1A tape drive, the TS1120 tape drive, the TS1130 tape drive, TS1140 tape drive, and the TS1150 tape drive, see Chapter 3, “IBM TS1100 tape drives” on page 105.
The TS1150 tape drive maintains the same features and technology enhancements that were introduced with the TS1120 and extended by the TS1130 and TS1140. The TS1150 also offers several enhancements over the predecessor models. These are explained next.
TS1100 family key features
The TS1150 has the following key features, including features that were introduced with the 3592 J1A, TS1120, TS1130, and TS1140:
Digital speed matching
Channel calibration
High-resolution tape directory
Recursive accumulating backhitchless flush or non-volatile caching
Backhitchless backspace
Streaming Lossless Data Compression (SLDC) algorithm
Capacity scaling
Single field replaceable unit (FRU)
Error detection and reporting
Statistical Analysis Recording System (SARS) algorithm with extended mount count
Revised encryption support
Dual-stage 32-head actuator
Offboard data string searching
Enhanced logic to report logical end of tape
Added partitioning support
End-to-end logical block protection support
Data safe mode
Enhanced Ethernet support
Enhanced Barium Ferrite (BaFe) particle media types
8 Gbps Fibre Channel (FC) dual port interface
Enhanced read-ahead buffer management
High access performance for locate/search
SkipSync and FastSync write performance accelerators
32-channel enhanced ECC recording format
Performance improvement
The performance on the TS1150 is increased by the following improvements:
Improved data rate and capacity
Improved latency by reducing access time to data
Improved data compression
Beginning of Partition (BOP) caching
Humidity sensor support
Increased Cartridge Memory size and related functions
Improved High Resolution Tape Directory (HRTD)
Larger main data buffer
Extended copy support
Higher data rates and capacity
The following format data rates are available for the TS1150:
Using the E08 format, maximum data rates increase to 360 MBps native and to 700 MBps compressed
Using the E07 format, maximum data rates increase to 250 MBps native and to 650 MBps compressed
Table 1-2 summarizes the capacity and performance characteristics for uncompressed data.
Table 1-2 Native capacity and performance summary
Media type   E08 format capacity   E08 format data rate (minimum–maximum)   E07 format capacity   E07 format data rate (minimum–maximum)
JC, JY       7 TB                  80 MBps - 300 MBps                       4 TB                  50 MBps - 250 MBps
JD, JZ       10 TB                 90 MBps - 360 MBps                       N/A                   N/A
JK           900 GB                60 MBps - 250 MBps                       500 GB                50 MBps - 250 MBps
JL           2 TB                  60 MBps - 360 MBps                       N/A                   N/A
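As a rough illustration of how the capacity and data rate figures in Table 1-2 relate, the following Python sketch estimates the shortest possible time to fill a cartridge in the E08 format. It assumes decimal units (1 TB = 10^12 bytes, 1 MBps = 10^6 bytes per second), uncompressed data, and uninterrupted streaming at the maximum native rate. Real workloads are slower. The dictionary and function names are hypothetical.

```python
TB = 10**12
GB = 10**9
MB = 10**6

# (native capacity in bytes, maximum native data rate in bytes per second)
# for the E08 format, taken from Table 1-2.
E08_FORMAT = {
    "JC/JY": (7 * TB, 300 * MB),
    "JD/JZ": (10 * TB, 360 * MB),
    "JK": (900 * GB, 250 * MB),
    "JL": (2 * TB, 360 * MB),
}


def min_fill_hours(media_type: str) -> float:
    """Lower bound on the hours needed to write a full cartridge."""
    capacity_bytes, max_rate = E08_FORMAT[media_type]
    return capacity_bytes / max_rate / 3600


if __name__ == "__main__":
    for media_type in E08_FORMAT:
        print(f"{media_type}: at least {min_fill_hours(media_type):.1f} hours")
```

For example, a 10 TB JD cartridge that is written at 360 MBps needs at least about 7.7 hours.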
1.3.13 Libraries
System administrators are clamoring for technologies that help them to efficiently and economically manage the explosive growth in stored data. As the amount of data increases, the backup process takes longer and longer. The solution to this problem is to use a device that integrates the tape drive with a level of automation. The challenge is to choose the right solution in terms of size and automation level.
System administrators industry-wide recognized the need to automate the backup-and-restore process to the extent that it requires little or no human intervention. This method has become known as lights-out backup. This process can be done off-shift or concurrently with other applications during normal operations. Multi-drive tape libraries are the only available technology to offer reliability and low cost to make lights-out backup practical.
The hardware options for automation are autoloaders and a range of multi-drive automated tape libraries.
Autoloaders
Autoloaders have one tape drive. Clients typically use autoloaders to access a few tapes once a day. Most autoloaders are designed for purely sequential operations. These units place little emphasis on performance.
Automated tape libraries
Automated tape libraries have one or more tape drives, but clients typically use them with at least two tape drives. All tape cartridges are accessible to all drives, thereby making concurrent reading and writing operations possible.
You can increase throughput by adding more drives and HBAs. With automation eliminating the manual intervention to load tapes, file-restore response times are substantially improved. Tape libraries are mandatory for lights-out operations and other higher performance tape storage applications. Tape libraries also offer the security of knowing that other drives are available if one fails.
Multi-drive automated tape libraries and ultra-scalable tape libraries, combined with storage management software, including concurrent backup, archive, and hierarchical storage management (HSM), offer the most robust solution for managing and protecting huge amounts of corporate data. Automated tape libraries allow random access to large numbers of tape cartridges and the concurrent use of two or more drives, rather than manually loading one tape after another or by using a single-drive sequential autoloader.
Enterprise tape libraries
Enterprise tape libraries are automated tape libraries that provide enhanced levels of automation, scalability, reliability, availability, and serviceability. They typically have the capacity to house dozens of drives and hundreds of tapes. Equipped with high-performance robotic mechanisms, bar code scanners, and support for cartridge I/O ports, these libraries often offer redundant components and a high degree of flexibility through a modular design. Certain models add support for multiple SCSI, Fibre Channel-Arbitrated Loop (FC-AL), Fibre Channel Protocol (FCP), and ESCON or FICON connections to allow connection to more than one host platform.
The top-of-the-line of the enterprise tape library products, such as the IBM TS3500 tape library and the newer IBM TS4500 tape library, can be shared between two or more heterogeneous host systems. All of the hosts have access to the control functions of the tape library robotics. The library is shared in a physical way, with each system operating as though it really owns the entire library.
TS3500 tape library shuttle complex
The TS3500 tape library shuttle complex enables extreme scalability of over 300,000 LTO cartridges (or over 2 EiB of uncompressed TS1150 data) in a single library image by supporting transport of cartridges from one TS3500 tape library string to another TS3500 tape library string. Application software that supports this capability can move tape cartridges directly from their home logical library to the destination logical library. Shuttle connections span high-density TS3500 S24 or S54 frames from different TS3500 tape library strings. The TS3500 tape library shuttle complex supports new and existing TS3500 tape library installations and is particularly well-suited for High Performance Storage System (HPSS) environments. The recent introduction of IBM Tape System Library Manager (TSLM) now gives users an alternative means of managing the shuttle complex.
 
Important: At the time of writing, HPSS and TSLM are the only library management solutions that support the TS3500 tape library shuttle complex.
To meet the needs of large data center archives that must store increasing amounts of data, the TS3500 tape library offers shuttle technology that enables flexible library growth on a z-axis. This growth flexibility, enabled by shuttle connections between HD libraries, allows a higher maximum capacity for a single library image of multiple TS3500 tape library strings. This flexibility also accommodates constrained data center layouts that do not have room to expand on the x-axis, and data centers with large archives that exceed the maximum cartridge count of an individual TS3500 tape library string.
Drive sharing across library resources
As shown in Figure 1-20, the TS3500 tape library shuttle complex can move tapes from one library string to another while bypassing the intermediate library strings, unlike a traditional pass-through method. The TS3500 tape library transports tape cartridges in shuttle cars that pass over the libraries. This method of transporting cartridges is called direct flight. With the direct flight capability, if no drive is available in the home logical library, the cartridge is moved across a shuttle connection to a logical library with an available drive. This configuration of interconnected parallel library strings is called a shuttle complex.
Figure 1-20 TS3500 tape library shuttle complex
The following components of a shuttle complex are shown in Figure 1-20:
1. Shuttle stations
The shuttle station mounts on top of an HD frame. It consists of a base pad and a shuttle slot. The shuttle slot docks into the base pad. When the shuttle slot is all the way down into the frame station, it can accept or deliver a cartridge. Each shuttle station has its own import/export element (IEE) address.
2. Shuttle span
One or more shuttle spans are linked together to form a shuttle connection between HD frames in parallel library strings. Shorter shuttle spans support distances between library strings ranging from 762 mm (30 inch) to 1524 mm (60 inch). Longer shuttle spans support distances between library strings ranging from 1524 mm (60 inch) to 2743.2 mm (108 inch).
3. Shuttle connection
A shuttle connection consists of one shuttle car, two or more shuttle stations, and one or more spans between these shuttle stations. Each shuttle connection supports one shuttle car.
For more information about the TS3500 tape library shuttle complex, see 11.1.3, “TS3500 tape library storage-only frames S24 and S54” on page 304.
HPSS overview
High Performance Storage System (HPSS) is cluster-based software that provides for overall management and access of many petabytes of data. HPSS is capable of concurrently accessing hundreds of disk arrays and tape drives for extremely high aggregate data transfer rates, thus enabling HPSS to easily meet otherwise unachievable demands of total storage capacity, file sizes, data rates, and number of objects stored.
HPSS has been used successfully for digital image libraries, scientific data repositories, university mass storage systems, weather forecasting systems, and defense and national security applications.
A High Performance Computing (HPC) system needs a high performance storage system, and that is what HPSS offers. HPSS is installed in some of the greatest HPC systems worldwide, including ORNL Titan, LLNL Sequoia, RIKEN K-Computer, ANL Mira, CEA Curie Thin Nodes, and NCAR Yellowstone.
HPSS is the result of the collaboration of the following institutions:
IBM Global Services in Houston, Texas
Lawrence Berkeley National Laboratory
Lawrence Livermore National Laboratory
Los Alamos National Laboratory
Oak Ridge National Laboratory
Sandia National Laboratories
The collaborative development is important because the developers are users of the technology.
HPSS can provide an extremely scalable repository for content management software systems, including the Integrated Rule-Oriented Data System (iRODS) and IBM FileNet®. HPSS can be used alone, with its own interfaces, or it can be used to provide space management and disaster recovery backup for IBM Spectrum Scale.
For more information about HPSS, see 11.1.6, “High Performance Storage System” on page 320 or the IBM System Storage Solutions Handbook, SG24-5250.
The IBM TS4500 tape library
The IBM TS4500 is a highly scalable, stand-alone tape library that provides high-density tape storage and high-performance, automated tape handling for open systems environments.
Figure 1-21 on page 27 shows a three-frame version of the TS4500 tape library. An individual library consists of one base frame and up to 17 expansion frames and can include up to 128 tape drives and more than 23,000 tape cartridges.
Figure 1-21 TS4500 introduction
The TS4500 tape library provides the following capabilities:
All frames include high-density (HD) slot technology
Additional frame models can be placed in any active position so that the library can grow from both the right side and the left side of the first L frame
Integrated management console (IMC)
New user interface for improved usability
Updated control system
Input/output (I/O) magazine to allow individual cartridge handling to be performed independent of the library
Top-rack space to house extra tape solution components within the library footprint
Support for HD2-compatible models of the TS1150 (3592 EH8), TS1140 (3592 EH7), LTO-7 (3588 F7C), LTO-6 (3588 F6C), and LTO-5 (3588 F5C) tape drives
TS4500 support for S54 and S24 frames from TS3500 (requires feature code 1742)
The TS4500 tape library is available with several tape drive, frame model, and feature options to meet your specific needs. Additional features of the TS4500 tape library are highlighted in the following list:
Advanced Library Management System (ALMS)
Ability to attach multiple simultaneous heterogeneous servers
Remote management with the TS4500 management GUI or the TS4500 command-line interface (CLI)
Remote monitoring using Simple Network Management Protocol (SNMP), email, or syslog
Multipath architecture
Drive and media exception reporting
In-depth reporting using the Tape System Reporter (TSR)
Host-based path failover
Up to 288 I/O slots (36 I/O slots standard for LTO libraries and 32 I/O slots standard for 3592 libraries)
For more information about TS4500, see the IBM Tivoli Storage Productivity Center Beyond the Basics, SG24-8236.
1.4 Tape solutions in a SAN environment
Connectivity to tape is essential for most backup processes. However, manual tape operations and tape handling are expensive. Studies show that automation of tape processing saves money and increases reliability. Enterprises have long had to use staff to remove these tapes, transport them to a storage site, and then return them to the tape drive for mounting when necessary. Client tape planning initiatives are directed at more efficient use of drives and libraries and at minimization of manual labor that is associated with tape processing.
The biggest challenges with SCSI tape implementations are the limited cable length and the limited ability to share drives among several systems. For Low Voltage Differential (LVD) SCSI, the total cable length is limited to 25 m (82 ft.) with point-to-point interconnection (one host connected to only one tape drive). With multidrop interconnection (one host connected to more than one tape drive on the same SCSI bus), the total cable length is limited to 12 m (39.4 ft.) for LVD SCSI and 25 m (82 ft.) for High Voltage Differential (HVD) SCSI. Most SCSI tape drives have only one SCSI port and, therefore, can be attached to only one SCSI bus. This limitation severely restricts the number of hosts that can physically use the drive without recabling.
SANs enable greater connectivity of the tape libraries and tape drives and enable tape sharing. With Fibre Channel, the distance between the server (or data point) and the connected tape node can be up to 10 km. Fibre Channel enables multiple host scenarios without recabling.
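The distance limits that are quoted above can be captured in a small lookup, which is sketched here in Python. The table holds only the three SCSI combinations that the text states, so any other combination is rejected instead of guessed. The names are hypothetical, and the function is a planning aid under these stated figures, not a substitute for the interface specifications.

```python
# Maximum total cable length in meters, keyed by (signaling, topology).
# Only the combinations that the text states are listed.
SCSI_MAX_CABLE_M = {
    ("LVD", "point-to-point"): 25.0,
    ("LVD", "multidrop"): 12.0,
    ("HVD", "multidrop"): 25.0,
}

# Fibre Channel distance between server and tape node quoted in the text.
FIBRE_CHANNEL_MAX_M = 10_000.0


def scsi_cable_ok(signaling: str, topology: str, total_length_m: float) -> bool:
    """Return True if the total SCSI cable length is within the quoted limit."""
    try:
        limit = SCSI_MAX_CABLE_M[(signaling, topology)]
    except KeyError:
        raise ValueError(
            f"no limit listed for {signaling} {topology}") from None
    return total_length_m <= limit


if __name__ == "__main__":
    print(scsi_cable_ok("LVD", "point-to-point", 15.0))  # within 25 m
    print(scsi_cable_ok("LVD", "multidrop", 15.0))       # exceeds 12 m
    print(15.0 <= FIBRE_CHANNEL_MAX_M)                   # within 10 km
```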
If software to manage tape-drive sharing is unavailable, you must isolate (or zone) the drives to unique hosts by using functions that are commonly available on SAN gateways or switches. With the correct management software, each drive can communicate with each host, and connections can be dynamic without recabling.
Backup solutions can use SAN technology in several ways to reduce the costs of their implementation while increasing their performance.
Sharing tape devices in a SAN environment
The tape world has the following distinct means of sharing:
Library sharing
Drive sharing
Media sharing
Library sharing
Library sharing occurs when multiple servers that are attached to a tape library share the library and the robotics. Tape drives within the library might or might not be shared (pooled) among the attached servers. Tape library sharing is a prerequisite for tape-drive sharing.
Drive sharing
The sharing of one or more tape drives among multiple servers is called drive sharing. To share drives between heterogeneous applications within a tape library, the tape library must provide multiple paths to the robotics and be able to define the drives and slots of a library as multiple logical libraries. The server that is attached to each logical library has no knowledge of any drives or slots outside the logical library.
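The idea of defining the drives and slots of one physical library as multiple logical libraries can be sketched as a simple data model. The following Python example is illustrative only. The class and method names are hypothetical and do not represent the interface of any library firmware or management software.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalLibrary:
    name: str
    drives: set = field(default_factory=set)
    slots: set = field(default_factory=set)


class PhysicalLibrary:
    """One physical library whose drives and slots are partitioned."""

    def __init__(self, drives, slots):
        self.unassigned_drives = set(drives)
        self.unassigned_slots = set(slots)
        self.logical = {}

    def define_logical_library(self, name, drives, slots):
        drives, slots = set(drives), set(slots)
        # Each drive and slot can belong to only one logical library.
        if not drives <= self.unassigned_drives:
            raise ValueError("drive unknown or already assigned")
        if not slots <= self.unassigned_slots:
            raise ValueError("slot unknown or already assigned")
        self.unassigned_drives -= drives
        self.unassigned_slots -= slots
        self.logical[name] = LogicalLibrary(name, drives, slots)
        return self.logical[name]

    def visible_to(self, name):
        """Resources that a server attached to this logical library sees."""
        lib = self.logical[name]
        return lib.drives, lib.slots


if __name__ == "__main__":
    library = PhysicalLibrary(["d1", "d2", "d3", "d4"], range(1, 11))
    library.define_logical_library("backup_app_a", ["d1", "d2"], range(1, 6))
    library.define_logical_library("backup_app_b", ["d3", "d4"], range(6, 11))
    print(library.visible_to("backup_app_a"))
```

A server that is attached to `backup_app_a` sees only drives `d1` and `d2` and slots 1 through 5, which mirrors the statement that it has no knowledge of drives or slots outside its logical library.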
Media sharing
Media sharing today is possible only in a homogeneous environment between servers that use the same backup server and the same library to back up their data. For systems that are not backed up by the same backup server, it is possible to share only a tape scratch pool.
1.5 Tape virtualization for open systems
A virtual tape library (VTL) is a unique blend of several storage tiers. The lifecycle of data starts with its creation at the server tier; backup software then migrates the data to a VTL tier. You now have many options for the data because the VTL is a combination of high-performance SAN-attached disk and high-performance servers running Linux that emulate a tape storage device.
With disk prices falling and the need to recover data quickly increasing, more organizations started adopting disk as a viable alternative to slower, more resource-intensive magnetic tape in their backup and recovery environments. Any business that has a need for high performance backup and restore might need virtualization solutions. Only the disk storage tier can provide instant access backup/restore capabilities. All backup software products were designed to back up data to tape drives and libraries, and a VTL emulates tape. You can go directly to a VTL and have your backups completed quickly.
VTLs fill a void in the backup infrastructure for data that needs to be restored at a certain moment. Most restores happen within six weeks of the data being backed up. Backup software can be configured to back up data to a VTL and then create a tape-to-tape virtual copy for off-site deployment. It is no longer necessary to call the tapes back from an off-site location, unless data is required from years past. However, to meet the requirements of the enterprise data center, a VTL must do more than simply reduce storage requirements. To be viable, it must maintain high sustained throughput performance and meet the availability demands of the enterprise. The TS7650G data deduplication gateway from IBM is an example of a deduplicating VTL that meets the demands of the enterprise data center. It is the first solution that supports a cluster of two nodes while sustaining high performance, inline data deduplication.
VTLs are used in the same way that physical tape devices are used. The VTL is inserted as a staging area between the first-line disk devices and the tape libraries. From then on, it appears to the backup and recovery software to be a physical tape library with physical drives. This placement changes the physical paradigm for backups from disk-to-tape to disk-to-disk-to-tape (D2D2T), a shift that has interesting consequences.
All VTLs perform the same basic functions, serving as a disk-based target for the backup process, and offering rapid recoveries from local disk-based storage. However, not all VTLs are equal. Users of first-generation VTLs, which use a disk-to-disk (D2D) backup model, found that as their data protection needs increased, they ran out of disk capacity quickly. Additionally, many VTLs provide value through additional features, such as encryption, compression, and data deduplication.
The use of virtual tapes for backup reduces failure rates, allows for smaller backup windows, and significantly speeds up recovery. The VTL solution offers the following benefits:
Increased restore speed
Reduced cost of online storage
Improved reliability of the storage environment
Assured data security
Reduced time spent managing backups
1.5.1 ProtecTIER virtual tape
The IBM ProtecTIER virtual tape service emulates traditional tape libraries. By using this tool, you can move to disk backup without having to replace your entire backup environment. Your existing backup application can access virtual robots to move virtual cartridges between virtual slots and virtual drives. The backup application perceives that the data is being stored on cartridges while ProtecTIER stores data on a deduplicated disk repository on the storage fabric. Figure 1-22 shows the concept of virtual tape.
Figure 1-22 ProtecTIER virtual tape concept
In addition to tape virtualization, ProtecTIER provides the following main features:
Data deduplication
ProtecTIER native replication
Each of these features brings its own set of options to customize to your environment. The ProtecTIER software is configured by and ordered from IBM or its business partners. The order includes the ProtecTIER and the ProtecTIER Manager GUI application.
1.5.2 Data deduplication
While VTLs are now a fundamental element of the data center infrastructure, the growth of data under management is outpacing the ability of most firms to add disk capacity to their VTLs. This phenomenon has led to a strong demand for data deduplication. Data deduplication is a technology that finds and eliminates redundant data within a disk repository. The effect is a dramatic increase in the usable capacity of a given disk pool. By combining this powerful technology with VTLs, firms can store and retain far more data than they were previously able to, while saving significantly on disk storage costs.
Data deduplication refers to the elimination of redundant data. In the data deduplication process, duplicate data is deleted, leaving only one copy of the data to be stored. However, indexing of all data is still retained if that data is ever required. Data deduplication can reduce the required storage capacity because only unique data is stored. This task is achieved by storing a single copy of data that is backed up repetitively. Data deduplication can provide greater data reduction than previous technologies, such as Lempel-Ziv (LZ) compression and differencing, which is used for differential backups.
By reducing the amount of data under management, data centers reduce the amount of hardware that is required, thus decreasing operational expenses for power and cooling. Data centers also reduce tape media costs and make it easier for storage administrators to cope with drastic increases in raw data and increased complexity in the rest of their storage infrastructures. Figure 1-23 shows the basic components of data deduplication for the IBM TS7650 ProtecTIER server.
Figure 1-23 Data deduplication concept
With data deduplication, data is read by the data deduplication system, which breaks the data into elements or chunks. The data deduplication process creates a signature or identifier for each data element. Whether inline or post processing data deduplication is used, data element signature values are compared to identify duplicate data. After the duplicate data is identified, one copy of each element is retained, pointers are created for the duplicate items, and the duplicate items are not stored.
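The chunk, signature, and pointer steps that are described above can be sketched in Python. This example is a generic, hash-based illustration with fixed-size chunks and SHA-256 signatures. It is not the algorithm of any IBM product, and the class name, the chunk size, and the choice of hash are assumptions that are made for the example.

```python
import hashlib


class DedupStore:
    """Minimal inline deduplication store with fixed-size chunks."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size  # assumed value, chosen for illustration
        self.chunks = {}              # signature -> the single stored copy

    def write(self, data: bytes) -> list:
        """Store data and return its recipe (an ordered list of signatures)."""
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            signature = hashlib.sha256(chunk).hexdigest()
            # Keep the chunk only if this signature has not been seen before.
            self.chunks.setdefault(signature, chunk)
            # The recipe entry acts as the pointer to the stored chunk.
            recipe.append(signature)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Rebuild the original data by following the pointers."""
        return b"".join(self.chunks[signature] for signature in recipe)

    def stored_bytes(self) -> int:
        return sum(len(chunk) for chunk in self.chunks.values())


if __name__ == "__main__":
    store = DedupStore(chunk_size=4)
    data = b"AAAABBBBAAAA"
    recipe = store.write(data)
    print(len(data), store.stored_bytes(), store.read(recipe) == data)
```

In the usage example, 12 bytes are written but only two unique 4-byte chunks are stored, and the data can still be rebuilt in full from the recipe.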
The effectiveness of data deduplication depends upon many variables, including the rate of data change, the number of backups, and the data retention period. For example, if you back up the same uncompressible data once a week for six months, you save the first copy and do not save the next 24, which provides a 25:1 data deduplication ratio.
If you back up an uncompressible file in week one, back up the exact same file again in week two and never back it up again, you have a 2:1 data deduplication ratio. A more likely scenario is that some portion of your data changes from backup to backup so that your data deduplication ratio changes over time. For example, you take weekly full and daily differential incremental backups. Your data change rate for the full backups is 15%, and your daily incrementals are 30%. After 30 days, your data deduplication ratio might be around 6:1. However, if you kept your backups up to 180 days, your data deduplication ratio might increase to 10:1.
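The two simple ratios in these examples follow from dividing the total amount of data that was backed up by the amount of unique data that was stored. The following Python sketch shows that arithmetic. The function name is hypothetical, and the model covers only identical, incompressible backups. It does not attempt to reproduce the 6:1 and 10:1 estimates, which depend on change rates and retention.

```python
def dedup_ratio(backup_sizes, unique_sizes):
    """Ratio of data presented for backup to data that was stored.

    backup_sizes: size of each backup as seen by the backup application.
    unique_sizes: new (not previously stored) data in each backup.
    """
    return sum(backup_sizes) / sum(unique_sizes)


if __name__ == "__main__":
    # The same 1 TB of incompressible data is backed up 25 times: the first
    # copy is stored and the next 24 copies add nothing new.
    print(dedup_ratio([1.0] * 25, [1.0] + [0.0] * 24))

    # The same file is backed up in week one and again in week two.
    print(dedup_ratio([1.0, 1.0], [1.0, 0.0]))
```

The first call yields 25.0 (a 25:1 ratio) and the second call yields 2.0 (a 2:1 ratio).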
Different data deduplication products use different methods of breaking up the data into elements, but each product uses a technique to create a signature or identifier for each data element. The IBM TS7650G and TS7620 use the HyperFactor® Deduplication Engine to perform high-speed pattern matching.
1.5.3 HyperFactor
The cornerstone of ProtecTIER is IBM HyperFactor, the technology that deduplicates data inline as it is received from the backup application. The bandwidth-efficient replication, inline performance, and scalability of ProtecTIER stem directly from the technological breakthroughs that are inherent in HyperFactor. HyperFactor is based on a series of algorithms that identify and filter out the elements of a data stream that were previously stored by ProtecTIER.
IBM HyperFactor uses its algorithms to perform high-speed pattern matching that identifies and eliminates instances of duplicate data. This method reduces the total amount of stored data (and thus the need to buy more storage) by a ratio of up to 25:1, depending on the environment.
HyperFactor can reduce any duplicate data, regardless of its location or how recently it was stored. Unlike hash-based techniques, HyperFactor finds duplicate data without needing exact matches of chunks of data. When new data is received, it checks to see whether similar data already was stored. If so, only the difference between the new data and previously stored data must be retained. This technique of finding duplicate data performs well.
With this approach, HyperFactor can surpass the reduction ratios that are attainable by any other data reduction method.
1.5.4 ProtecTIER Interfaces
Apart from the VTL, the ProtecTIER data deduplication solution can be ordered in two other interface styles:
The OpenStorage API
The FSI, with CIFS and NFS support
Because the different interface methods cannot be intermixed, you must choose one interface or deploy multiple ProtecTIER models simultaneously.
OpenStorage API Interface
With the OpenStorage (OST) API, ProtecTIER can be integrated with Symantec NetBackup to provide backup-to-disk without having to emulate traditional tape libraries. By using a plug-in that is installed on an OST-enabled NetBackup media server, the ProtecTIER product can implement a communication protocol that supports data transfer and control between the backup server and the ProtecTIER server.
File System Interface
The File System Interface (FSI) is a feature that was implemented in ProtecTIER server version 3.2. The ProtecTIER FSI emulates Windows file system behavior and presents a virtualized hierarchy of file systems, directories, and files to Windows CIFS clients. Clients can perform all Windows file system operations on the emulated file system content. The ProtecTIER FSI interface is intended to be used for backup and restore of data sets by using a backup application. It is single-node only, and FSI/CIFS cannot be deployed on dual-node clusters.
 
Note: You can have CIFS and NFS connectivity on the same ProtecTIER FSI model.
1.5.5 ProtecTIER models
IBM currently offers two virtualization solutions for open systems: The TS7650G (3958) gateway and the IBM TS7620 Deduplication Appliance Express® (3959 SM2).
TS7650G ProtecTIER Deduplication Gateway
The TS7650G does not include a disk repository and consists of the following components:
3958 DD5
This is the latest model available from May 2012, which shipped with the ProtecTIER server software version 3.3 or higher. This server is based on the x7145 type. When it is ordered as the ProtecTIER TS7650G, the machine type and model are 3958 DD5.
System console
The system console is a TS3000 System Console (TSSC). This document uses the terms system console and TSSC interchangeably.
The TS7650G gateway solution also connects to the disk subsystem. The customer chooses the disk subsystem for use with the TS7650G. A list of compatible controllers is available at the IBM Tape Systems Resource Library at this website:
Verify compatibility in the TS7650G ISV and the interoperability matrix document on the website.
TS7620 ProtecTIER Deduplication Appliance Express
The TS7620 appliance is model 3959 SM2. The IBM ProtecTIER Entry Edition software is loaded on the IBM TS7620 ProtecTIER Deduplication Appliance to create a TS7620 ProtecTIER Deduplication Appliance solution. A separate order for ProtecTIER Entry Edition is required. Clients can choose between VTL configuration, OpenStorage (OST), or the FSI, which supports Common Internet File System (CIFS) as a backup target.
The TS7620 appliance uses the 3959 SM2 server. The 3959 SM2 is a two-rack-unit (2U) bundled appliance. It comes together with the disk storage and the ProtecTIER software in a 2U device. With the rail kit, it is 3U. The following levels of disk capacity can be ordered:
6 TB (base, software limited)
12 TB (base, software full capacity)
23 TB (base, software full capacity plus expansion 1)
35 TB (base, software full capacity plus expansion 1 and 2)
The TS7620 deduplication appliance has system console functions included, and therefore does not require a TSSC.
For more information about ProtecTIER, see Chapter 13, “IBM TS7600 ProtecTIER Systems” on page 381.
1.6 IBM Spectrum Scale
IBM Spectrum Scale™ is a proven, scalable, high-performance file management solution that is based on IBM General Parallel File System (GPFS™). IBM Spectrum Scale provides world-class storage management with extreme scalability, flash accelerated performance, and automatic policy-based storage tiering from flash to disk to tape. IBM Spectrum Scale can reduce storage costs by up to 90% while improving security and management efficiency in cloud, big data, and analytics environments.
First introduced in 1998, this very mature technology enables a maximum volume size of 8 YB, a maximum file size of 8 EB, and up to 18.4 quintillion (2 to the 64th power) files per file system. IBM Spectrum Scale provides simplified data management and integrated information lifecycle tools as software-defined storage for cloud, big data, and analytics. It introduces enhanced security, flash accelerated performance, and improved usability. Capacity quotas, access control lists (ACLs), and a powerful snapshot function are also implemented.
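The limits quoted above are powers of two, and a short calculation makes the scale concrete. The following Python sketch assumes binary prefixes (1 EB = 2^60 bytes and 1 YB = 2^80 bytes). That interpretation is an assumption for illustration and is not stated in this chapter.

```python
# Assumption: binary prefixes, so 1 EB = 2**60 bytes and 1 YB = 2**80 bytes.
EB = 2**60
YB = 2**80

max_files = 2**64            # maximum number of files per file system
max_file_size = 8 * EB       # maximum file size, in bytes
max_volume_size = 8 * YB     # maximum volume size, in bytes

# 2**64 written out in full, which is roughly 18.4 quintillion (18.4 x 10**18)
print(f"{max_files:,}")      # 18,446,744,073,709,551,616

# Under the binary-prefix assumption, both size limits are exact powers of two
print(max_file_size == 2**63)
print(max_volume_size == 2**83)
```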
Key Capabilities
IBM Spectrum Scale adds elasticity with the following capabilities:
Global namespace with high-performance access that scales from departmental to global
Automated tiering, data lifecycle management from flash (6x acceleration) to tape (10x savings)
Enterprise ready: data security (encryption), availability, reliability, large scale proven
Open: POSIX compliant, integrated with OpenStack components and Hadoop
Benefits
Improves performance by removing data-related bottlenecks
Automated tiering, data lifecycle management from flash (acceleration) to tape (savings)
Lowers cost by eliminating duplicate data
Enables sharing of data across multiple applications
Reduces cost per performance by placing data on most applicable storage (flash to tape)
IBM Spectrum Scale is part of the IBM market-leading software-defined storage family:
As a software-only solution: Runs on virtually any hardware platform and supports almost any block storage device. IBM Spectrum Scale runs on Linux (including Linux on z Systems), IBM AIX®, and Microsoft Windows-based systems.
As an integrated IBM Elastic Storage™ Server solution: A bundled hardware, software, and services offering that includes installation and ease of management with a graphical user interface. Elastic Storage Server (ESS) provides unsurpassed end-to-end data availability, reliability and integrity with unique technologies including IBM Spectrum Scale Redundant Array of Independent Disks (RAID).
As a cloud service: IBM Spectrum Scale delivered as a service brings high performance, scalable storage, and integrated data governance for managing large amounts of data and files in the IBM SoftLayer® cloud.
IBM Spectrum Scale features enhanced security with native encryption and secure erase. It can use server-side flash cache to increase I/O performance up to six times. IBM Spectrum Scale provides improved usability through data replication capabilities, data migration capabilities, Active File Management (AFM), File Placement Optimizer (FPO), and IBM Spectrum Scale Native RAID.
Figure 1-24 shows an example of the Spectrum Scale architecture.
Figure 1-24 Spectrum Scale Architecture
 
More information: http://www.ibm.com/systems/storage/spectrum/scale
1.6.1 IBM Spectrum Archive
IBM Spectrum Archive, a member of the IBM Spectrum Storage™ family, enables direct, intuitive, and graphical access to data stored in IBM tape drives and libraries by incorporating the Linear Tape File System (LTFS) format standard for reading, writing, and exchanging descriptive metadata on formatted tape cartridges. Spectrum Archive eliminates the need for additional tape management and software to access data. Spectrum Archive offers three software solutions for managing your digital files with the LTFS format: Single Drive Edition (SDE), Library Edition (LE), and Enterprise Edition (EE). With Spectrum Archive Enterprise Edition and Spectrum Scale, tape can now add savings as a low-cost storage tier.
Network-attached unstructured data storage with native tape support that uses the Linear Tape File System delivers the best mix of performance and lowest cost storage.
Key Capabilities
Spectrum Archive options can support small, medium, and enterprise businesses with:
Seamless virtualization of storage tiers
Policy-based placement of data
Single universal namespace for all file data
Security and protection of assets
Open, non-proprietary, cross platform interchange
Integrated functionality with IBM Spectrum Scale
Benefits
IBM Spectrum Archive enables direct, intuitive, and graphical access to data stored in IBM tape drives and libraries by incorporating the Linear Tape File System (LTFS) format standard for reading, writing, and exchanging descriptive metadata on formatted tape cartridges. Spectrum Archive eliminates the need for additional tape management and software to access data.
Spectrum Archive takes advantage of the low cost of tape storage while making it as easy to use as drag and drop. Here are several of the Spectrum Archive benefits:
Access and manage all data in stand-alone tape environments as simply as though it were on disk
Enable easy-as-disk access to single or multiple cartridges in a tape library
Improve efficiency and reduce costs for long-term, tiered storage
Optimize data placement for cost and performance
Enable data file sharing without proprietary software
Scalable, low cost
IBM Linear Tape File System
LTFS is the first file system that works with LTO generation 5 and 6 tape technology (or IBM TS1150 and TS1140 tape drives) to set a new standard for ease of use and portability for open systems tape storage. With this application, accessing data that is stored on an IBM tape cartridge is as easy and intuitive as using a USB flash drive. Tapes are self-describing, and you can quickly recall any file from a tape without having to read the whole tape from beginning to end.
Furthermore, any LTFS-capable system can read a tape that is created by any other LTFS-capable system (regardless of the operating system platform). Any LTFS-capable system can identify and retrieve the files that are stored on it. LTFS-capable systems have the following characteristics:
Files and directories are displayed to you as a directory tree listing.
More intuitive searches of cartridge and library content are now possible due to the addition of file tagging.
Files can be moved to and from LTFS tape by using the familiar drag-and-drop metaphor common to many operating systems.
Many applications that were written to use files on disk can now use files on tape without any modification.
All standard File Open, Write, Read, Append, Delete, and Close functions are supported.
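Because an LTFS volume is presented as an ordinary directory tree, a program can work with it through the standard file calls listed above. The following Python sketch illustrates that idea under stated assumptions: the function name `exercise_file_ops` and the file name `example.txt` are hypothetical, and a temporary directory stands in for a real LTFS mount point, so no tape hardware or LTFS software is involved.

```python
import os
import tempfile


def exercise_file_ops(mount_point: str) -> str:
    """Write, append, read, and delete one file under mount_point.

    Only standard file calls are used, which is the point: nothing here
    is specific to tape.
    """
    path = os.path.join(mount_point, "example.txt")  # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:     # Open, Write, Close
        f.write("first line\n")

    with open(path, "a", encoding="utf-8") as f:     # Append
        f.write("second line\n")

    with open(path, "r", encoding="utf-8") as f:     # Read
        content = f.read()

    os.remove(path)                                  # Delete
    return content


if __name__ == "__main__":
    # Assumption: a temporary directory stands in for an LTFS mount point.
    with tempfile.TemporaryDirectory() as stand_in:
        print(exercise_file_ops(stand_in))
```

On a real tape, the same calls would be subject to tape characteristics such as mount and positioning time, which this stand-in does not model.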
IBM Spectrum Archive editions
As shown in Figure 1-25, Spectrum Archive is available in different editions that support small, medium, and enterprise businesses.
Figure 1-25 Spectrum Archive Single Drive Edition, Library Edition, and Enterprise Edition implementations
IBM Spectrum Archive Enterprise Edition
IBM Spectrum Archive Enterprise Edition (EE) gives organizations an easy way to use cost-effective IBM tape drives and libraries within a tiered storage infrastructure. By using tape libraries instead of disks for Tier 2 and Tier 3 data storage (data that is stored for long-term retention), organizations can improve efficiency and reduce costs. In addition, Spectrum Archive EE seamlessly integrates with the scalability, manageability, and performance of IBM Spectrum Scale, an IBM enterprise file management platform that enables organizations to move beyond simply adding storage to optimizing data management.
Here are some of the Spectrum Archive Enterprise Edition highlights:
Simplify tape storage with LTFS format, combined with the scalability, manageability, and performance of IBM Spectrum Scale
Help reduce IT expenses by replacing tiered disk storage (Tier 2 and Tier 3) with IBM tape libraries
Expand archive capacity simply by adding and provisioning media, without impacting the availability of data already in the pool
Add extensive capacity to IBM Spectrum Scale installations with lower media, floor space, and power costs
Spectrum Archive EE for the IBM TS4500, IBM TS3500, and IBM TS3310 tape libraries provides seamless integration of Spectrum Archive with Spectrum Scale by creating an LTFS tape tier. You can run any application that is designed for disk files on tape by using Spectrum Archive EE. Spectrum Archive EE can play a major role in reducing the cost of storage for data that does not need the access performance of primary disk, and it can improve efficiency and reduce costs for long-term, tiered storage.
With Spectrum Archive EE, you can enable the use of LTFS for the policy management of tape as a storage tier in a Spectrum Scale environment and use tape as a critical tier in the storage environment. Spectrum Archive EE supports IBM Linear Tape-Open (LTO) Ultrium 7, 6, and 5 tape drives, and IBM TS1150 and TS1140 tape drives that are installed in TS4500 and TS3500 tape libraries or LTO Ultrium 7, 6, and 5 tape drives that are installed in the TS3310 tape libraries.
The use of Spectrum Archive EE to replace disks with tape in Tier 2 and Tier 3 storage can improve data access over other storage solutions because it improves efficiency and streamlines management for files on tape. Spectrum Archive EE simplifies the use of tape by making it transparent to the user and manageable by the administrator under a single infrastructure. Figure 1-26 shows the integration of the Spectrum Archive EE archive solution with Spectrum Scale.
Figure 1-26 Integration of Spectrum Scale and Spectrum Archive Enterprise Edition
The seamless integration offers transparent file access in a continuous namespace. It provides file-level write and read caching with a disk staging area, policy-based movement from disk to tape, creation of multiple data copies on different tapes, load balancing and high availability in multi-node clusters, data exchange on LTFS tape by using the import and export functions, fast import of the file namespace from LTFS tapes without reading data, built-in tape reclamation and reconciliation, and simple administration and management.
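The idea behind policy-based movement from disk to tape can be illustrated with a simple age rule: files that have not been modified for a given number of days become candidates for the tape tier. The following Python sketch is only an illustration of that idea. It is not the Spectrum Scale policy engine or its rule syntax, and the function name `select_migration_candidates` and the age threshold are hypothetical.

```python
import os
import time
from typing import List, Optional


def select_migration_candidates(
    root: str, min_age_days: float, now: Optional[float] = None
) -> List[str]:
    """Return the files under root last modified more than min_age_days ago.

    Hypothetical illustration of an age-based tiering rule. It only
    selects candidates and does not move any data.
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # 86400 seconds per day

    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime < cutoff:
                candidates.append(path)
    return sorted(candidates)
```

For example, `select_migration_candidates("/data/projects", 90)` would list files under a hypothetical `/data/projects` directory that are older than 90 days. In the product, rules of this kind are evaluated by Spectrum Scale, and the data movement to tape is handled by Spectrum Archive EE.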
 
More information: http://www.ibm.com/systems/storage/tape/ltfs
IBM Spectrum Archive Library Edition
IBM Spectrum Archive Library Edition extends the file manager capability of the IBM Spectrum Archive SDE. Spectrum Archive LE was introduced with Version 2.0 of LTFS. It enables easy-as-disk access to single or multiple cartridges in a tape library.
LTFS is the first file system that works with IBM tape technology to optimize ease of use and portability for open-systems tape storage. Spectrum Archive LE manages the library automation and provides operating system-level access to the contents of the library. Spectrum Archive LE is based on the LTFS format specification, enabling tape library cartridges to be interchangeable with cartridges that are written with the open source SDE version of Spectrum Archive. IBM Spectrum Archive LE supports most IBM tape libraries:
TS2900 tape autoloader
TS3100 tape library
TS3200 tape library
TS3310 tape library
TS3500 tape library
TS4500 tape library
IBM TS1150 and IBM TS1140 tape drives are supported on IBM TS4500 and IBM TS3500 tape libraries only.
Spectrum Archive LE enables the reading, writing, searching, and indexing of user data on tape and access to user metadata. Metadata is the descriptive information about user data that is stored on a cartridge. Metadata enables searching and accessing of files through the graphical user interface (GUI) of the operating system. Spectrum Archive LE supports both Linux and Microsoft Windows.
Spectrum Archive LE provides the following product features:
Direct access and management of data on tape libraries with LTO Ultrium 7 (LTO-7), LTO Ultrium 6 (LTO-6), and LTO Ultrium 5 (LTO-5) tape drives, and the TS1150 and TS1140 tape drives
Tagging of files with any text, allowing more intuitive searches of cartridge and library content
Exploitation of the partitioning of the media in the LTO tape format standard
One-to-one mapping of tape cartridges in tape libraries to file folders
Capability to create a single file system mount point for a logical library that is managed by a single instance of LTFS and runs on a single computer system
Capability to cache tape indexes and to search, query, and display tape content within an IBM tape library without having to mount tape cartridges
The IBM Spectrum Archive LE offers the same basic capabilities as the SDE with additional support of tape libraries. Each LTFS tape cartridge in the library appears as an individual folder within the file space. The user or application can navigate into each of these folders to access the files stored on each tape. The Spectrum Archive LE software automatically controls the tape library robotics to load and unload the necessary LTFS Volumes to provide access to the stored files.
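The folder-per-cartridge layout described above means that a library can be inventoried with ordinary directory calls. The following Python sketch assumes that the logical library is mounted at a single directory and that each top-level folder corresponds to one cartridge. The function name `list_cartridge_contents` is hypothetical, and the sketch works on any directory tree with that shape.

```python
import os
from typing import Dict, List


def list_cartridge_contents(library_mount: str) -> Dict[str, List[str]]:
    """Map each top-level folder (one per cartridge) to its files.

    File paths are returned relative to the cartridge folder and sorted.
    Entries at the top level that are not folders are ignored.
    """
    with os.scandir(library_mount) as it:
        entries = sorted(it, key=lambda e: e.name)

    contents: Dict[str, List[str]] = {}
    for entry in entries:
        if not entry.is_dir():
            continue
        files = []
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                full = os.path.join(dirpath, name)
                files.append(os.path.relpath(full, entry.path))
        contents[entry.name] = sorted(files)
    return contents
```

Whether such a listing can be served without mounting cartridges depends on the index caching that the product provides. The sketch itself makes no assumption about that behavior.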
IBM Spectrum Archive Single Drive Edition
The IBM Spectrum Archive Single Drive Edition implements the LTFS Format and allows tapes to be formatted as an LTFS Volume. These LTFS Volumes can then be mounted by using LTFS to allow users and applications direct access to files and directories stored on the tape. No integration with tape libraries exists in this edition. All data can be accessed and managed in stand-alone tape environments as simply as though it were on disk.
1.7 IBM Tape System Library Manager
IBM Tape System Library Manager Version 1.1 (TSLM) expands and simplifies the use of IBM TS3500 tape libraries by providing a consolidated view of multiple libraries that are capable of storing up to 2.7 Exabytes. The tape pathing maintenance and definitions can be reduced by up to 75%.
TSLM provides a resource management layer between applications, such as IBM Spectrum Protect™, and the tape library hardware. Essentially, TSLM decouples tape resources from applications, which simplifies the aggregation and sharing of tape resources.
TSLM provides the following benefits:
Consolidated, mainframe-class media management services
Centralized repository, access control, and administration
Management beyond physical library boundaries:
 – Access multiple TS3500 tape libraries as a single library image
 – TS3500 tape libraries can be separate (at SAN distances) or connected in a shuttle complex
Dynamic sharing of resources across heterogeneous application boundaries
Security features to permit or prevent application access to tapes:
 – Helps to enable common scratch pool and private pools for every application
 – Ensures secure use and visibility
Policy-based drive and cartridge allocation
Policy-based media-lifecycle management
IBM Spectrum Protect:
 – Simplified path management
 – Simplified device sharing
3494 emulation: Emulation of an IBM 3494 library on top of an attached TS3500 library
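The common scratch pool and private pools in the list above can be pictured with a small model: cartridges start in a shared scratch pool, an application that allocates one moves it into its own private pool, and other applications cannot see it. The following Python sketch is a toy model under those assumptions. The class `ToyMediaPools`, its methods, and the cartridge labels are hypothetical and do not represent the TSLM interfaces or its allocation policies.

```python
from typing import Dict, Optional, Set


class ToyMediaPools:
    """Toy model: one common scratch pool plus one private pool per application."""

    def __init__(self, scratch: Set[str]) -> None:
        self.scratch: Set[str] = set(scratch)
        self.private: Dict[str, Set[str]] = {}

    def allocate(self, app: str) -> Optional[str]:
        """Move one scratch cartridge into the private pool of app.

        Returns the cartridge label, or None if the scratch pool is empty.
        """
        if not self.scratch:
            return None
        label = min(self.scratch)  # deterministic choice for the example
        self.scratch.remove(label)
        self.private.setdefault(app, set()).add(label)
        return label

    def can_access(self, app: str, label: str) -> bool:
        """An application sees only cartridges in its own private pool."""
        return label in self.private.get(app, set())

    def release(self, app: str, label: str) -> None:
        """Return a cartridge from the private pool of app to the scratch pool."""
        if not self.can_access(app, label):
            raise PermissionError(f"{app} does not own {label}")
        self.private[app].remove(label)
        self.scratch.add(label)


if __name__ == "__main__":
    pools = ToyMediaPools({"A00001", "A00002"})
    tape = pools.allocate("backup")
    print(tape, pools.can_access("backup", tape), pools.can_access("archive", tape))
```

The model keeps allocation and access checks in one place, which mirrors the idea of a central component that arbitrates between applications instead of each application owning its own library.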
Figure 1-27 shows the architecture.
Figure 1-27 IBM Tape System Library Manager architecture
TSLM is composed of the following modules with specific functions:
Media Manager
The Media Manager (MM) is the central server component which, among other tasks, coordinates access to drives and cartridges, handles volume allocation and deallocation requests, and stores a log of all activities. The MM uses a TSLM bundled (and constrained) version of IBM DB2® for persistent storage.
Library Manager
The Library Manager (LM) provides the MM access to library media changers. The LM reports all slots, tape drives, and cartridges to the MM, controls libraries on behalf of the MM, and encapsulates (that is, virtualizes) the library hardware. Because of this virtualization, new library hardware can be integrated into TSLM without any changes to an installed MM.
Library Manager for CMC
The Library Manager for Connected Media Changer (CMC LM) provides the MM with control of the TS3500 tape library shuttle complex. The CMC LM discovers shuttle connections and controls movement of cartridges from one library to another by using shuttle connections.
Host Drive Manager
The Host Drive Manager (HDM) reports all local device handles to MM, runs mount and unmount commands, checks the path before a cartridge is loaded, and reports statistical data to MM when a cartridge is unloaded.
External Library Manager
The External Library Manager (ELM) serves as a management layer between IBM Spectrum Protect and the TSLM Media Management software. It translates the Spectrum Protect External Media Management Interface (EMMI) API into commands of the IEEE 1244 Media Management Protocol that is understood by TSLM. The ELM executable file must be on the same server that runs the Spectrum Protect server for TSLM because the Spectrum Protect server directly runs the ELM executable file. The ELM executable file communicates with TSLM over a TCP/IP connection.
For more information about this product, see Chapter 14, “Library management” on page 387 and IBM Tape System Library Manager Version 1 Release 1 User’s Guide, GA32-2208.