13.1. RAPIDIO IN STORAGE SYSTEMS

The challenge of successfully managing more data more easily is central to the professional existence of today's IT manager. A quick Google search of 'data storage management' yields literally thousands of references to storage hardware and software products, consulting services, books, and courses, all offering solutions to help meet this challenge.

Both data storage capacity and the rate of data creation are growing at tremendous rates. According to the Berkeley Information Study 2003 [1], 'Print, film, magnetic, and optical storage media produced about 5 exabytes (1 exabyte = 2^60 bytes) of new information in 2002. Ninety-two percent of this new information was stored on magnetic media, mostly in hard disks.' Another example of the growth of data comes in the form of a quote from Dwight Beeson, a member of the Library of Congress's Systems Engineering Group [2]: 'Since 1993, we have had a compound data storage growth rate of about 120 percent per year.'

13.1.1. Features of Storage

A good example of a typical storage system architecture is the EMC MOSAIC:2000 architecture [3]. This architecture comprises a group of front-end processors, referred to as channel directors, a large data cache, and a set of back-end processors, referred to as disk directors. The front-end processors manage the communications between the host computers and the storage system. The back-end processors manage the communications between the storage system and the physical disk drives. The cache memory is used to retain recently accessed data and to provide a staging area for disassembling data to be sent to multiple physical drives or reassembling data coming from multiple drives. A system such as this can have dozens of processors, all cooperating in the task of moving data between the hosts and the physical drives. In addition to the transfer of the data, there is also a translation of protocols. The host connections will often be accomplished by a networking interface such as IP operating over Ethernet or Fibre Channel. The interface to the physical drives might be Serial ATA or SCSI. Each interface has its own packet or frame size and formatting requirements.
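The staging role of the cache described above can be pictured with a minimal sketch. It assumes a simple round-robin layout with a fixed chunk size; the function names `stripe` and `reassemble` are hypothetical and are not taken from any EMC product, whose actual data layout is not described here.

```python
from typing import List


def stripe(data: bytes, num_drives: int, chunk_size: int) -> List[List[bytes]]:
    """Disassemble a buffer into fixed-size chunks dealt round-robin to drives."""
    if num_drives < 1 or chunk_size < 1:
        raise ValueError("num_drives and chunk_size must be positive")
    drives: List[List[bytes]] = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        drives[chunk_index % num_drives].append(data[offset:offset + chunk_size])
    return drives


def reassemble(drives: List[List[bytes]]) -> bytes:
    """Rebuild the original buffer by reading chunks back in round-robin order."""
    out = bytearray()
    depth = max((len(chunks) for chunks in drives), default=0)
    for row in range(depth):
        for chunks in drives:
            if row < len(chunks):   # trailing drives may hold one chunk fewer
                out += chunks[row]
    return bytes(out)
```

For example, `reassemble(stripe(block, 4, 512))` should return `block` unchanged for any byte string, which is the property the staging area has to preserve.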

Figure 13.1. The storage matrix

One can see that the 'cloud' shown in Figure 13.1 as the 'storage matrix' is actually a collection of channel directors, cache memory and disk directors. Within the 'storage matrix' is a 'physical matrix' that is the actual collection of physical disks and connectivity in the system (Figure 13.2). However, the 'physical matrix' will not typically be an optimal representation of storage to the host computers. For this reason the storage system will always provide a 'functionality matrix' view to the host computers. This functional view can be quite different in character from the actual physical characteristics of the storage.
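One way to picture the gap between the functional view and the physical matrix is an address translation table. The sketch below assumes the simplest possible mapping, in which several physical disks are concatenated into one virtual disk; the class name `VirtualDisk` and its methods are hypothetical, and real storage systems use far richer mappings.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Tuple


class VirtualDisk:
    """Presents several physical disks as one contiguous range of blocks."""

    def __init__(self, disk_blocks: List[int]) -> None:
        if not disk_blocks or any(n <= 0 for n in disk_blocks):
            raise ValueError("each physical disk needs a positive block count")
        # _ends[i] is the first logical block that lies beyond physical disk i.
        self._ends = list(accumulate(disk_blocks))

    @property
    def total_blocks(self) -> int:
        return self._ends[-1]

    def translate(self, logical_block: int) -> Tuple[int, int]:
        """Map a host-visible block number to (disk index, block on that disk)."""
        if not 0 <= logical_block < self.total_blocks:
            raise IndexError("logical block out of range")
        disk = bisect_right(self._ends, logical_block)
        start = self._ends[disk - 1] if disk else 0
        return disk, logical_block - start
```

With five disks of 100 blocks each, the host sees a single 500-block device, and `VirtualDisk([100] * 5).translate(250)` resolves to block 50 on the third disk (index 2).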

13.1.2. Capacity

A data storage system has various important features. The first and most obvious is sheer capacity (Table 13.1). With hard disk drive capacities doubling every year, and data storage requirements growing at similar rates, the offered capacity of a storage system is typically one of the main selling points. Systems offered in 2004 can provide total storage measured in petabytes (2^50 bytes) with working cache memories reaching the terabyte (2^40 bytes) range. These are very large systems. Management of systems this large requires many dozens of processors and significant high-speed interconnect technology.

A good analogy for this scale factor is the difference in scale between a 2 × 2 doghouse, a new house built today, the Pentagon, with office space of over 3.7 million square feet, and the island of Martha's Vineyard, which comprises approximately 100 square miles, or almost 3 billion square feet (Figure 13.3).
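The capacity units used in this section can be checked with a few lines of arithmetic. The sketch below assumes binary units (powers of two) for system capacity and a decimal 146 GB drive, the drive size listed for the newest system in Table 13.1; the helper name `drives_needed` is hypothetical.

```python
TERABYTE = 2 ** 40
PETABYTE = 2 ** 50
EXABYTE = 2 ** 60

# Assumption: drive vendors quote decimal gigabytes, so 146 GB = 146 * 10^9 bytes.
DRIVE_146GB = 146 * 10 ** 9


def drives_needed(capacity_bytes: int, drive_bytes: int) -> int:
    """Smallest number of drives whose raw capacity covers capacity_bytes."""
    return -(-capacity_bytes // drive_bytes)   # integer ceiling division
```

Under these assumptions a petabyte is 1024 terabytes, and `drives_needed(PETABYTE, DRIVE_146GB)` comes to 7712 drives before any allowance for redundancy, which gives a sense of why component count matters so much at this scale.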

13.1.3. Availability

Another very important aspect of data storage is data integrity. This translates into at least two very important criteria: availability and reliability. Availability is the real-time accessibility or usefulness of data. With the advent of internet-based commerce, and online retailers taking orders at any time of the day or night from anywhere in the world, the unavailability of data translates into lost business. Reliability is also critical. How does one ensure that the stored data is not only safe but also accurate? In other words, it is not enough to be able to retrieve the data that was stored; it must also be possible to verify that it is exactly the same data that was originally stored.

Figure 13.2. Inside a storage system[4]

Table 13.1. Total system resources for three generations of EMC Symmetrix
        Disks                         Cache memory   Bandwidth    CPU
Sym4    384, 18–36 GB                 4 GB           130 MB/s     > 2k MIPS
Sym5    384, 74 GB, 146 GB, 181 GB    32 GB          330 MB/s     > 8k MIPS
Sym6    576, 146 GB                   128 GB         2400 MB/s    32k MIPS

Figure 13.3. Analogies of storage growth

For Enterprise-class storage systems today, availability means 'highly available.' 'Highly available' means that, even in the event of a failure of a component, the service is uninterrupted. Any component failure and subsequent failover must be completely invisible to the normal user. This also means that during any type of normal service, code or system upgrade or reconfiguration, the service is completely maintained at the normal quality of service.

13.1.4. Density

Enterprise storage systems can contain hundreds of hard disk drives, terabytes of memory and dozens of processors. To achieve their availability targets, these systems are also often fully redundant. These high component counts make it very important that the technologies used in the system, including the interconnect technology, be as efficient and as low power as possible.

13.1.5. Manageability

The cost of managing storage often exceeds the cost of the storage itself. The sheer size of the data being stored is one contributing factor to this cost, but the cost of management is also affected by the growth rate of the use of storage, adding online access methods, performing time-consistent backup, as well as many other aspects of operation.

13.1.6. Storage System Applications of RapidIO

The RapidIO architecture was developed to address the need for high-performance interconnects in embedded systems. This section explores opportunities for the use of RapidIO in enterprise storage systems. Earlier in this chapter the 'physical matrix' of storage was presented. This 'physical matrix' is composed primarily of a high-density matrix of CPUs, cache memories, IO devices and the connectivity to tie all of these devices together.

13.1.6.1. Storage Requirements

In this section an attempt is made to correlate the major connectivity requirements of storage with the characteristics of RapidIO.

  • Connectivity/density: we painted a picture in the previous sections of this chapter of the accelerating growth of storage capacity and the requirements this brings for increased density of connectivity, power and performance. RapidIO provides arguably the lowest cost-per-port of any system interconnect available today. The RapidIO protocol was developed to be easily realizable in common process technologies. When compared with technologies such as Ethernet, it can significantly reduce component count and protocol processing overhead. Because RapidIO switching technology can be provided in a single-chip package, the required power-per-port can be quite low, based on currently available circuit fabrication technology.

  • Size/scalability: with the extremely dynamic requirements of storage capacity comes the demand for a very scalable system interconnect. Often scalability demands must be met by adding to existing storage systems in the field, often while the system is running. The RapidIO interconnect technology supports up to 64 000 end points in a single fabric, and supports memory addressing modes of up to 66 bits, sufficient to address exabyte-size memory spaces. To help with online scaling demands, RapidIO has built-in support for hot-swapping. The destinationID-based routing mechanism provides tremendous topological flexibility to system designers. Topologies can be utilized that would be difficult or impossible to achieve with other interconnect technologies such as PCI Express, Ethernet or HyperTransport. The addressing mechanisms provided by RapidIO can also help with virtualization, another important aspect of storage systems. Virtualization is a mechanism for providing alternate mappings for physical storage. For example, a set of five disks can appear as a single virtual disk drive to the host.

  • Time-to-market: with the storage needs of information technology continuing to grow so rapidly, it is obviously important for storage vendors to be able to release new products and updates to existing products to address new and growing requirements. Any technology that is being used as an integral piece of a storage product needs to have a very aggressive performance roadmap. RapidIO, being a standards-based technology, is available from multiple vendors. The fact that RapidIO is not only an industry standard but also an international standard [5] helps ensure its evolution as a ubiquitous technology. The RapidIO Trade Association has also aligned itself with various other high-performance backplane technology groups, such as the OIF CEI [6], in an effort to leverage best-available technologies and standards for its future roadmap. These efforts are expected to lead to even higher performance RapidIO technologies becoming available over the next several years.

  • Reliability: for enterprise storage, reliability is perhaps the most important feature that a system offers. Many techniques are used to reach the reliability targets. These techniques include storing multiple copies of the data on separate disk drives, using redundant array of independent disks (RAID) techniques to spread data across multiple disks, protecting communication paths and memories with parity and error-correcting codes (ECC), and offering redundant processing subsystems and power supplies.
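The figures and the routing mechanism named in the size/scalability item can be made concrete with a short sketch. It assumes a 16-bit device ID, which gives a count in line with the roughly 64 000 end points quoted, and the 66-bit addressing mode. The `Switch` class and `forward` function are hypothetical models of destinationID-based routing, in which each switch only looks up the destination ID in its own table; they are not a real RapidIO software interface.

```python
from typing import Dict, List, Union

# Assumed field widths: a 16-bit device ID and the 66-bit addressing mode.
DEVICE_ID_BITS = 16
ADDRESS_BITS = 66

MAX_ENDPOINTS = 2 ** DEVICE_ID_BITS                  # distinct device IDs
ADDRESSABLE_EXABYTES = 2 ** ADDRESS_BITS // 2 ** 60  # address space in exabytes


class Switch:
    """Hypothetical switch model: one routing table, one neighbour per port."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.routes: Dict[int, int] = {}                  # destination ID -> output port
        self.ports: Dict[int, Union["Switch", int]] = {}  # port -> switch or endpoint ID


def forward(first: Switch, dest_id: int, max_hops: int = 16) -> List[str]:
    """Return the names of the switches a packet crosses to reach dest_id."""
    path: List[str] = []
    node = first
    for _ in range(max_hops):
        path.append(node.name)
        neighbour = node.ports[node.routes[dest_id]]
        if isinstance(neighbour, int):      # an endpoint hangs off this port
            if neighbour != dest_id:
                raise LookupError("routing tables delivered the packet to the wrong endpoint")
            return path
        node = neighbour                    # otherwise hand over to the next switch
    raise LookupError("hop limit exceeded; routing tables may contain a loop")


# Two switches joined on port 0, each with one endpoint attached.
a, b = Switch("A"), Switch("B")
a.ports = {0: b, 1: 0x0005}
b.ports = {0: a, 2: 0x0102}
a.routes = {0x0005: 1, 0x0102: 0}
b.routes = {0x0005: 0, 0x0102: 2}
```

Here `forward(a, 0x0102)` crosses switch A and then switch B. Because the packet carries only a destination ID and every hop is a table lookup, changing the topology is a matter of rewriting tables, which is one way to read the flexibility claimed for this mechanism.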

In the design of enterprise storage systems, every component and communication channel must have a backup that can take over in case of component failure. In an environment such as this, the reliability characteristics of RapidIO are very attractive. All RapidIO packets have several inherent protection mechanisms. The 8B/10B coding scheme used for the Serial RapidIO physical layer has an inherent ability to detect channel errors. All RapidIO data transfers are fully CRC-protected. Long packets contain two CRCs. There is a CRC appended to the packet after 80 bytes and another at the end of the packet, which is a continuation of the first. The acknowledge ID fields are used to ensure that all packets are received in the correct sequence. Packets are not acknowledged until they have been positively confirmed to have been received correctly. Transmission buffers are not released until a positive acknowledgement has been received from the receiver.
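The two-CRC scheme for long packets can be illustrated with a simplified sketch. It assumes the common CRC-16 CCITT polynomial (0x1021) with an initial value of 0xFFFF, and it ignores details of the real packet format such as header bits excluded from coverage and padding. The function names `crc16`, `protect` and `is_intact` are hypothetical.

```python
CRC_POLY = 0x1021        # assumed generator: x^16 + x^12 + x^5 + 1
CRC_INIT = 0xFFFF        # assumed initial register value
EARLY_CRC_OFFSET = 80    # the first CRC is inserted after this many bytes


def crc16(data: bytes, crc: int = CRC_INIT) -> int:
    """Bitwise CRC-16; pass a previous result as crc to continue a calculation."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ CRC_POLY) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def protect(packet: bytes) -> bytes:
    """Append one CRC to a short packet, or embed an early CRC in a long one."""
    if len(packet) <= EARLY_CRC_OFFSET:
        return packet + crc16(packet).to_bytes(2, "big")
    head, tail = packet[:EARLY_CRC_OFFSET], packet[EARLY_CRC_OFFSET:]
    early = crc16(head).to_bytes(2, "big")
    # The final CRC continues the running calculation across the early CRC.
    final = crc16(early + tail, crc16(head))
    return head + early + tail + final.to_bytes(2, "big")


def is_intact(framed: bytes) -> bool:
    """A frame that ends in its own CRC leaves a zero remainder."""
    return crc16(framed) == 0
```

Because the second CRC continues the first, a receiver in this sketch can run one calculation over the whole frame and simply check for a zero remainder at the end, while still being able to check the first 80 bytes early.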

Another aspect of RapidIO that aids reliability is the error management extensions specification. The error management extensions provide the ability for switches and other RapidIO components to proactively provide notification of failure and maintain information associated with the failure event. This is a tremendous aid to the task of architecting a fault-tolerant and high-availability system. The error management extensions also provide high-level support for live insertion. This functionality is required for creating system architectures that allow field replacement of failed components in running systems without any other system disruption.

13.1.7. Conclusion

RapidIO is a well-developed system interconnect and fabric standard. It has been designed from the ground up with cost, performance and reliability in mind. This section has shown a strong correlation between the technology needs of enterprise storage systems and the characteristics of the RapidIO interconnect technology. RapidIO is an obvious candidate technology for providing the system connectivity for tomorrow's distributed, high-availability, fault-tolerant storage systems.
