Storage Systems: Organization, Performance, Coding, Reliability and Their Data Processing was motivated by the 1988 Redundant Array of Inexpensive/Independent Disks (RAID) proposal to replace large form factor mainframe disks with an array of commodity disks. Disk loads are balanced by striping data into strips, with one strip per disk, and storage reliability is enhanced via replication or erasure coding, which at best dedicates k strips per stripe to tolerate k disk failures. Flash memories have resulted in a paradigm shift, with Solid State Drives (SSDs) replacing Hard Disk Drives (HDDs) for high performance applications. RAID and Flash have resulted in the emergence of new storage companies, namely EMC, NetApp, SanDisk, and Pure Storage, and a multibillion-dollar storage market. Key new conferences and publications are reviewed in this book.

The goal of the book is to expose students, researchers, and IT professionals to the more important developments in storage systems, while covering the evolution of storage technologies, traditional and novel databases, and novel sources of data. We describe several prototypes: FAWN at CMU, RAMCloud at Stanford, and LightStore at MIT; Oracle's Exadata, AWS' Aurora, Alibaba's PolarDB, and Fungible Data Center; and the author's paper designs for cloud storage, namely heterogeneous disk arrays and hierarchical RAID.

• Surveys storage technologies and lists sources of data: measurements, text, audio, images, and video

• Familiarizes readers with paradigms to improve performance: caching, prefetching, log-structured file systems, and log-structured merge-trees (LSMs)

• Describes RAID organizations and analyzes their performance and reliability

• Conserves storage via data compression, deduplication, and compaction, and secures data via encryption

• Specifies implications of storage technologies on performance and power consumption

• Exemplifies database parallelism for big data, analytics, and deep learning via multicore CPUs, GPUs, FPGAs, and ASICs, e.g., Google's Tensor Processing Units

Table of Contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Copyright
  5. Dedication
  6. About the author
  7. Preface
    1. Why this book?
    2. Text overview
    3. Intended audience and required background to read the book
    4. Overview of book chapters
    5. Miscellaneous
    6. Bibliography
  8. Acknowledgments
    1. Bibliography
  9. Abbreviations and acronyms
  10. Chapter 1: Introduction
    1. Abstract
    2. 1.1. Computer systems after WW II
    3. 1.2. High level programming languages - Fortran
    4. 1.3. Effect of data representation on storage space requirements
    5. 1.4. Basic computer arithmetic
    6. 1.5. Author's experience with IBM computers in 1970s
    7. 1.6. IBM's System 360 and its successors
    8. 1.7. The IBM S/360 computer family
    9. 1.8. Operating systems associated with IBM mainframes
    10. 1.9. Early computer companies possibly competing with IBM
    11. 1.10. My experience at Burroughs Corp.
    12. 1.11. Computer company revenue rankings
    13. 1.12. Computer structures book
    14. 1.13. Computer family architectures - CFA
    15. 1.14. Virtual memory and page replacement algorithms
    16. 1.15. Memory space fragmentation and dynamic storage allocation
    17. 1.16. Analysis of thrashing in 2-phase locking - 2PL systems
    18. 1.17. CPU caches
    19. 1.18. Multiprogrammed computer systems
    20. 1.19. Timesharing systems
    21. 1.20. Mean response with FCFS and processor-sharing scheduling
    22. 1.21. Analysis of open and closed queueing network models
    23. 1.22. Bottleneck analysis and balanced job bounds
    24. 1.23. Performance analyses of I/O subsystems
    25. 1.24. Vector supercomputers
    26. 1.25. Parallel computers
    27. 1.26. The future of supercomputing
    28. 1.27. Microprocessor CPUs, GPUs, FPGAs, and ASICs
    29. 1.28. RISC-V and other microprocessors
    30. 1.29. The IBM PC and its compatibles
    31. 1.30. Storage studies by Alan Jay Smith at Berkeley
    32. 1.31. Prefetching
    33. 1.32. Database buffers
    34. 1.33. Checkpointing in processing large jobs
    35. 1.34. Computer related rule of thumb
    36. 1.35. Conclusions and summary
    37. Bibliography
  11. Chapter 2: Storage technologies and their data
    1. Abstract
    2. 2.1. Evolution of recording material
    3. 2.2. Advertising and e-commerce
    4. 2.3. Computer storage technologies
    5. 2.4. Reliability studies of DRAM, HDDs, & flash SSDs
    6. 2.5. Storage Networking Industry Association - SNIA
    7. 2.6. Big data and its sources
    8. 2.7. Sources of storage content
    9. 2.8. Ranking and description of media companies
    10. The Cisco Cloud Services Stack - CCSS
    11. 2.9. Sources of news: newspapers, radio and TV stations
    12. 2.10. Text editing and formatting languages
    13. 2.11. Online books sources
    14. 2.12. Free book download web sites
    15. 2.13. Data, image, audio and video compression
    16. 2.14. Main memory data compression
    17. 2.15. Data deduplication in storage systems
    18. 2.16. Up and coming data deduplication companies
    19. 2.17. Storage research at IBM's Almaden Research Center in 1990s
    20. 2.18. Cleversafe and its information dispersal technology
    21. 2.19. Recent developments at IBM Research at ARC
    22. 2.20. Storage research at Hewlett-Packard - HP
    23. 2.21. Primary storage vendors and enterprise companies in 2020
    24. 2.22. All-flash upstart storage companies
    25. 2.23. Hyperconverged infrastructure for storage systems
    26. 2.24. Top enterprise storage backup players
    27. 2.25. Data storage companies: up and coming storage vendors
    28. 2.26. Parallel file systems
    29. 2.27. Cloud storage
    30. 2.28. Jai Menon's predictions on the future of clouds
    31. 2.29. Cloud storage companies
    32. 2.30. Distributed systems research related to clouds
    33. 2.31. Data encryption
    34. 2.32. Conclusions - predictions about storage systems
    35. Bibliography
  12. Chapter 3: Disk drive data placement and scheduling
    1. Abstract
    2. 3.1. The organization of Hard Disk Drives - HDDs
    3. 3.2. Internal organization of files in UNIX
    4. 3.3. Review of disk arm scheduling
    5. 3.4. Disk scheduling for mixed workloads
    6. 3.5. Real time disk scheduling for multimedia
    7. 3.6. Storage virtualization
    8. 3.7. File placement on disk
    9. 3.8. Disks with Shingled Magnetic Recording - SMR
    10. 3.9. Review of analyses of disk scheduling methods
    11. 3.10. Analytic studies of disk storage
    12. 3.11. Analysis of a zoned disk with the FCFS scheduling
    13. 3.12. Performance analysis of the SCAN policy
    14. 3.13. Analysis of the SATF policy
    15. 3.14. Conclusions
    16. Bibliography
  13. Chapter 4: Mirrored & hybrid arrays
    1. Abstract
    2. 4.1. Introduction to mirrored and hybrid disk arrays
    3. 4.2. Mirrored and hybrid disk array organizations
    4. 4.3. Routing read requests in mirrored disks
    5. 4.4. Shortening the tail for response times
    6. 4.5. Improving write performance in mirrored disks
    7. 4.6. Disks with multiple R/W heads on a single and multiple arms
    8. 4.7. Seek distances in single and mirrored disks
    9. 4.8. Mirrored disk performance in normal, degraded, rebuild modes
    10. 4.9. Protecting against rare event failures in archival systems
    11. 4.10. RAIDP: ReplicAtion with IntraDisk Parity for cost effective storage of warm data
    12. 4.11. Remote mirroring for disaster recovery
    13. 4.12. RAID reliability analysis
    14. 4.13. Storage reliability research at IBM's Zurich Research Lab
    15. 4.14. Conclusions
    16. Bibliography
  14. Chapter 5: Redundant Arrays of Independent Disks - RAID
    1. Abstract
    2. 5.1. Redundant Arrays of Inexpensive Disks
    3. 5.2. Early RAID products
    4. 5.3. RAID classification and motivation
    5. 5.4. RAID0 and striping
    6. 5.5. RAID2
    7. 5.6. RAID3
    8. 5.7. RAID4
    9. 5.8. RAID5
    10. 5.9. RAID5 performance analysis in normal mode
    11. 5.10. RAID(4+k) disk arrays in normal and degraded mode
    12. 5.11. Rebuild processing in disk arrays
    13. 5.12. Vacationing server model for rebuild processing
    14. 5.13. RAID5 sparing configurations for rebuild
    15. 5.14. IntraDisk Redundancy - IDR for higher reliability rebuild
    16. 5.15. Disk scrubbing for higher reliability rebuild processing
    17. 5.16. Predictive Failure Analysis - PFA
    18. 5.17. Undetected disk errors and Silent Data Corruption - SDC
    19. 5.18. Clustered RAID5 layouts
    20. 5.19. Clustered RAID designs by Walter Burkhard et al. at UCSD
    21. 5.20. Log-structured file systems and arrays
    22. 5.21. RAID6
    23. 5.22. Reed-Solomon coding for higher reliability
    24. 5.23. Parity based MDS codes
    25. 5.24. RDP arrays and their optimal recovery
    26. 5.25. EVENODD defined and efficient rebuild of a single disk
    27. 5.26. Blaum-Roth - BR code
    28. 5.27. X-code disk arrays and rebuild mode with one and two disk failures
    29. 5.28. The RM2 disk array
    30. 5.29. RAID7
    31. 5.30. Erasure coding for distributed storage
    32. 5.31. ReGenerating codes
    33. 5.32. Protection schemes for flash memories
    34. 5.33. Conclusions
    35. Bibliography
  15. Chapter 6: Coding for multiple disk failures
    1. Abstract
    2. 6.1. Introduction
    3. 6.2. 2-Dimensional string layouts
    4. 6.3. Simple data entanglement layouts with high reliability
    5. 6.4. Reed-Solomon codes
    6. 6.5. A family of MDS block array codes with two parities
    7. 6.6. Codes for correcting two erasures with independent parities
    8. 6.7. Row-Diagonal Parity - RDP codes
    9. 6.8. Short write operations
    10. 6.9. Additional reading
    11. Bibliography
  16. Chapter 7: Saving power in disks, flash memories, and servers
    1. Abstract
    2. 7.1. Introduction to power consumption in computer systems
    3. 7.2. Saving battery power in laptop computers
    4. 7.3. Varying spindown threshold based on user behavior
    5. 7.4. Exploiting idleness in storage systems
    6. 7.5. Making enterprise computers greener by protecting them better
    7. 7.6. Policy optimization for dynamic power management
    8. 7.7. Managing energy and server resources in hosting centers
    9. 7.8. Interplay of energy and performance for RAID running OLTP
    10. 7.9. Dynamic speed control for server disk power management
    11. 7.10. Approaches to conserve disk energy in network servers
    12. 7.11. Energy efficiency through burstiness
    13. 7.12. Dempsey: a tool for modeling hard disk power consumption
    14. 7.13. MAID - Massive Arrays of Idle Disks alternative to tape storage
    15. 7.14. Self-tuning power aware storage cache replacement algorithm
    16. 7.15. Popular Data Concentration - PDC
    17. 7.16. Disk layout optimization for reducing energy consumption
    18. 7.17. Managing server energy and operational costs in hosting centers
    19. 7.18. Performance directed energy management for main memory and disks
    20. 7.19. Exploiting redundancy to conserve energy in storage systems
    21. 7.20. Thermal disk drive design: challenges and possible solutions
    22. 7.21. PARAID: the gear-shifting Power-Aware RAID
    23. 7.22. DiskGroup: energy efficient disk layout for RAID1 systems
    24. 7.23. Pergamum: replacing tape with disk-based archival storage
    25. 7.24. Energy efficient RAID - ERAID
    26. 7.25. Power reduction via write-offloading
    27. 7.26. Redundant Arrays of Hybrid Disks - RAHD
    28. 7.27. Achieving power-efficient, erasure-coded storage
    29. 7.28. Effect of energy-saving schemes on disk reliability
    30. 7.29. Mathematical model of disk reliability versus load and temperature
    31. 7.30. Sample-Replicate-Consolidate mapping - SRCMap
    32. 7.31. Power Proportional Distributed File Systems - PPDFS
    33. 7.32. Dynamic locality improvement to increase effective storage performance
    34. 7.33. Disk data reorganization for reducing energy consumption
    35. 7.34. File assignment with minimal variance of service time
    36. 7.35. Striping-based Energy Aware - SEA placement
    37. 7.36. PEARL: Performance, Energy, and ReLiability balanced dynamic data distribution
    38. 7.37. Power proportionality for data center storage
    39. 7.38. Economic evaluation of energy saving with reliability constraint
    40. 7.39. Dynamic server provisioning for data center power management
    41. 7.40. Modeling the energy costs of I/O workloads
    42. 7.41. Energy proportionality is required in addition to energy efficiency
    43. 7.42. SSD design tradeoffs from energy perspective
    44. 7.43. Green AI
    45. 7.44. Conclusions
    46. Bibliography
  17. Chapter 8: Database parallelism, big data and analytics, deep learning
    1. Abstract
    2. 8.1. Stonebraker's classification of computer systems
    3. 8.2. Comparison of systems from the viewpoint of CPU performance
    4. 8.3. High performance network and channel-based interconnects for storage
    5. 8.4. Concurrency and coherency control in data sharing systems
    6. 8.5. Combined shared disk and nothing systems
    7. 8.6. Parallel systems at IBM Research
    8. 8.7. Interconnection networks in IBM's BlueGene/L
    9. 8.8. Data allocation and transaction routing in multicomputers
    10. 8.9. Data allocation with distributed relational databases
    11. 8.10. Review of multicomputer Data Base Machines - DBMs
    12. 8.11. Benchmarking in various forms
    13. 8.12. Data Base Machines - DBMs and backend processors
    14. 8.13. Head-per-track disks
    15. 8.14. Active disks projects
    16. 8.15. Multidimensional indices on disk, DRAM, and flash
    17. 8.16. Implementing indices in flash memories
    18. 8.17. Redesign of relational databases by Stonebraker et al.
    19. 8.18. Parallel Data Base Machines - DBMs
    20. 8.19. Google File System, Bigtable, and Spanner
    21. 8.20. Microsoft Azure
    22. 8.21. IBM and other cloud service providers
    23. 8.22. Distributed databases in cloud computing
    24. 8.23. SpringFS bridging agility and performance in elastic distributed storage
    25. 8.24. Snowflake cloud based data warehousing with SQL support
    26. 8.25. Review of peer-to-peer computing
    27. 8.26. Fast Array of Wimpy Nodes - FAWN
    28. 8.27. RAMCloud project at Stanford
    29. 8.28. How flash changes the design of database storage engines
    30. 8.29. Hybrid Transaction Analytic Processing - HTAP
    31. 8.30. Intelligent page store for concurrent txn and query processing
    32. 8.31. Oracle Exadata database machine
    33. 8.32. Oracle in memory option or Database in Main Memory - DBIM
    34. 8.33. MemSQL/SingleStore
    35. 8.34. Amazon Aurora
    36. 8.35. Transaction processing in the cloud
    37. 8.36. RAPID and Oracle AutoML: a fast and predictive AutoML pipeline
    38. 8.37. Benchmarking automatic ML frameworks
    39. 8.38. Alibaba's X-engine
    40. 8.39. RocksDB with ultrafast data access
    41. 8.40. LightStore project at MIT
    42. 8.41. PinK: high-speed in-storage key-value store with bounded tails
    43. 8.42. BlueDBM: an appliance for big data analytics
    44. 8.43. WiSer highly available HTAP DBMS for IoT applications
    45. 8.44. Raven RDBMS at Microsoft provides ML
    46. 8.45. Machine Learning data platform - MLdp
    47. 8.46. Databricks
    48. 8.47. Fungible - a new storage architecture for big data
    49. Ranking of networking companies
    50. 8.48. Network requirements for resource disaggregation
    51. 8.49. Deep learning and associated hardware
    52. 8.50. GPU accelerated database systems
    53. 8.51. Graphics Processing Unit - GPU solutions
    54. 8.52. Field Programmable Gate Array - FPGA solutions
    55. 8.53. Multichip modules
    56. 8.54. Unified solutions
    57. 8.55. Power consumption in FPGAs and ASICs
    58. 8.56. Hybrid approaches to acceleration
    59. 8.57. Application Specific Integrated Circuit - ASIC
    60. 8.58. TensorFlow and Tensor Processing Units - TPUs
    61. 8.59. Increasing computational challenges
    62. 8.60. Quantum Neural Nets - QNNs
    63. 8.61. Data acceleration examples
    64. 8.62. Cerebras wafer size chips vs GPUs
    65. 8.63. Conclusions
    66. Bibliography
  18. Chapter 9: Structured, unstructured, and diverse databases
    1. Abstract
    2. 9.1. Categories of file systems
    3. 9.2. Mainframe count-key-data disk organizations
    4. 9.3. Hierarchical and network Data Base Management Systems - DBMSs
    5. 9.4. Relational data model
    6. 9.5. Ranking methodology for database engines
    7. 9.6. Overall ranking of all database types
    8. 9.7. Relational database management systems
    9. 9.8. Object relational databases
    10. 9.9. Data mining
    11. 9.10. Data warehousing and OLAP
    12. 9.11. Distinct schools of thought in data warehouse design
    13. 9.12. Data lakes
    14. 9.13. Open source big data projects
    15. 9.14. Semi-structured data and its model
    16. 9.15. Big data technology and the five Vs
    17. 9.16. Hadoop technology ecosphere
    18. 9.17. Distributed batch vs inline processing
    19. 9.18. NoSQL/non-relational databases
    20. 9.19. Key-value stores
    21. 9.20. Document stores
    22. 9.21. Time-series databases
    23. 9.22. Kubernetes and other containers
    24. 9.23. Graph databases
    25. 9.24. Object-oriented databases
    26. 9.25. Search engines for text
    27. 9.26. Web search engines
    28. 9.27. Resource Description Framework - RDF
    29. 9.28. Wide column stores
    30. 9.29. Multivalue databases
    31. 9.30. Native XML databases
    32. 9.31. Realtime stream processing
    33. 9.32. Event stores
    34. 9.33. Streaming analytics
    35. 9.34. Trill: a high-performance incremental query processor for diverse analytics
    36. 9.35. Summary of Forrester Wave™ streaming analytics, Q3, 2019
    37. 9.36. Content stores
    38. 9.37. Multimodel databases
    39. 9.38. Main memory databases
    40. 9.39. Distributed file systems and object storage
    41. 9.40. Enterprise Backup and recovery software solutions
    42. 9.41. Analytics and Business Intelligence - ABI platforms
    43. 9.42. Blockchain, Bitcoin, Ethereum
    44. Bibliography
  19. Chapter 10: Heterogeneous Disk Arrays - HDAs
    1. Abstract
    2. 10.1. Introduction to RAID
    3. 10.2. Data allocation in a Heterogeneous Disk Array - HDA
    4. 10.3. Analytic justification for HDA
    5. 10.4. HDA data allocation experiment setup
    6. 10.5. Data allocation experiments
    7. 10.6. Rebuild processing in HDA
    8. 10.7. RAID+ data layout based on Latin squares
    9. 10.8. Related work
    10. 10.9. Using utility functions to provision storage systems
    11. 10.10. Conclusions
    12. Bibliography
  20. Chapter 11: Hierarchical RAID - HRAID
    1. Abstract
    2. 11.1. Introduction to HRAID
    3. 11.2. Intranode & internode coding in HRAID
    4. 11.3. Concurrency control in HRAID
    5. 11.4. RAID IOPS with no disk failures
    6. 11.5. RAID IOPS with disk failures
    7. 11.6. HRAID response times
    8. 11.7. HRAID2/2 performance
    9. 11.8. RAID and HRAID reliability
    10. 11.9. Shortcut reliability analysis of HRAID
    11. 11.10. Simulation to estimate the MTTDL
    12. 11.11. Multistep recovery in HRAID
    13. 11.12. Related work
    14. 11.13. Collective Intelligent Bricks - CIB or Icecube project at IBM
    15. 11.14. Conclusions
    16. Bibliography
  21. Chapter 12: Conclusions
    1. Abstract
    2. Bibliography
  22. Appendix
    1. A.1. Books on topics related to storage
    2. A.2. ACM, IEEE, USENIX, and their publications
    3. A.3. Journals, conferences, and workshops dealing with storage systems
    4. A.4. Web sites for trade publications
    5. A.5. Storage research in industry
    6. A.6. Storage research at universities
    7. A.7. Funding agencies, national labs, and research institutes
    8. Bibliography
  23. Bibliography
    1. Bibliography
  24. Index