
Book Description

Praise for Building Clustered Linux Systems

"The author does an outstanding job of presenting a very complicated subject. I very much commend this work. The author sets the pace and provides vital resources and tips along the way. He also has a very good sense of humor that is crafted in the text in such a way that makes the reading enjoyable just when the subject may demand a break. This book should be a requirement for those that are clustering or considering clustering and especially those considering investing a great deal of financial resource toward that goal."

–Joe Brazeal, Information Technician III, Southwest Power Pool

"This book is for Beginner and Intermediate level system administrators, engineers, and researchers, who want to learn how to build Linux clusters. The book covers everything very well."

–Ibrahim Haddad, Senior Researcher, Ericsson Corporate Unit of Research

"Nothing that I know of exists yet that covers this subject in as much depth and detail. The practical ‘hands-on’ approach of this book on how to build a Linux cluster makes this a very valuable reference for a very popular, highly demanded technology."

–George Vish, II, Linux Curriculum Program Manager and Senior Education Consultant, HP

"In my opinion there is a significant lack of literature on this subject. Most of the currently available books are either dated or do not address the complete picture of the range of decisions that must go into building a Linux cluster. I feel comfortable recommending this to anyone interested in building a Linux cluster to better understand both the technical aspects of building and designing a Linux cluster, but also the business aspects of the same."

–Randall Splinter Ph.D., Senior Solution Architect, HP

"The author has set a precedent in the cluster design and integration process that is lacking in the industry today."

–Stephen Gray, Senior Applications Engineer, Altair Engineering, Inc.

The Practical, Step-by-Step Guide to Building and Running Linux Clusters

Low-cost, high-performance Linux clusters are the best solution for an increasingly wide range of technical and business problems. Until now, however, building and managing Linux clusters has required more specialized knowledge than most IT organizations possess. This book dramatically lowers the learning curve, bringing together all the hands-on knowledge and step-by-step techniques you'll need to get the job done.

Using practical examples, Robert Lucke simplifies every facet of cluster design and integration: networking, hardware, architecture, operating environments, data sharing, applications, and more. Lucke, who helped prototype and implement one of the world's largest Linux clusters, systematically addresses the key issues you'll encounter and the key decisions you'll have to make. Coverage includes:

  • Basic clustering concepts, hardware components, and architectural models

  • A step-by-step cluster creation process: design, installation, and testing

  • Choosing and implementing the optimal hardware configuration for your environment

  • Life in the fast LAN: high-speed cluster interconnects

  • Software issues: distributions, bootup, disks, partitioning, file systems, middleware, and more



Table of Contents

    1. Copyright
      1. Dedication
    2. Praise for Building Clustered Linux Systems
    3. Hewlett-Packard® Professional Books
    4. List of Figures
    5. List of Tables
    6. Preface
      1. About This Book
      2. Notation and Conventions
      3. Using This Book
      4. Production Information
    7. Acknowledgments
    8. Introduction
    9. I. Introduction to Cluster Concepts
      1. 1. Parallel Power: Defining the Clustered System Approach
        1. 1.1. Avoiding Difficulties with the Word Cluster
        2. 1.2. Defining a Cluster
        3. 1.3. The Evolution of a Clustered Solution
          1. 1.3.1. Uniprocessor Systems (UPs)
          2. 1.3.2. SMP Systems
          3. 1.3.3. Networks of Independent Systems
            1. 1.3.3.1. The Introduction of Microprocessors
            2. 1.3.3.2. Evolution of Network Connections
            3. 1.3.3.3. Remote Procedure Calls
            4. 1.3.3.4. Tying Everything Together
        4. 1.4. Collapsed Network Computing for Engineering
        5. 1.5. Scientific Cluster Computing
          1. 1.5.1. An Example Parallel Problem
          2. 1.5.2. Refining the Parallel Example
          3. 1.5.3. Software Communication Facilities
          4. 1.5.4. High-Speed Interconnect (HSI)
        6. 1.6. Revisiting the Definition of Cluster
        7. 1.7. Commercial Cluster Computing
        8. 1.8. High Performance, High Throughput, and High Availability
        9. 1.9. A Formal Definition of Cluster
        10. 1.10. The Why and Wherefore of Clusters
        11. 1.11. Summary
      2. 2. One Step at a Time: A Process for Building Clusters
        1. 2.1. Building Clusters as a Complex Endeavor
        2. 2.2. Talking about the “P Word”
        3. 2.3. Presenting a Formal Cluster Creation Process
          1. 2.3.1. Phase 1: Cluster Solution Design
            1. 2.3.1.1. Technical Analysis
            2. 2.3.1.2. Preliminary Solution Design
            3. 2.3.1.3. Final Solution Design
          2. 2.3.2. Phase 2: Cluster Installation
            1. 2.3.2.1. Site Preparation
            2. 2.3.2.2. Physical Hardware Assembly
            3. 2.3.2.3. Software Installation and Configuration
          3. 2.3.3. Phase 3: Cluster Testing
            1. 2.3.3.1. Cluster Operational Testing
            2. 2.3.3.2. Cluster Acceptance
            3. 2.3.3.3. Full Operation and Release to Production
        4. 2.4. Formal Cluster Process Summary
    10. II. Cluster Architecture and Hardware Components
      1. 3. Underneath the Hood: Cluster Hardware Components and Architecture
        1. 3.1. Hardware Categories in a Cluster
          1. 3.1.1. Passive Hardware Elements in a Cluster
          2. 3.1.2. Active Hardware Elements in a Cluster
          3. 3.1.3. Cluster Resources and the “Outside” World
        2. 3.2. A Survey of Cluster Hardware Configurations
        3. 3.3. High-Throughput Cluster Configurations
          1. 3.3.1. A “Carpet” Cluster
          2. 3.3.2. Compute “Farms and Ranches”
        4. 3.4. High-Availability Cluster Configurations
          1. 3.4.1. An Example “Virtual” Web Server
          2. 3.4.2. A Parallel Database Server
        5. 3.5. High-Performance Cluster Configurations
          1. 3.5.1. A Visualization Cluster
          2. 3.5.2. High-Performance Parallel Application Configurations
        6. 3.6. Common Cluster Hardware Architecture
        7. 3.7. Cluster Hardware Architecture Summary
      2. 4. Any Way You Slice It: Work and Master Nodes in a Cluster
        1. 4.1. Criteria for Selecting Compute Slices
        2. 4.2. An Example Compute Slice from Hewlett-Packard
          1. 4.2.1. Analysis of the Example Compute Slice
          2. 4.2.2. Comparing the Example Compute Slice with Similar Systems
          3. 4.2.3. Example Clusters Using Our Compute Slices
        3. 4.3. Thirty-two Bit and 64-Bit Compute Slices
          1. 4.3.1. Physical RAM Addressing
          2. 4.3.2. Process Virtual Address Space
          3. 4.3.3. Software Implications of 64-Bit Hardware
        4. 4.4. Memory Bandwidth
        5. 4.5. Memory and Cache Latency
        6. 4.6. Number of Processors in a Compute Slice
        7. 4.7. I/O Interface Capacity and Performance
          1. 4.7.1. PCI Implementation
          2. 4.7.2. Accelerated Graphics Port
        8. 4.8. Compute Slice Operating System Support
        9. 4.9. Master Node Characteristics
        10. 4.10. Compute Slice and Master Node Summary
      3. 5. Packet In: Cluster Networking Basics and Example Devices
        1. 5.1. A Short View of Ethernet Networking History
        2. 5.2. The Open System Interconnect (OSI) Communication Model
        3. 5.3. Ethernet Network Topologies
          1. 5.3.1. Ethernet Frames
          2. 5.3.2. Ethernet Hubs
          3. 5.3.3. Network Routers
        4. 5.4. Internet Protocol and Addressing
          1. 5.4.1. IP and TCP/UDP
          2. 5.4.2. IP Addressing
          3. 5.4.3. IP Subnetting
          4. 5.4.4. IP Supernetting
          5. 5.4.5. Ethernet Unicast, Multicast, and Broadcast Frames
          6. 5.4.6. Address Resolution Protocol (ARP)
          7. 5.4.7. IPv4 and IPv6
          8. 5.4.8. Private, Nonroutable Network Addresses
        5. 5.5. Ethernet Switching Technology
          1. 5.5.1. Half and Full Duplex Operation
          2. 5.5.2. Store and Forward versus Cut-through Switching
          3. 5.5.3. Collision Domains and Switching
          4. 5.5.4. Link Aggregation
          5. 5.5.5. Virtual LANs
          6. 5.5.6. Jumbo Frames
          7. 5.5.7. Managed versus Unmanaged Switches
        6. 5.6. Example Switches
          1. 5.6.1. A GbE Edge Switch
          2. 5.6.2. Ethernet Core Switches
        7. 5.7. Ethernet Networking Summary
      4. 6. Tying It Together: Cluster Data, Management, and Control Networks
        1. 6.1. Networked System Management and Serial Port Access
          1. 6.1.1. Remote System Management Access
          2. 6.1.2. Keyboard, Video, and Mouse Switches
          3. 6.1.3. Serial Port Concentrators or Switches
        2. 6.2. Cluster Ethernet Network Design
          1. 6.2.1. Choosing a Clusterwide IP Address Scheme
          2. 6.2.2. IP Addressing Conventions
          3. 6.2.3. Using Nonroutable Network Addresses
        3. 6.3. An Example Cluster Ethernet Network Design
          1. 6.3.1. Choosing the Type of Network and Address Ranges
          2. 6.3.2. Device Addressing Schemes
          3. 6.3.3. The Management and Control Networks
          4. 6.3.4. The Data Network
          5. 6.3.5. Example IP Address Assignments
        4. 6.4. Cluster Network Design Summary
      5. 7. Life in the Fast LAN: HSIs and Your Cluster
        1. 7.1. HSIs
        2. 7.2. HSI Latency and Bandwidth
        3. 7.3. Examining HSI Topologies
          1. 7.3.1. Some Common Topologies
          2. 7.3.2. Cross-Sectional Bandwidth
          3. 7.3.3. Clos Networks
          4. 7.3.4. Fat Tree Networks
        4. 7.4. Ethernet for HSI
          1. 7.4.1. An Example Ethernet HSI Network
          2. 7.4.2. Direct Attach Example Bandwidth
          3. 7.4.3. Multilevel Attach Example Bandwidth
          4. 7.4.4. A Larger Ethernet HSI Example
          5. 7.4.5. Other Ethernet HSI Configurations
        5. 7.5. Myricom's Myrinet HSI
        6. 7.6. Infiniband
        7. 7.7. Dolphin
        8. 7.8. Quadrics QsNet
        9. 7.9. HSI Technology Summary and Comparison
    11. III. Cluster Software Architecture
      1. 8. The Right Stuff: Linux as the Basis for Clusters
        1. 8.1. Choosing a Cluster Operating System
          1. 8.1.1. Hardware Support
          2. 8.1.2. Operating System Stability
          3. 8.1.3. Software License Costs
          4. 8.1.4. Manageability
          5. 8.1.5. Software Flexibility
          6. 8.1.6. Openness
          7. 8.1.7. Scalability
          8. 8.1.8. Software Availability and Cost
          9. 8.1.9. Multiple Support Options
        2. 8.2. Introducing the Linux Operating System and Licensing
        3. 8.3. Linux Distributions
        4. 8.4. Managing Open-Source Software “Churn”
        5. 8.5. Commercial Linux Distributions
          1. 8.5.1. Red Hat Linux
          2. 8.5.2. SUSE Linux
          3. 8.5.3. Conclusions about Commercial Linux Distributions
        6. 8.6. Free Linux Distributions
          1. 8.6.1. The Fedora Project
          2. 8.6.2. Debian Linux
          3. 8.6.3. Conclusions about Free Distributions
        7. 8.7. Conclusions about Linux for Clusters
      2. 9. Round and Round It Goes: Booting, Disks, Partitioning, and Local File Systems
        1. 9.1. Disk Partitioning, Booting, and the BIOS
          1. 9.1.1. Default Disk Partitioning
          2. 9.1.2. A Brief Note on IA-64 Disk Partitioning
          3. 9.1.3. Red Hat Linux Boot Loaders
        2. 9.2. Booting the Linux Kernel
        3. 9.3. The Linux Initial RAM Disk Image
        4. 9.4. Linux Local Disk Storage
          1. 9.4.1. Using the Software RAID 5 Facility
          2. 9.4.2. Using Software RAID 1 for System Disks
          3. 9.4.3. RAID Multipath
          4. 9.4.4. Recovering from Software RAID Failures
            1. 9.4.4.1. Saving the Disk Partition Table
            2. 9.4.4.2. Determining Software RAID Array Status
            3. 9.4.4.3. Using mdadm in Place of raidtools
            4. 9.4.4.4. Monitoring Arrays with mdadm
        5. 9.5. Linux File System Types
        6. 9.6. The Linux /proc and devfs Pseudo File Systems
        7. 9.7. The Linux ext2 and ext3 Physical File Systems
          1. 9.7.1. File System Volume Labels
          2. 9.7.2. Creating the Example ext3 File System
          3. 9.7.3. Linux ext3 Journal Behavior and Options
          4. 9.7.4. The ext File System Stride Option for RAID
        8. 9.8. Standard Mount Options for All File Systems
        9. 9.9. The Temporary File System
        10. 9.10. Other Available File System Types
        11. 9.11. Advanced Performance Tuning
        12. 9.12. A Word about SMART Monitoring for Disks
        13. 9.13. Local Disks and File Systems Summary
      3. 10. Supporting Role: Infrastructure Services and Administration
        1. 10.1. The Big Infrastructure Picture
        2. 10.2. Initializing Your Cluster's Software Infrastructure
        3. 10.3. Infrastructure Implementation Recommendations
          1. 10.3.1. Avoiding Service Interference
          2. 10.3.2. Redundant Copies of Essential Services
          3. 10.3.3. Services with Fall-Back Capabilities
          4. 10.3.4. Single-Point Administration
          5. 10.3.5. Choosing Efficient Services
          6. 10.3.6. Management of Configuration Information
        4. 10.4. Protecting Active Configuration Information
        5. 10.5. Preparation for Infrastructure Installation
          1. 10.5.1. Order of Installation
          2. 10.5.2. Steps for Installing Infrastructure Services
          3. 10.5.3. Loading the Linux Operating System Distribution
        6. 10.6. Networking
          1. 10.6.1. Configuring Ethernet Switching Equipment
          2. 10.6.2. Network Aliases
          3. 10.6.3. Channel Bonding
          4. 10.6.4. Setting the Ethernet Link MTU Size
          5. 10.6.5. The Media-Independent Interface (MII) Tool
        7. 10.7. Enabling and Starting Linux Services
        8. 10.8. Time Synchronization
        9. 10.9. Name Services
          1. 10.9.1. Host Naming Conventions
          2. 10.9.2. The Name Service Switch File
          3. 10.9.3. The Hosts File
          4. 10.9.4. The DNS
          5. 10.9.5. The NIS
            1. 10.9.5.1. NIS Server Configuration
            2. 10.9.5.2. Modifying the NIS Slave Server List
            3. 10.9.5.3. NIS Slave Server Configuration
            4. 10.9.5.4. NIS Client Systems
            5. 10.9.5.5. Special NIS Configuration Options
            6. 10.9.5.6. Adding Custom NIS Maps
            7. 10.9.5.7. NIS Testing
            8. 10.9.5.8. NIS Summary
          6. 10.9.6. Name Resolution Recommendations
        10. 10.10. Infrastructure Services Summary
      4. 11. Reach Out and Access Something: Remote Access Services, DHCP, and System Logging
        1. 11.1. Continuing Infrastructure Installation
        2. 11.2. “Traditional” User Login and Authentication
          1. 11.2.1. Using Groups and Directory Permissions
          2. 11.2.2. Distributing Password Information with NIS
          3. 11.2.3. Introducing Kerberos
          4. 11.2.4. Configuring a Kerberos KDC on Linux
          5. 11.2.5. Creating a Kerberos Slave KDC
          6. 11.2.6. Kerberos Summary
        3. 11.3. Remote Access Services
        4. 11.4. Using BSD Remote Access Services
        5. 11.5. Kerberized Versions of BSD/ARPA Remote Services
        6. 11.6. The Secure Shell
          1. 11.6.1. SSH and Public Key Encryption
          2. 11.6.2. Configuring the SSH Client and Server
          3. 11.6.3. Configuring User Identity for SSH
          4. 11.6.4. SSH Host Keys, and Known and Authorized Hosts
          5. 11.6.5. Using the Authorized Keys File
          6. 11.6.6. Fine-Tuning SSH Access
          7. 11.6.7. SSH scp and sftp Commands
          8. 11.6.8. SSH Forwarding
          9. 11.6.9. SSH Summary
        7. 11.7. The Parallel Distributed Shell
          1. 11.7.1. Getting and Installing PDSH
          2. 11.7.2. Compiling PDSH to Use SSH
          3. 11.7.3. Using PDSH in Your Cluster
          4. 11.7.4. PDSH Summary
        8. 11.8. Configuring DHCP
          1. 11.8.1. Client-side DHCP Information
          2. 11.8.2. Configuring the DHCP Server
        9. 11.9. Logging System Activity
          1. 11.9.1. Operation of the System Logging Daemon
          2. 11.9.2. Kernel Message Logging
          3. 11.9.3. Enabling Remote Logging
          4. 11.9.4. Using logrotate to Archive Log Files
          5. 11.9.5. Using logwatch Reporting
          6. 11.9.6. An Example Subsystem Logging Design
          7. 11.9.7. Linux System Logging Summary
        10. 11.10. Access and Logging Services Summary
      5. 12. Installment Plan: Introduction to Compute Slice Configuration and Installation
        1. 12.1. Compute Slice Configuration Considerations
        2. 12.2. One Thousand Pieces Flying in Close Formation
        3. 12.3. The Single-System View
          1. 12.3.1. Shared System Structure, Individual System Personality
          2. 12.3.2. Accomplishing Shared System Structure
          3. 12.3.3. Compute Slice Software Requirements
        4. 12.4. A Generalized Network Boot Facility: pxelinux
          1. 12.4.1. Configuring TFTP for Booting
          2. 12.4.2. Configuring the pxelinux Software
          3. 12.4.3. The pxelinux Configuration Files
        5. 12.5. Configuring Network kickstart
          1. 12.5.1. The kickstart File Format
          2. 12.5.2. Making the Install Media Available for kickstart
          3. 12.5.3. The Network kickstart Directory
        6. 12.6. NFS Diskless Configuration
          1. 12.6.1. The Linux Terminal Server Project (LTSP)
          2. 12.6.2. Cluster NFS
        7. 12.7. Introduction to Compute Slice Installation Summary
      6. 13. Improving Your Images: System Installation with SystemImager
        1. 13.1. Using the SystemImager Software
          1. 13.1.1. Downloading and Installing SI
          2. 13.1.2. Configuring the SI Server
          3. 13.1.3. The SI Cold Installation Boot Process
          4. 13.1.4. SI Server Commands
          5. 13.1.5. Installing and Configuring the SI Client Software
          6. 13.1.6. Capturing a Client Image
          7. 13.1.7. Forcing Hardware-to-Driver Mapping with SystemConfigurator
          8. 13.1.8. Installing a Client Image
          9. 13.1.9. Updating Client Software without Reinstalling
          10. 13.1.10. Image Management and Naming
          11. 13.1.11. Avoiding the Big MAC-Gathering Syndrome
          12. 13.1.12. Summary
        2. 13.2. Multicast Installation
          1. 13.2.1. Multicast Basics
          2. 13.2.2. An Open-Source Multicast Facility: udpcast
          3. 13.2.3. A Simple Multicast Example
          4. 13.2.4. A More Complex Example
          5. 13.2.5. Command-line Prototyping with Multicast
          6. 13.2.6. Prototyping a Network Multicast Installation
          7. 13.2.7. Making More Modifications
          8. 13.2.8. Generalizing the Multicast Installation Prototype
          9. 13.2.9. Triggering a Multicast Installation
        3. 13.3. The SI flamethrower Facility
          1. 13.3.1. Installing flamethrower
          2. 13.3.2. Activating flamethrower
          3. 13.3.3. Additional SI Functionality in Version 3.2.0
        4. 13.4. System Installation with SI Summary
      7. 14. To Protect and Serve: Providing Data to Your Cluster
        1. 14.1. Introduction to Cluster File Systems
          1. 14.1.1. Cluster File System Requirements
          2. 14.1.2. Networked File System Access
          3. 14.1.3. Parallel File System Access
        2. 14.2. The NFS
          1. 14.2.1. Enabling NFS on the Server
          2. 14.2.2. Adjusting NFS Mount Daemon Protocol Behavior
          3. 14.2.3. Tuning the NFS Server Network Parameters
          4. 14.2.4. NFS and TCP Wrappers
          5. 14.2.5. Exporting File Systems on the NFS Server
          6. 14.2.6. Starting the NFS Server Subsystem
          7. 14.2.7. NFS Client Mount Parameters
          8. 14.2.8. Using autofs on NFS Clients
          9. 14.2.9. NFS Summary
        3. 14.3. A Survey of Some Open-Source Parallel File Systems
          1. 14.3.1. The Parallel Virtual File System (PVFS)
          2. 14.3.2. The Open Global File System (OpenGFS)
          3. 14.3.3. The Lustre File System
        4. 14.4. Commercially Available Cluster File Systems
          1. 14.4.1. Red Hat Global File System (GFS)
          2. 14.4.2. The PolyServe Matrix File System
          3. 14.4.3. Oracle Cluster File System (OCFS)
        5. 14.5. Cluster File System Summary
      8. 15. Stuck in the Middle: Cluster Middleware
        1. 15.1. Introduction to Cluster Middleware
          1. 15.1.1. Describing the Parallel Application Execution Environment
          2. 15.1.2. The HSI Message-Passing Facility
          3. 15.1.3. Load Balancing or Job Scheduling
          4. 15.1.4. Cluster Resource Management
          5. 15.1.5. Custom Scheduling
          6. 15.1.6. Monitoring, Measuring, and Managing Your Cluster
        2. 15.2. The MPICH Library
          1. 15.2.1. Introduction to MPICH
          2. 15.2.2. Downloading and Installing MPICH
          3. 15.2.3. Using mpirun
          4. 15.2.4. Special Versions of MPICH
          5. 15.2.5. MPICH Summary
        3. 15.3. The Simple Linux Utility for Resource Management
        4. 15.4. The Maui Scheduler
          1. 15.4.1. Maui Scheduler Software Architecture
          2. 15.4.2. Job Scheduling in Maui
          3. 15.4.3. Maui Scheduler Summary
        5. 15.5. The Ganglia Distributed Monitoring and Execution System
          1. 15.5.1. The Ganglia Software Architecture
          2. 15.5.2. Introducing RRD Software: rrdtool
          3. 15.5.3. Downloading and Installing Ganglia Software
          4. 15.5.4. Ganglia's gmond and gmetad Daemons
          5. 15.5.5. Adding Your Own Ganglia Metrics
          6. 15.5.6. Parallel Authentication with authd and gexec
          7. 15.5.7. Starting Parallel Programs with gexec
          8. 15.5.8. Ganglia Summary
        6. 15.6. Monitoring with Nagios
          1. 15.6.1. Explaining Nagios
          2. 15.6.2. Downloading and Installing Nagios
          3. 15.6.3. Configuring the Web Server for Nagios
          4. 15.6.4. Configuring and Using Nagios
          5. 15.6.5. Nagios Summary
        7. 15.7. Cluster Middleware Summary
        8. 15.8. An Afterword on Linux High-Availability and Open-Source
      9. 16. Put Tab A in Slot C: OSCAR, Rocks, OpenMOSIX, and the Globus Toolkit
        1. 16.1. Introducing Cluster-Building Toolkits
        2. 16.2. General Cluster Toolkit Installation Process
        3. 16.3. Installing a Cluster with OSCAR
          1. 16.3.1. OSCAR Initial Software Installation and Configuration
          2. 16.3.2. The OSCAR Installation Wizard
          3. 16.3.3. OSCAR Package Configuration
          4. 16.3.4. Building an OSCAR Compute Slice Image
          5. 16.3.5. Defining and Installing OSCAR Clients
          6. 16.3.6. Completing the OSCAR Installation
          7. 16.3.7. Adding and Deleting OSCAR Clients
          8. 16.3.8. OSCAR Summary
        4. 16.4. Installing a Cluster with NPACI Rocks
          1. 16.4.1. Getting the Rocks Software
          2. 16.4.2. Installing a Cluster Front-End Node Using Rocks
          3. 16.4.3. Completing the Installation
          4. 16.4.4. Rocks System Administration
          5. 16.4.5. Rocks Summary
        5. 16.5. The OpenMOSIX Project
          1. 16.5.1. Getting and Installing OpenMOSIX
          2. 16.5.2. Configuration of OpenMOSIX Clusters
          3. 16.5.3. OpenMOSIX Summary
        6. 16.6. Introduction to the Grid Concept
        7. 16.7. The Globus Toolkit
        8. 16.8. Cluster-Building Toolkit Summary
    12. IV. Building and Deploying Your Cluster
      1. 17. Dollars and Sense: Cluster Economics
        1. 17.1. Initial Perceptions
        2. 17.2. Setting the Ground Rules
        3. 17.3. Cluster Cabling and Complexity
        4. 17.4. Eight-Compute Slice Cluster Hardware Costs
        5. 17.5. Sixteen-Compute Slice Cluster Hardware Costs
        6. 17.6. Thirty-two-Compute Slice Hardware Costs
        7. 17.7. Sixty-four-Compute Slice Hardware Costs
        8. 17.8. One Hundred Twenty-eight-Compute Slice Hardware Costs
        9. 17.9. The Land beyond 128 Compute Slices
        10. 17.10. Hardware Cost Trends and Analysis
        11. 17.11. Cluster Economics Summary
      2. 18. Racking Your Brains: Example Cluster Rack Assembly Steps
        1. 18.1. Examining the Cluster Assembly Process
        2. 18.2. Assembly Assumptions
        3. 18.3. Some “Rules of Thumb” for Physical Cluster Assembly
        4. 18.4. Detailed Cluster Assembly Steps
          1. 18.4.1. Physical Rack Assembly
          2. 18.4.2. Physical Management Rack Assembly
          3. 18.4.3. Physical Compute Rack Assembly
          4. 18.4.4. Physical Compute Rack System Installation
          5. 18.4.5. Physical Rack Final Assembly and Checkout
          6. 18.4.6. Individual System Checkout
          7. 18.4.7. Physical Rack Cleanup
          8. 18.4.8. Physical Rack Positioning
          9. 18.4.9. Interrack Configuration
          10. 18.4.10. Interrack Cabling
          11. 18.4.11. Final Cluster Hardware Assembly and Checkout
            1. 18.4.11.1. Master Rack Power-on
            2. 18.4.11.2. Compute Rack Power-on
            3. 18.4.11.3. Clusterwide Hardware Verification
        5. 18.5. Learning from the Example Steps
          1. 18.5.1. Finding Efficiencies in Cluster Construction
          2. 18.5.2. Parallelism in Rack Verification and Checkout
          3. 18.5.3. Parallelism in Interrack Cabling
          4. 18.5.4. Types of Teams and Specific Skills
        6. 18.6. Physical Assembly Conclusions
      3. 19. Getting Your Cluster Wired: An Example Cable-Labeling Scheme
        1. 19.1. Defining the Cable Problem
        2. 19.2. Different Classes of Cabling
          1. 19.2.1. Intrarack Cables
          2. 19.2.2. Interrack Cables
        3. 19.3. A First Pass at a Cable-Labeling Scheme
        4. 19.4. Refining the Cable Documentation Scheme
          1. 19.4.1. Labeling Cable Ends
          2. 19.4.2. Tracking and Documenting the Connections
        5. 19.5. Calculating the Work in Cable Installation
        6. 19.6. Minimizing Interrack Cabling
        7. 19.7. Cable Labeling System Summary
      4. 20. Physical Constraints: Heat, Space, and Power
        1. 20.1. Identifying Physical Constraints for Your Cluster
        2. 20.2. Space, the Initial Frontier
        3. 20.3. Power-Up Requirements
        4. 20.4. System Power Utilization
        5. 20.5. Taking the Heat
        6. 20.6. Physical Constraints Summary
      5. A. Acronym List
      6. B. List of URLs and Software Sources
        1. B.1. Cluster Construction Tool Kits
        2. B.2. Cluster Design Tools
        3. B.3. Conversion Factors
        4. B.4. File Systems and Volume Management
        5. B.5. General Linux Software
        6. B.6. Grid Tool Kits and Software
        7. B.7. Hardware Vendors
        8. B.8. High-Availability Software
        9. B.9. High-Performance Graphics
        10. B.10. HSI Technologies
        11. B.11. Java Software for Linux
        12. B.12. Linux Distributions and Open-Source License Examples
        13. B.13. Monitoring and Event Generation Software and Dependencies
        14. B.14. Networking Software, Hardware, and Examples
        15. B.15. Open-Source Databases
        16. B.16. Parallel Applications and Development Tools
        17. B.17. Parallel Application Examples
        18. B.18. Performance Benchmarks and Lists
        19. B.19. Protocols and Messaging Libraries
        20. B.20. Resource Management, Parallel Execution, and Scheduling
        21. B.21. Security and Encryption
        22. B.22. System Installation and Management Tools
    13. Glossary
    14. Bibliography