15.7. The Research Accelerator for Multiple Processors Project

During the 32nd International Symposium on Computer Architecture (ISCA), held in Madison, Wisconsin, in June 2005, a group of attendees talking in the hallways between sessions reached several significant conclusions:

  1. Most computer architects believe that microprocessor cores will be used in large numbers (hundreds or even thousands) in future chip designs. (This conclusion echoes ITRS:2005.)

  2. Straightforward [ad hoc] approaches to the architecture of multiple-core processor systems are adequate for small designs (2 to 4 cores), but little formal research has been conducted on rigorous ways to build, program, or manage large systems containing 64 to 1024 processors.

  3. Current compilers, operating systems, and multiprocessor (MP) hardware architectures are not ready to handle hundreds or thousands of processors.

  4. The computer-architecture research community lacks the basic infrastructure tools required to carry out such MP research.

These realizations led to the creation of the RAMP (Research Accelerator for Multiple Processors) project, which has broad university and industry support. The purpose of the RAMP project is to develop a standardized FPGA-based hardware platform that allows researchers to experiment with processor arrays containing as many as 1024 processors. Design and development problems that might be invisible in 32- or even 128-processor arrays can become significant in 1024-processor arrays. The RAMP hardware will consist of several boards, each carrying several large Xilinx FPGAs.

Although other methods of simulating parallel-processing systems exist, they suffer from the following problems (which RAMP will attempt to solve):

  • Slowness. Software simulations of processors are relatively slow. RAMP systems will be fast by comparison because they emulate processors and processor interconnect in FPGA hardware.

  • Target inflexibility. Simulators based on fixed-core processors emulate processors with different ISAs slowly. The RAMP fabric can more easily model processors with different ISAs, a common situation with configurable processors such as the Xtensa processor core.

  • System-level model inaccuracy. RAMP is designed to emulate large processor arrays “in their full glory.” Many other MP simulations sacrifice model accuracy to gain simulation speed.

  • Scalability. RAMP can be expanded with additional FPGAs and FPGA boards using a standardized inter-board communications link.

  • Unbalanced computation and communication. Clustered computers, for example, employ layered, bit-serial communication channels between processors, and these channels are relatively slow. The RAMP communication channels, contained in the FPGA matrix, can be wide and fast.
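
The last point, the contrast between a bit-serial channel and a wide on-chip channel, can be illustrated with a back-of-the-envelope calculation. The sketch below is only a rough model: it assumes transfer time equals a fixed latency plus message size divided by peak bandwidth. The function names and every link parameter (widths, clock rates, latencies) are hypothetical values chosen for illustration, not figures from the RAMP project.

```python
def link_bandwidth_bytes_per_s(width_bits: int, clock_hz: float) -> float:
    """Peak bandwidth of a link that moves width_bits on every clock cycle."""
    return width_bits * clock_hz / 8


def transfer_time_s(message_bytes: int, width_bits: int, clock_hz: float,
                    latency_s: float = 0.0) -> float:
    """Time to move one message: fixed latency plus serialization time."""
    bandwidth = link_bandwidth_bytes_per_s(width_bits, clock_hz)
    return latency_s + message_bytes / bandwidth


# Hypothetical link parameters, chosen only for illustration.
# A layered, bit-serial cluster link: 1 bit per cycle, fast clock, high latency.
serial_link = dict(width_bits=1, clock_hz=1e9, latency_s=10e-6)
# A wide channel inside an FPGA matrix: 64 bits per cycle, slower clock, low latency.
wide_link = dict(width_bits=64, clock_hz=100e6, latency_s=100e-9)

message_bytes = 64  # assumed message size, roughly one cache line

serial_time = transfer_time_s(message_bytes, **serial_link)
wide_time = transfer_time_s(message_bytes, **wide_link)

print(f"bit-serial link: {serial_time * 1e6:.3f} us per message")
print(f"wide link:       {wide_time * 1e6:.3f} us per message")
print(f"ratio:           {serial_time / wide_time:.1f}x")
```

Under these assumed numbers, the wide link wins on small messages even though its clock is ten times slower, because width and low latency dominate. Different assumptions would give different ratios.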

A few of the research problems that could be studied using RAMP hardware include:

  • Development and optimization of thread-scheduling and data-allocation/migration algorithms for large MP systems.

  • Development and evaluation of MP dataflow architectures.

  • Evaluation of NoC architectures for large MP systems.

  • Development and evaluation of fault-tolerant MP hardware and software.

  • Testing the effectiveness of task-specialized processors in MP arrays.

Although PC-based computer clusters can also be used to study some of these problems, clusters cost far more per processor and the inter-processor communications between computers in a cluster are much slower than communications between processors in a closely coupled FPGA matrix.
