If a problem can be solved in n time units on a computer with one processor (von Neumann machine), can it be solved in n/2 time units on a computer with two processors, or n/3 on a computer with three processors? This question has led to the rise of parallel computing architectures.
There are four general forms of parallel computing: bit level, instruction level, data level, and task level.
Bit-level parallelism is based on increasing the word size of a computer. In an 8-bit processor, an operation on a 16-bit data value would require two operations: one for the upper 8 bits and one for the lower 8 bits. A 16-bit processor could do the operation in one instruction. Thus, increasing the word size reduces the number of operations on data values larger than the word size. The current trend is to use 64-bit processors.
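The two-operation case described above can be made concrete with a small Python sketch (illustrative only; a real 8-bit processor does this in hardware with an add-with-carry instruction):

```python
def add16_on_8bit(a, b):
    """Add two 16-bit values using only 8-bit operations, as an
    8-bit processor would: the lower bytes first, then the upper
    bytes plus the carry produced by the low-byte addition."""
    lo = (a & 0xFF) + (b & 0xFF)                         # operation 1: lower 8 bits
    carry = lo >> 8                                      # carry out of the low byte
    hi = ((a >> 8) & 0xFF) + ((b >> 8) & 0xFF) + carry   # operation 2: upper 8 bits
    return ((hi & 0xFF) << 8) | (lo & 0xFF)              # result truncated to 16 bits

print(hex(add16_on_8bit(0x12F0, 0x0020)))  # 0x1310
```

A 16-bit (or wider) processor performs the same addition in a single instruction, which is exactly the saving that a larger word size buys.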
Instruction-level parallelism is based on the idea that some instructions in a program can be carried out independently, in parallel. For example, if a program requires operations on unrelated data, these operations can be done at the same time. A superscalar processor is one that can recognize this situation and take advantage of it by sending instructions to different functional units of the processor. Note that a superscalar machine does not have multiple processors but does have multiple execution resources. For example, it might contain separate ALUs for working on integers and real numbers, enabling it to simultaneously compute the sum of two integers and the product of two real numbers. Such resources are called execution units.
Data-level parallelism is based on the idea that a single set of instructions can be run on different data sets at the same time. This type of parallelism is called SIMD (single instruction, multiple data) and relies on a control unit directing multiple ALUs to carry out the same operation, such as addition, on different sets of operands. This approach, which is also called synchronous processing, is effective when the same process needs to be applied to many data sets. For example, increasing the brightness of an image involves adding a value to every one of several million pixels. These additions can all be done in parallel. See FIGURE 5.8.
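The brightness example can be sketched as follows; the list comprehension stands in for the lockstep hardware, since on SIMD hardware one instruction would update many pixels at once:

```python
def brighten(pixels, amount):
    """Apply the same operation (add `amount`) to every pixel,
    clamping at 255 because image pixels are typically 8-bit values.
    This is the operation SIMD hardware would apply to many pixels
    simultaneously under a single instruction."""
    return [min(p + amount, 255) for p in pixels]

print(brighten([10, 200, 250], 20))  # [30, 220, 255]
```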
Task-level parallelism is based on the idea that different processors can execute different tasks on the same or different data sets. If the different processors are operating on the same data set, then it is analogous to pipelining in a von Neumann machine. When this organization is applied to data, the first processor does the first task. Then the second processor starts working on the output from the first processor, while the first processor applies its computation to the next data set. Eventually, each processor is working on one phase of the job, each getting material or data from the previous stage of processing, and each in turn handing over its work to the next stage. See FIGURE 5.9.
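The pipeline organization just described can be sketched with threads and queues, each queue handing work from one stage to the next. The two stage tasks here (doubling, then adding one) are hypothetical stand-ins for real processing phases:

```python
import queue
import threading

def stage(task, inbox, outbox):
    """One pipeline stage: take an item from the previous stage,
    apply this stage's task, and hand the result to the next stage."""
    while True:
        item = inbox.get()
        if item is None:        # sentinel value: shut this stage down
            outbox.put(None)
            return
        outbox.put(task(item))

q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
threading.Thread(target=stage, args=(lambda x: x * 2, q1, q2)).start()
threading.Thread(target=stage, args=(lambda x: x + 1, q2, q3)).start()

for x in [1, 2, 3]:             # feed data sets into the first stage
    q1.put(x)
q1.put(None)

results = []
while (r := q3.get()) is not None:
    results.append(r)
print(results)  # [3, 5, 7]
```

As in the text, once the pipeline fills, both stages are busy at the same time, each working on a different data set.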
In a data-level environment, each processor is doing the same thing to a different data set. For example, each processor might be computing the grades for a different class. In the pipelining task-level example, each processor is contributing to the grade for the same class. Another approach to task-level parallelism is to have different processors doing different things with different data. This configuration allows processors to work independently much of the time, but introduces problems of coordination among the processors. This leads to a configuration in which each processor has its own local memory as well as access to a common shared memory. The processors use the shared memory for communication, so the configuration is called a shared-memory parallel processor. See FIGURE 5.10.
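The local-versus-shared-memory split, and the coordination problem it brings, can be sketched with threads (a sketch only; in real shared-memory hardware the split is physical, not a Python dictionary):

```python
import threading

shared = {}                  # shared memory: visible to every worker
lock = threading.Lock()      # coordination: only one worker writes at a time

def worker(wid, data):
    local_total = sum(data)  # local memory: private to this worker
    with lock:               # communicate the result through shared memory
        shared[wid] = local_total

threads = [threading.Thread(target=worker, args=(i, d))
           for i, d in enumerate([[1, 2], [3, 4], [5, 6]])]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(shared.values()))  # 21
```

Each worker computes independently in its local memory and uses the shared memory only to exchange results, which is exactly the communication role the text describes.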
The classes of parallel hardware reflect the various types of parallel computing. Multicore processors have multiple independent cores, usually CPUs. Whereas a superscalar processor can issue multiple instructions to the execution units of a single core, a multicore processor can do so on every core. That is, each independent core can have multiple execution units attached to it.
Symmetric multiprocessors (SMPs) have multiple identical cores. They share memory, and a bus connects them. The number of cores in an SMP is usually limited to about 32. A distributed computer is one in which multiple memory units are connected through a network. A cluster is a group of stand-alone machines connected through an off-the-shelf network. A massively parallel processor is a computer with many networked processors connected through a specialized network. This kind of device usually has more than 1000 processors.
The distinctions between the classes of parallel hardware are being blurred by modern systems. A typical processor chip today contains two to eight cores that operate as an SMP. These are then connected via a network to form a cluster. Thus, it is common to find a mix of shared and distributed memory in parallel processing. In addition, graphics processors that support general-purpose data-parallel processing may be connected to each of the multicore processors. Given that each of the cores is also applying instruction-level parallelism, you can see that modern parallel computers no longer fall into one or another specific classification. Instead, they typically embody all of the classes at once. They are distinguished by the particular balance that they strike among the different classes of parallel processing they support. A parallel computer that is used for science may emphasize data parallelism, whereas one that is running an Internet search engine may emphasize task-level parallelism.
The components that make up a computer cover a wide range of devices. Each component has characteristics that dictate how fast, large, and efficient it is. Furthermore, each component plays an integral role in the overall processing of the machine.
The world of computing is filled with jargon and acronyms. The speed of a processor is specified in GHz (gigahertz); the amount of memory is specified in MB (megabytes), GB (gigabytes), and TB (terabytes); and a display screen is specified in pixels.
The von Neumann architecture is the underlying architecture of most of today’s computers. It has five main parts: memory, the arithmetic/logic unit (ALU), input devices, output devices, and the control unit. The fetch–execute cycle, under the direction of the control unit, is the heart of the processing. In this cycle, instructions are fetched from memory, decoded, and executed.
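The fetch–execute cycle can be sketched as a toy machine in a few lines of Python (the instruction names and the accumulator are hypothetical simplifications, not any real instruction set):

```python
# A toy von Neumann machine: memory holds the program; the control
# loop fetches, decodes, and executes instructions until HALT.
memory = [
    ("LOAD", 5),    # put 5 in the accumulator
    ("ADD", 7),     # add 7 to the accumulator
    ("HALT", None),
]
acc = 0             # accumulator: the ALU's working register
pc = 0              # program counter: address of the next instruction

while True:
    op, arg = memory[pc]    # fetch the instruction and decode it
    pc += 1                 # advance to the next instruction
    if op == "LOAD":        # execute
        acc = arg
    elif op == "ADD":
        acc += arg
    elif op == "HALT":
        break

print(acc)  # 12
```

Note that the program and its data live in the same memory, which is the defining feature of the von Neumann design.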
RAM and ROM are acronyms for two types of computer memory. RAM stands for random-access memory; ROM stands for read-only memory. The values stored in RAM can be changed; those in ROM cannot.
Secondary storage devices are essential to a computer system. These devices save data when the computer is not running. Magnetic tape, magnetic disk, and flash drives are three common types of secondary storage.
Touch screens are peripheral devices that serve both input and output functions and are appropriate in specific situations such as restaurants and information kiosks. They respond to a human touching the screen with a finger or stylus, and they can determine the location on the screen where the touch occurred. Several touch screen technologies exist, including resistive, capacitive, infrared, and surface acoustic wave (SAW) touch screens. They have varying characteristics that make them appropriate in particular situations.
Although von Neumann machines are by far the most common, other computer architectures have emerged. For example, there are machines with more than one processor so that calculations can be done in parallel, thereby speeding up the processing.
For Exercises 1–16, match the power of 10 to its name or use.
10^−12
10^−9
10^−6
10^−3
10^3
10^6
10^9
10^12
10^15
For Exercises 17–23, match the acronym with its most accurate definition.
CD-ROM
CD-DA
CD-R
DVD
CD-RW
DL DVD
Blu-ray
Exercises 24–66 are problems or short-answer exercises.
Core 2 processor
Hertz
Random access memory
512 MB machine
2 GB machine
Has your personal information ever been stolen? Has any member of your family experienced this?
How do you feel about giving up your privacy for the sake of convenience?
All secrets are not equal. How does this statement relate to issues of privacy?
People post all sorts of personal information on social media sites. Does this mean they no longer consider privacy important?