Chapter 9. Using Virtualization: The Right Tool for the Job

Highlighting When Use Cases Are Confused with Technology

Media reports often confuse base virtualization technology with specific use cases or instances of use of virtualization technology. So, for example, when virtual machine technology, one of the five types of processing virtualization, is used, media reports might discuss that as either server virtualization or desktop virtualization. The use of application virtualization, processing virtualization, or storage virtualization might be discussed as “clustering.” Let’s examine a few use cases or instances and see what technology is really being used:

  • “Big data”

  • Clusters

  • Desktop virtualization

  • High-performance computing

  • Server virtualization

  • Extreme transaction processing

This chapter is meant to examine industry catchphrases and quickly review which parts of the virtualization model are actually in use. Please refer to the chapter on each of those topics for more information.

Big Data

“Big data” (see Figure 9-1), a specific use of a combination of processing virtualization and storage virtualization, is a catchphrase that has been bubbling up from the high-performance computing niche of the IT market (more about that later in this chapter). Yes, this configuration is a cluster (see the next section) that is being used specifically to manage extremely large stores of rapidly changing data.

Figure 9-1. Big data

Increasingly, suppliers of processing virtualization and storage virtualization software have begun to flog “big data” in their presentations. What, exactly, does this phrase mean? If you sit through the presentations of ten suppliers of technology, fifteen different definitions are likely to come forward. Each definition, of course, tends to support the need for that supplier’s products and services. (Imagine that.) In simplest terms, the phrase refers to tools, processes, and procedures that allow an organization to create, manipulate, and manage very large data sets and storage facilities.

Does this mean terabytes, petabytes, or even larger collections of data? The answer offered by these suppliers is “yes.” They would go on to say that you need their products to manage and make best use of that mass of data. Managing huge, dynamic sets of data can be problematic without the appropriate tools and processes.

An example often cited is the mass of weather data collected daily by the U.S. National Oceanic and Atmospheric Administration (NOAA) to aid in climate, ecosystem, weather, and commercial research. Add that to the masses of data collected by the U.S. National Aeronautics and Space Administration (NASA) for its research, and the numbers get pretty big.

The commercial sector has its poster children as well. Energy companies have amassed huge amounts of geophysical data. Pharmaceutical companies routinely munch their way through enormous amounts of drug testing data. Commercial sites, such as Facebook and Twitter, also maintain huge amounts of data that change moment by moment. Large organizations increasingly face the need to maintain large amounts of structured and unstructured data to comply with government regulations. Recent court cases have also led them to keep masses of documents, email messages, and other forms of electronic communication that may be required if they face litigation.

The following virtualization technology is typically in use when people discuss “big data”:

  • Storage virtualization (distributed file systems)

  • Virtual processing (parallel processing monitors, workload management monitors, memory virtualization)

  • Management for virtual environments

If your organization is dealing with extremely large amounts of data or data that changes more rapidly than a typical database can manage, you are clearly in the big data category.

Since the requirements for “big data” applications differ from those of many other forms of structured data, many suppliers are offering new data management tools. Sometimes the suppliers call their products NoSQL databases. They may also talk about segmenting the database so that pieces of it may be spread over many machines. In this case, many suppliers speak about packaging and supporting open source data management technologies such as Apache’s Hadoop and Cassandra projects.
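To make the idea of segmenting a database across many machines more concrete, here is a minimal Python sketch of hash-based partitioning. It is an illustration only and does not reflect how Hadoop, Cassandra, or any particular supplier's product places data. The function names shard_for and partition, and the choice of SHA-256 as the hash, are assumptions made for this example.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash.

    A stable hash (rather than Python's built-in hash()) is assumed so that
    every machine computes the same placement for the same key.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(records: dict, num_shards: int) -> list:
    """Split a key/value data set into num_shards pieces.

    Each piece stands in for the portion of the data that one machine holds.
    """
    shards = [dict() for _ in range(num_shards)]
    for key, value in records.items():
        shards[shard_for(key, num_shards)][key] = value
    return shards


if __name__ == "__main__":
    # Hypothetical sample data: sensor readings keyed by station ID.
    readings = {f"station-{n}": n * 1.5 for n in range(10)}
    for index, shard in enumerate(partition(readings, 3)):
        print(index, sorted(shard))
```

Because each key always maps to the same piece, a lookup needs to contact only the machine that owns that piece. Real products typically add replication and rebalancing on top of a placement scheme like this one.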

Clusters

There are many different uses for a configuration that harnesses the power of many (up to thousands of) systems (see Figure 9-2). Although different technology may be in use, all of these configurations are called clusters. The fact that the same word is used to describe different uses of technology creates quite a bit of confusion. Depending upon the requirements, different virtualization technology may be in use.

Figure 9-2. Clusters

Here are the typical requirements and technology:

High performance

When huge amounts of computational power must be applied to speed up the processing of a single task, a parallel processing monitor, a form of processing virtualization, is used to harness the computational power of many systems. These clusters may include desktop systems as well as servers. This type of configuration is often used for modeling financial risk, nondestructive testing, rendering digital graphics, or other forms of modeling. Typically, these tasks run directly on the physical systems rather than being encapsulated into virtual machines or run in partitions when operating system virtualization is being used.

Scalability

When large numbers of people need to access the same application, a similar system configuration might be harnessed using workload management monitors, a form of virtual processing software. As transactions come in to be processed, the workload manager sends them to the system that has the most available capacity. Since performance is also a critical requirement in this use case, it is quite likely that the applications will be hosted directly on the physical systems, rather than being hosted on a virtual server. Server-centric application virtualization technology often contains a workload management function. Even though the workload management is happening at the application level, the multiserver configuration might be called a cluster.
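The dispatch rule described above, in which each incoming transaction goes to the system with the most available capacity, can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm of any specific workload management product. The Server class, its capacity and active fields, and the dispatch function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    capacity: int    # maximum concurrent transactions (hypothetical unit)
    active: int = 0  # transactions currently being processed

    @property
    def available(self) -> int:
        return self.capacity - self.active


def dispatch(servers: list) -> Server:
    """Assign one incoming transaction to the server with the most headroom."""
    target = max(servers, key=lambda s: s.available)
    if target.available <= 0:
        raise RuntimeError("no capacity available on any server")
    target.active += 1
    return target


if __name__ == "__main__":
    cluster = [Server("app-1", capacity=2), Server("app-2", capacity=3)]
    for transaction_id in range(4):
        chosen = dispatch(cluster)
        print(f"transaction {transaction_id} -> {chosen.name}")
```

A production workload manager would also account for transactions finishing, server failures, and differences in transaction cost. The sketch covers only the selection step.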

Big data storage

Memory virtualization or distributed cache software is used to spread data out among a large number of systems, making it possible to access and update large amounts of data very quickly. Sometimes this configuration is described as a Hadoop or Cassandra cluster.

Storage cluster

In this case, storage servers, not general-purpose computing systems, are clustered together to create a very large high-performance storage environment for general-purpose computing functions. Distributed file system software might be used to access this data. The storage servers might use a special-purpose storage network called a SAN. Storage server suppliers also support a form of memory virtualization among the members of the storage server cluster.

Desktop Virtualization

Desktop virtualization is the use of several virtualization technologies, either together or separately. Let’s look at each of these cases in turn.

When “desktop virtualization” is used to describe making it possible for people to access a physical or virtual system remotely, access virtualization technology is used to capture the user interface portion of an application. It is then converted to a neutral format and projected across the network to a device that can display the user interface and allow the user to enter and access information (see Figure 9-3). This means that just about any type of network-enabled device could be used to access the application. Suppliers such as Citrix, Microsoft, and VMware offer client software for tablets, smartphones, laptops, and PCs, making it possible for users of those devices to access the applications running elsewhere on the network.

Figure 9-3. Desktop virtualization via access virtualization

When “desktop virtualization” is used to describe encapsulating an application using client-side application virtualization technology and then projecting it in whole or piecemeal to a remote system for execution, the application could either remain on that client device or be deleted once the user completes the task, depending on the settings used by the IT administrator (see Figure 9-4). This means, of course, that the client system has to run the operating system needed by the application. So, Windows applications, for example, would need to run on Windows executing on a PC or laptop.

Figure 9-4. Desktop virtualization via application virtualization

When “desktop virtualization” is used to describe encapsulating the entire stack of software that runs on a client system, the phrase starts to take on a great deal of complexity (see Figure 9-5). That encapsulated virtual client system becomes highly mobile. Here are the possibilities:

  • One or more virtual client systems could execute on a single physical client system. This allows personal applications to run side by side with locked-down corporate applications.

  • Local execution. Virtual client systems could run on a local blade server. The user interface is projected to physical PCs, laptops, or thin client systems using access virtualization technology.

  • Remote execution. Virtual client systems could run on a server that resides in the organization’s data center. The user interface is projected to physical PCs, laptops, or thin client systems using access virtualization technology.

Figure 9-5. Desktop virtualization via processing virtualization

Since the industry is using the same phrase to describe all of these different approaches, the concept of desktop virtualization can be quite confusing to those unfamiliar with all of the different types of technology that could be pressed into service.

High-Performance Computing

When an application or a workload requires more computational power than is available from a single computer because of either technical or financial limitations, organizations harness a large number of computers (yes, a cluster) to work on a single task or a small number of tasks (see Figure 9-6). Typically a parallel processing monitor, a type of processing virtualization software, is used to manage these systems. The monitor sends some work to each system. As systems complete their tasks, they send the results back to the system running the monitor and request another task.

Figure 9-6. High-performance computing

This approach is used to support financial modeling, geophysical modeling, risk analysis, scientific research, digital content creation, and a number of other tasks that require enormous processing power.
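The pattern described earlier in this section, in which a monitor hands out work and each system returns its result and asks for another task, can be sketched as follows. This is a minimal single-machine illustration: Python threads stand in for the separate systems in a cluster, and a shared queue stands in for the monitor's list of pending work. The names run_monitor, work_fn, and num_workers are assumptions made for this example and do not come from any real parallel processing product.

```python
import queue
import threading


def run_monitor(tasks, work_fn, num_workers=4):
    """Distribute tasks to workers that pull new work as they finish.

    Threads are used here only to imitate independent systems; a real
    cluster would exchange tasks and results over a network.
    """
    tasks = list(tasks)
    pending = queue.Queue()
    for index, task in enumerate(tasks):
        pending.put((index, task))

    results = [None] * len(tasks)

    def worker():
        while True:
            try:
                # "Request another task" from the monitor.
                index, task = pending.get_nowait()
            except queue.Empty:
                return  # No work left, so this worker stops.
            # "Send the result back" by storing it in the task's own slot.
            results[index] = work_fn(task)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # Hypothetical workload: square each input value.
    print(run_monitor(range(10), lambda x: x * x, num_workers=3))
```

Because workers pull tasks rather than receiving a fixed share up front, a faster system naturally ends up doing more of the work, which is one reason this arrangement suits clusters built from mixed hardware.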

Server Virtualization

Server virtualization is the use of either virtual machine technology or operating system virtualization and partitioning technology to make a single physical server support multiple independent workloads (see Figure 9-7). If operating system virtualization and partitioning technology is used, all of the workloads must be supported by a single operating system. If virtual machine technology is being used, each virtual machine can run its own operating system. These could be different versions of the same operating system (such as Windows 2003, Windows 2008, etc.) or many different operating systems (Windows, Linux, UNIX, etc.).

Figure 9-7. Server virtualization

This approach more fully utilizes the power of the underlying physical system. Since all of the individual workloads are sharing a single computer, this approach is not selected when the goal is high-performance computing or extreme transaction processing.

Extreme Transaction Processing

Extreme transaction processing (see Figure 9-8) is the use of a number of virtualization technologies—such as workload managers, memory virtualization, and virtual storage technology—to create an environment that can support hundreds of thousands or, perhaps, millions of transactions per second. The work is spread over a large number of computing and storage resources (yes, another version of a cluster).

Figure 9-8. Extreme transaction processing