Chapter 12: Processor Virtualization

This chapter introduces the concepts underlying processor virtualization and explores the many benefits to individual users and large organizations that are achievable through the effective use of virtualization. We will discuss the principal virtualization techniques and the open source and commercial tools that implement them.

Virtualization tools enable the emulation of instruction set–accurate representations of various computer architectures and operating systems on general-purpose computers. Virtualization is used widely in the deployment of real-world software applications in cloud environments.

After completing this chapter, you will understand the technology and benefits associated with hardware virtualization and how modern processors support virtualization at the instruction set level. You will have learned the technical features of several open source and commercial tools providing virtualization capabilities and will understand how virtualization is used to build and deploy scalable applications in cloud computing environments.

The following topics will be presented in this chapter:

  • Introducing virtualization
  • Virtualization challenges
  • Virtualizing modern processors
  • Virtualization tools
  • Virtualization and cloud computing

Technical requirements

The files for this chapter, including the answers to the exercises, are available at https://github.com/PacktPublishing/Modern-Computer-Architecture-and-Organization.

Introducing virtualization

In the domain of computer architecture, virtualization refers to the use of hardware and software to create an emulated version of an environment in which a piece of software runs, as opposed to the real environment in which the code normally expects to run.

We have already looked at one form of virtualization in some depth: virtual memory. Virtual memory uses software, with supporting hardware, to create an environment in which each running application functions as if it has exclusive access to the entire computer, including all the memory it requires at the addresses it expects. Virtual address ranges used by a program can even be the same as those in use by other currently running processes.

Systems using virtual memory create multiple sandboxed environments in which each application runs without interference from other applications, except in competition for shared system resources. In the virtualization context, a sandbox is an isolated environment in which code runs without interference from anything outside its boundaries, and which prevents code inside the sandbox from affecting resources external to it. This isolation between applications is rarely absolute, however. For example, even though a process in a virtual memory system cannot access another process's memory, it may still interfere in other ways, such as by deleting a file that another process needs.

Our primary focus in this chapter will be on virtualization at the processor level, allowing one or more operating systems to run in a virtualized environment on a computer system, abstracted from the physical layer of the hardware. Several other types of virtualization are also widely used.

The next section will briefly describe the various categories of virtualization you are likely to encounter.

Types of virtualization

The term virtualization is applied in several different computing contexts, especially in larger network environments, such as businesses, universities, government organizations, and cloud service providers. The definitions that follow here will cover the most common types of virtualization you are likely to come across.

Operating system virtualization

We will cover operating system virtualization in detail later in this chapter. A virtualized operating system runs under the control of a hypervisor. A hypervisor is a combination of software and hardware capable of instantiating and running virtual machines. The prefix hyper refers to the fact that the hypervisor supervises the supervisor mode of the operating systems running in its virtual machines. Another term for hypervisor is virtual machine monitor.

There are two general types of hypervisor:

  • A type 1 hypervisor, sometimes referred to as a bare metal hypervisor, includes software for managing virtual machines that runs directly on the hardware of a host computer.
  • A type 2 hypervisor, also called a hosted hypervisor, runs as an application program that manages virtual machines under a host operating system.

    Hypervisor versus virtual machine monitor

    Technically, a virtual machine monitor is not exactly the same as a hypervisor, but for our purposes, we will treat the terms as synonymous. A virtual machine monitor is responsible for virtualizing a processor and other computer system components. A hypervisor combines a virtual machine monitor with an underlying operating system, which may be dedicated to hosting virtual machines (a type 1 hypervisor), or it may be a general-purpose operating system (a type 2 hypervisor).

The computer running the hypervisor is referred to as the host. Operating systems running within hypervisor-managed virtual environments on a host system are called guests.

Regardless of its type, a hypervisor enables guest operating systems and applications running within them to be brought up and executed in virtualized environments. A single hypervisor is capable of supporting multiple virtual machines running on a single processor simultaneously. The hypervisor is responsible for managing all requests for privileged operations initiated by guest operating systems and the applications running within them. Each of these requests requires a transition from user mode to kernel mode, and then back to user mode. All I/O requests from applications on guest operating systems involve privilege level transitions.

Since operating system virtualization in a type 2 hypervisor involves running an operating system under the hypervisor in the host operating system, a natural question is, what happens if you run another copy of the hypervisor within the operating system of the virtual machine? The answer is that this approach is supported in some, but not all, combinations of hypervisor, host OS, and guest OS. This configuration is referred to as nested virtualization.

The next thing you might wonder about nested virtualization is why anyone would want to do such a thing. Here is one scenario where nested virtualization is useful: assume that your business's primary web presence is implemented as a virtualized operating system image containing a variety of installed and custom software components. If your cloud service provider goes offline for some reason, you will need to bring the application up at an alternative provider quickly.

The Google Compute Engine (https://cloud.google.com/compute), for example, provides an execution environment implemented as a virtual machine. Compute Engine allows you to install a hypervisor in this virtual machine and bring up your application virtual machine within it, putting your web presence back online with minimal installation and configuration.

Application virtualization

Instead of creating a virtual environment to encapsulate an entire operating system, it is possible to virtualize at the level of a single application. Application virtualization abstracts the operating system from the application code and provides a degree of sandboxing.

This type of virtualization allows programs to run in an environment that differs from the intended application target environment. For example, Wine (https://www.winehq.org/) is an application compatibility layer allowing programs written for Microsoft Windows to run under POSIX-compliant operating systems, typically Linux variants. The Portable Operating System Interface (POSIX) is a set of IEEE standards providing application programming compatibility between operating systems. Wine translates Windows library and system calls to equivalent POSIX calls in an efficient manner.

Application virtualization replaces portions of the runtime environment with a virtualization layer and performs tasks such as intercepting disk I/O calls and redirecting them to a sandboxed, virtualized disk environment. Application virtualization can encapsulate a complex software installation process, consisting of hundreds of files installed in various directories, as well as numerous Windows registry modifications, in an equivalent virtualized environment contained within a single executable file. Simply copying the executable to a target system and running it brings up the application as if the entire installation process had taken place on the target.

Network virtualization

Network virtualization is the connection of software-based emulations of network components, such as switches, routers, firewalls, and telecommunication networks, in a manner that represents a physical configuration of these components. This allows operating systems and the applications running on them to interact with and communicate over the virtual network in the same manner they would on a physical implementation of the same network architecture.

A single physical network can be subdivided into multiple virtual local area networks (VLANs), each of which appears to be a complete, isolated network to all systems connected on the same VLAN.

Multiple computer systems at the same physical location can be connected to different VLANs, effectively placing them on separate networks. Conversely, computers at distant geographic separations can be placed on the same VLAN, making it appear as if they are interconnected within a small local network.

Storage virtualization

Storage virtualization is the separation of physical data storage from the logical storage structure used by operating systems and applications. A storage virtualization system manages the process of translating logical data requests to physical data transfers. Logical data requests are addressed as block locations within a disk partition. Following the logical-to-physical translation, data transfers may ultimately interact with a storage device that has an organization completely different from the logical disk partition.

The process of accessing physical data given a logical address is similar to the virtual-to-physical address translation process in virtual memory systems. The logical disk I/O request includes information such as a device identifier and a logical block number. This request must be translated to a physical device identifier and block number. The requested read or write operation then takes place on the physical disk.
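As a rough illustration, the logical-to-physical translation described above can be modeled as a lookup through a table of extents. The class, device names, and block numbers below are invented for this sketch; real storage virtualization systems use far more elaborate structures, but the translation step they perform is essentially this one:

```python
# Minimal sketch of storage virtualization's logical-to-physical block
# translation. All names and numbers here are illustrative.

class VirtualVolume:
    """Maps logical block numbers onto (physical device, block) extents."""

    def __init__(self):
        # Each extent: (logical_start, length, device_id, physical_start)
        self.extents = []

    def add_extent(self, logical_start, length, device_id, physical_start):
        self.extents.append((logical_start, length, device_id, physical_start))

    def translate(self, logical_block):
        """Translate a logical block number to (device_id, physical_block)."""
        for lstart, length, dev, pstart in self.extents:
            if lstart <= logical_block < lstart + length:
                return dev, pstart + (logical_block - lstart)
        raise ValueError(f"logical block {logical_block} is unmapped")

vol = VirtualVolume()
vol.add_extent(0, 1000, "disk0", 5000)   # logical blocks 0-999 on disk0
vol.add_extent(1000, 500, "disk1", 0)    # logical blocks 1000-1499 on disk1

print(vol.translate(10))     # ('disk0', 5010)
print(vol.translate(1200))   # ('disk1', 200)
```

Because the client addresses only logical blocks, the management system is free to change any extent's physical target (for replication or migration) without the client noticing.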

Storage virtualization in data centers often includes several enhancements that increase the reliability and performance of data storage systems. Some of these improvements are as follows:

  • Centralized management enables monitoring and control of a large collection of storage devices, possibly of different sizes, and from different vendors. Because all virtualized storage appears the same to client applications, any vendor-specific variations in storage devices are hidden from users.
  • Replication provides transparent data backup and disaster recovery capabilities for mission-critical data. When performing real-time replication, writes to the storage array are immediately copied to one or more remote replicas.
  • Data migration allows administrators to move data to a different physical location or switch to a replica while concurrent data I/O operations continue without interruption. Because the storage virtualization management system has full control over disk I/O, it can switch the target of any logical read or write operation to a different physical storage device at any time.

The next section will introduce some of the most common methods of processor virtualization in use today.

Categories of processor virtualization

The ideal mode of operation for a processor virtualization environment is full virtualization. With full virtualization, binary code in operating systems and applications runs in the virtual environment with no modifications whatsoever. Guest operating system code performing privileged operations executes under the illusion that it has complete and sole access to all machine resources and interfaces. The hypervisor manages interactions between guest operating systems and host resources, and takes any steps needed to deconflict access to I/O devices and other system resources for each virtual machine under its control.

Our focus in this chapter is processor virtualization, enabling the execution of complete operating systems and applications running on them in a virtualized environment.

Historically, there have been several different methods used for the implementation of virtualization at the processor level. We'll take a brief look at each of them, beginning with an approach first implemented on systems such as the IBM VM/370, introduced in 1972. VM/370 was the first operating system specifically designed to support the execution of virtual machines.

Trap-and-emulate virtualization

In a 1974 article entitled Formal Requirements for Virtualizable Third Generation Architectures, Gerald J. Popek and Robert P. Goldberg described the three properties a hypervisor must implement to efficiently and fully virtualize a computer system, including the processor, memory, storage, and peripheral devices:

  • Equivalence: Programs (including the guest operating system) running in a hypervisor must exhibit essentially the same behavior as when they run directly on machine hardware, excluding the effects of timing.
  • Resource control: The hypervisor must have complete control over all of the resources used by the virtual machine.
  • Efficiency: A high percentage of instructions executed by the virtual machine must run directly on the physical processor, without hypervisor intervention.

For a hypervisor to satisfy these criteria, the hardware and operating system of the computer on which it is running must grant the hypervisor the power to fully control the virtual machines it manages.

The code within a guest operating system assumes it is running directly on the physical processor hardware and has full control of all the features accessible via system hardware. In particular, guest operating system code executing at the kernel privilege level must be able to execute privileged instructions and access regions of memory reserved for the operating system.

In a hypervisor implementing the trap-and-emulate virtualization method, portions of the hypervisor run with kernel privilege, while all guest operating systems (and, of course, the applications running within them) operate at the user privilege level. Kernel code within the guest operating systems executes normally until a privileged instruction attempts to execute or a memory-access instruction attempts to read or write memory outside the user-space address range available to the guest operating system. When the guest attempts any of these operations, a trap occurs.

Exception types: faults, traps, and aborts

The terms fault, trap, and abort are used to describe similar exception events. The primary differences between each of these exception types are as follows:

A fault is an exception that ends by restarting the instruction that caused the exception. For example, a page fault occurs when a program attempts to access a valid memory location that is currently inaccessible. After the page fault handler completes, the triggering instruction is restarted, and execution continues from that point.

A trap is an exception that ends by continuing execution with the instruction following the triggering instruction. For example, after the exception raised by a debugger breakpoint, execution resumes with the next instruction.

An abort represents a serious error condition that may be unrecoverable. Problems such as errors accessing memory may cause aborts.

The fundamental trick (if you want to think of it that way) to enable trap-and-emulate virtualization is in the handling of the exceptions generated by privilege violations. While it is starting up, the hypervisor routes the host operating system exception handlers into its own code. Exception handlers within the hypervisor perform the processing of these exceptions before the host operating system has a chance to handle them.

The hypervisor exception handler examines the source of each exception to determine if it was generated by a guest operating system under the hypervisor's control. If the exception originated from a guest the hypervisor manages, the hypervisor handles the exception, emulating the requested operation, and returns execution control directly to the guest. If the exception did not come from a guest belonging to the hypervisor, the hypervisor passes the exception to the host operating system for processing in the normal manner.
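The dispatch decision just described can be sketched in a few lines. All names here (the `Hypervisor` class, `handle_trap`, the register names) are hypothetical and exist only to show the choice the re-routed exception handler makes for each trap:

```python
# Sketch of trap-and-emulate dispatch: the hypervisor's exception handler
# either emulates the privileged operation (for its own guests) or passes
# the exception to the host OS. All names are illustrative.

class Hypervisor:
    def __init__(self, managed_guests):
        self.guests = set(managed_guests)
        # One set of shadow (software-only) privileged registers per guest
        self.shadow_regs = {g: {} for g in self.guests}

    def handle_trap(self, source, op, reg, value=None):
        if source not in self.guests:
            return "forwarded-to-host-os"   # not ours: host handles it normally
        shadow = self.shadow_regs[source]
        if op == "write":                   # emulate the privileged write
            shadow[reg] = value
            return "emulated-write"
        if op == "read":                    # emulate the privileged read
            return shadow.get(reg, 0)
        raise ValueError(f"unhandled trap type: {op}")

hv = Hypervisor(["guest0"])
hv.handle_trap("guest0", "write", "IDTR", 0x1234)
print(hv.handle_trap("guest0", "read", "IDTR"))       # the guest's own value
print(hv.handle_trap("host_app", "read", "IDTR"))     # forwarded to host OS
```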

For trap-and-emulate virtualization to work in a comprehensive and reliable manner, the host processor must support the criteria defined by Popek and Goldberg. The most critical of these requirements is that any guest instruction attempting to access privileged resources must generate a trap. This is absolutely necessary because the host system has only one set of privileged resources (we're assuming a single-core system here for simplicity) and the host and guest operating systems cannot share those resources.

As an example of the types of privileged information controlled by the hypervisor, consider the page tables used to manage virtual memory. The host operating system maintains a collection of page tables that oversee the entirety of the system's physical memory. Each guest operating system has its own set of page tables that it believes it is using to manage physical and virtual memory on the system it controls. These two sets of page tables contain substantially different data, even though both of them ultimately interact with the same physical memory regions. Through the trapping mechanism, the hypervisor is able to intercept all guest operating system attempts to interact with page tables and direct those interactions to a guest-specific memory region containing page tables used only by the guest operating system. The hypervisor then manages the necessary translation between addresses used by instructions executing in the guest operating system and the host system's physical memory.
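The double translation the hypervisor must manage can be sketched as two table lookups: the guest's page table maps a guest virtual page to what the guest believes is a physical frame, and a hypervisor-maintained table maps that frame to a real host frame. The page size and table contents below are invented for this sketch:

```python
# Two-level address translation under virtualization:
# guest virtual -> guest "physical" (guest's page tables)
#               -> host physical   (hypervisor's mapping).
# Table contents are invented for illustration.

PAGE_SIZE = 4096

def translate(vaddr, guest_page_table, host_page_table):
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    guest_frame = guest_page_table[vpn]         # what the guest believes
    host_frame = host_page_table[guest_frame]   # the real backing frame
    return host_frame * PAGE_SIZE + offset

guest_pt = {0: 7, 1: 3}    # guest virtual page -> guest "physical" frame
host_pt = {7: 42, 3: 19}   # guest frame -> actual host frame

print(hex(translate(0x0010, guest_pt, host_pt)))   # 0x2a010 (frame 42)
```

Both levels of mapping ultimately land in the same physical memory, but the guest never sees the second level; the trapping mechanism is what lets the hypervisor keep the two consistent.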

The greatest barrier to the widespread use of virtualization in the late 1990s and early 2000s was the fact that the general-purpose processors in common use at the time (x86 variants) did not support the Popek and Goldberg virtualization criteria. The x86 instruction sets contained a number of instructions that allowed unprivileged code to interact with privileged data without generating a trap. Many of these instructions merely permitted unprivileged code to read selected privileged registers. While this may seem harmless, it caused a severe problem for virtualization because there is only one copy of each of those registers in the entire machine, and each guest OS may need to maintain different values in those registers.

Later versions of the x86 architecture, beginning in 2006, added hardware features, Intel Virtualization Technology (VT-x) and AMD Virtualization (AMD-V), enabling full virtualization under the Popek and Goldberg criteria.

The virtualization requirements defined by Popek and Goldberg assumed the use of the trap-and-emulate technique, which was widely viewed in the 1970s as the only feasible method for processor virtualization. In the following sections, we will see how it is possible to perform effective and efficient virtualization on a computer system that does not fully comply with the Popek and Goldberg criteria.

Paravirtualization

Because most, if not all, of the instructions that require special handling in the virtualized environment reside in the guest operating system and its device drivers, one method for rendering the guest virtualizable is to modify the operating system and its drivers to explicitly interface with the hypervisor in a non-trapping manner. This approach can result in substantially better guest OS performance than a system running under a trap-and-emulate hypervisor because the paravirtualized hypervisor interface is composed of optimized code rather than a series of trap handler invocations. In the trap-and-emulate method, the hypervisor must process every trap in a generic handler that begins by determining whether the trap even comes from a guest OS it controls before further processing to determine the desired operation and emulate its effects.

The primary drawback of paravirtualization is the need to modify the guest operating system and its drivers to implement the hypervisor interface. There has been limited interest in fully supporting a paravirtualization interface among the maintainers of major operating system distributions.

Binary translation

One way to deal with problematic instructions within processor architectures that lack full support for virtualization is to scan the binary code prior to execution to detect the presence of nonvirtualizable instructions. Where such instructions are found, the code is translated into virtualization-friendly instructions that produce identical effects.

This has proven to be a popular approach for virtualization in the x86 architecture. Combining trap-and-emulate with the binary translation of nonvirtualizable instructions keeps the overhead of handling those instructions manageable and permits good guest OS performance.

Binary translation can be performed on a static or dynamic basis. Static binary translation recompiles a set of executable images into a form ready for execution in the virtual environment. This translation takes some time, but it is a one-time process providing a set of system and user images that will continue to work until new image versions are installed, necessitating a recompilation procedure for the new images.

Dynamic binary translation scans sections of code during program execution to locate problematic instructions. When such instructions are encountered, they are replaced with virtualizable instruction sequences. Dynamic binary translation avoids the recompilation step required by static binary translation, but the ongoing scanning and translation of running code reduces performance. Each code segment needs to be scanned and translated only once, however, after which the translated version is cached; code within a loop, for example, is not rescanned on each iteration.
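The caching behavior can be sketched as follows. The instruction names are invented placeholders rather than real x86 mnemonics; the point is only that translation happens on the first execution of a block and the cached result is reused afterward:

```python
# Sketch of dynamic binary translation with a translation cache. The
# "instructions" here are invented placeholders, not real mnemonics.

translation_cache = {}

def rewrite(block):
    """Replace nonvirtualizable instructions with safe sequences."""
    safe = []
    for insn in block:
        if insn == "READ_PRIV_REG":    # hypothetical unsafe, non-trapping insn
            safe.extend(["CALL_HYPERVISOR", "LOAD_SHADOW_REG"])
        else:
            safe.append(insn)
    return safe

def execute_block(block_addr, block):
    if block_addr not in translation_cache:   # translate on first visit only
        translation_cache[block_addr] = rewrite(block)
    return translation_cache[block_addr]

loop_body = ["ADD", "READ_PRIV_REG", "BRANCH"]
for _ in range(3):                            # later iterations hit the cache
    translated = execute_block(0x400000, loop_body)
print(translated)
```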

Hardware emulation

All of the virtualization techniques that we have discussed to this point have assumed the guest OS is expecting to run on a processor with the same instruction set architecture as the host processor. There are many situations in which it is desirable to run an operating system and application code on a host processor with a completely different ISA from the guest OS.

When emulating processor hardware, each instruction executing in an emulated guest system must be translated to an equivalent instruction or sequence of instructions in the host ISA. As with binary translation, this process can take place in a static or dynamic manner.

Static translation can produce an efficient executable image capable of running in the target processor ISA. There is some risk in static translation because it may not be straightforward to identify all code paths in the executable file, particularly if branch target addresses are computed in the code rather than being statically defined. This risk also applies to the static binary translation technique described in the previous section.

Dynamic translation avoids potential errors that may occur with static translation, but performance can be quite poor. This is because dynamic translation with hardware emulation involves translating every instruction from one architecture to another. This is in contrast to dynamic binary translation for the same ISA, which, although it must scan every instruction, typically only needs to perform translation for a small percentage of executed instructions.
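A minimal model of dynamic hardware emulation is an interpreter loop that decodes each guest instruction and carries out its effect in host code. The three-instruction guest ISA below is invented for illustration; it shows why every emulated instruction, not just a problematic subset, incurs host-side work:

```python
# Minimal sketch of hardware emulation: every guest instruction is decoded
# and performed by host code. The tiny "guest ISA" here is invented.

def emulate(program):
    regs = {"r0": 0, "r1": 0}
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "li":                 # load immediate: li rd, imm
            regs[args[0]] = args[1]
        elif op == "add":              # add rd, rs1, rs2
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "halt":
            break
        pc += 1                        # each instruction costs host work
    return regs

prog = [("li", "r0", 5), ("li", "r1", 7),
        ("add", "r0", "r0", "r1"), ("halt",)]
print(emulate(prog))                   # r0 ends up holding 12
```

Tools such as QEMU avoid paying this per-instruction interpretation cost on every execution by translating guest code blocks into host instruction sequences and caching them, much like the dynamic binary translation described earlier.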

One example of hardware emulation tools is the open source QEMU (https://www.qemu.org/) machine emulator and virtualizer, which supports the running of operating systems for a wide variety of processor architectures on an impressive list of differing architectures, with reasonably good performance. The Freedom Studio tool suite for the RISC-V processor includes a QEMU implementation of the RV64GC instruction set architecture. This virtualized environment was used to run the code that we worked with in the exercises for Chapter 11, The RISC-V Architecture and Instruction Set.

In the next section, we will discuss the challenges and benefits related to virtualization in the processor families discussed in the preceding chapters.

Virtualization challenges

In simple terms, the goal of processor virtualization is to run an operating system within a hypervisor, which itself either runs on the bare metal of a computer system or runs as an application under the control of another operating system. In this section, we will focus on the hosted (type 2) hypervisor, because this mode of operation presents added challenges that a bare-metal (type 1) hypervisor, running in an environment designed specifically to support virtualization, may not face.

In a type 2 hypervisor, the host operating system supports kernel and user modes, as does the guest operating system (in the guest's perception). As the guest operating system and the applications running within it request system services, the hypervisor must intercept each request and translate it into a suitable call to the host kernel.

In a nonvirtualized system, peripheral devices, such as the keyboard and mouse, interact directly with the host operating system. In a virtualized environment, the hypervisor must manage the interfaces to these devices whenever the user requests interaction with the guest OS.

The degree of difficulty involved in implementing these capabilities depends on the instruction set of the host computer. Even an instruction set that was not designed to facilitate virtualization may still support it in a straightforward manner; the ease of virtualizing a particular processor ISA depends on how the processor handles unsafe instructions.

Unsafe instructions

The name of the trap-and-emulate virtualization method refers to the ability of the hypervisor to take control of processing exceptions that would normally be dealt with by kernel mode handlers in the host operating system. This allows the hypervisor to process privilege violations and system calls from guest operating systems and the applications that run within them.

Each time an application running on a guest operating system requests a system function, such as opening a file, the hypervisor intercepts the request, adjusts the parameters of the request to align with the virtual machine configuration (perhaps by redirecting the file open request from the host filesystem to the guest's virtual disk sandbox), and passes the request on to the host operating system. The process of inspecting and performing the handling of exceptions by the hypervisor is the emulation phase of the trap-and-emulate approach.

In the context of virtualization, processor instructions that either rely on or modify privileged system state information are referred to as unsafe. For the trap-and-emulate method to function in a comprehensively secure and reliable manner, all unsafe instructions must generate exceptions that trap to the hypervisor. If an unsafe instruction is allowed to execute without trapping, the isolation of the virtual machine is compromised and virtualization may fail.

Shadow page tables

Protected data structures used in the allocation and management of virtual and physical memory present an additional challenge to full virtualization. A guest operating system kernel presumes it has full access to the hardware and data structures associated with the system MMU. The hypervisor must translate guest operating system requests for memory allocation and deallocation in a manner that is functionally equivalent to running the guest OS on bare metal.

A particular problem arises in the x86 architecture due to the fact that virtual memory page table configuration data must be stored within the processor to properly configure the system, but that information becomes inaccessible once it has been stored. To resolve this issue, the hypervisor maintains its own copy of the page table configuration data, referred to as shadow page tables. Because the shadow page tables are not actual page tables managing memory for the host OS, it is necessary for the hypervisor to set access permission restrictions on shadow page table memory regions and intercept the resulting traps when the guest OS attempts to access its page tables. The hypervisor then emulates the requested operation by interacting with the physical MMU through calls to the host OS.
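The shadow page table mechanism can be sketched as follows. The handler below stands in for the trap raised when the guest writes to its write-protected page table pages; the hypervisor applies the update both to the guest-visible copy and to the real mapping it maintains on the guest's behalf. All names and frame numbers are invented:

```python
# Sketch of shadow page tables: guest page table writes trap, and the
# hypervisor mirrors each update into the real mapping. Illustrative only.

class ShadowPageTables:
    def __init__(self):
        self.guest_view = {}      # what the guest believes it wrote
        self.real_mmu = {}        # hypervisor-maintained actual mappings
        self.host_frames = {}     # guest frame -> allocated host frame
        self.next_host_frame = 100

    def guest_write_pte(self, vpn, guest_frame):
        """Entry point for the trap raised by the write-protection fault."""
        self.guest_view[vpn] = guest_frame
        if guest_frame not in self.host_frames:   # lazily back with host memory
            self.host_frames[guest_frame] = self.next_host_frame
            self.next_host_frame += 1
        self.real_mmu[vpn] = self.host_frames[guest_frame]

spt = ShadowPageTables()
spt.guest_write_pte(vpn=4, guest_frame=9)
print(spt.guest_view[4], spt.real_mmu[4])   # guest sees 9; hardware uses 100
```

Every guest page table update pays the cost of a trap plus this bookkeeping, which is why shadow page tables carry the performance penalty noted below and motivated hardware assistance such as nested page tables.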

The use of shadow page tables incurs a significant performance penalty and has been an area of focus for the development of hardware-assisted virtualization enhancements.

Security

There is nothing inherently insecure about using a hypervisor to virtualize one or more guest applications. It is, however, important to understand the added opportunities for malicious actors to attempt to infiltrate a virtualized environment.

A guest virtual machine presents essentially the same collection of vulnerabilities to remote attackers as an identical operating system and set of applications running directly on hardware. The hypervisor provides an additional avenue that an attacker may attempt to exploit in a virtualized environment. If malicious users manage to penetrate and take control of the hypervisor, this will grant full access to all of the guest operating systems, as well as the applications and data accessible from within the guests. The guests are accessible in this scenario because they operate at a lower privilege level granting the hypervisor full control over them.

When implementing virtualization in a context that permits public access, such as web hosting, it is vital that credentials enabling login to hypervisors, and any other access methods, be strictly limited to a small number of personnel, and all reasonable protective measures must be maintained to prevent unauthorized hypervisor access.

In the next section, we will examine some key technical aspects of virtualization as implemented in modern processor families.

Virtualizing modern processors

The hardware architectures of most general-purpose processor families have matured to the point that they fully support the execution of virtualized guest operating systems, at least in their higher-end variants. The following sections briefly introduce the virtualization capabilities provided by modern general-purpose processor families.

x86 processor virtualization

The x86 architecture was not originally designed to support the execution of virtualized operating systems. As a result, x86 processors, from the earliest days through to the Pentium series, implemented instruction sets containing several unsafe but non-trapping instructions. These instructions caused problems with virtualization by, for example, allowing the guest operating system to access privileged registers that do not contain data corresponding to the state of the virtual machine.

x86 current privilege level and unsafe instructions

In the x86 architecture, the lower two bits of the code segment (CS) register contain the current privilege level (CPL), identifying the currently active protection ring. The CPL is generally 0 for kernel code and 3 for user applications in a nonvirtualized operating system. In most hypervisor implementations, virtual machines run at CPL 3, causing many unsafe x86 instructions to trap upon execution. Unfortunately, for the early adopters of x86 virtualization, not all unsafe x86 instructions in Pentium processors caused traps when executed at CPL 3.

For example, the sidt instruction permits unprivileged code to read the 6-byte interrupt descriptor table register (IDTR) and store it at a location provided as an operand. There is only one IDTR in a physical single-core x86 processor. When a guest operating system executes this instruction, the IDTR contains data associated with the host operating system, which differs from the information the guest operating system expects to retrieve. This will result in erroneous execution of the guest operating system.

Writing to the physical system's IDTR is only possible for code running at CPL 0. When a guest operating system attempts to write to the IDTR while running at CPL 3, a privilege violation occurs and the hypervisor processes the ensuing trap to emulate the write operation by writing to a shadow register instead, which is just a location in memory allocated by the hypervisor. Reads from the IDTR, however, are permitted at CPL 3. User-mode software can read the IDTR and no trap occurs. Without a trap, the hypervisor is unable to intercept the read operation and return data from the shadow register. In short, writes to the IDTR are virtualizable, while reads from the IDTR are not.
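A toy Python model (all register values and guest identifiers here are hypothetical) illustrates the asymmetry: the trapped write is redirected to a shadow register, while the non-trapping read leaks the host's IDTR value to the guest:

```python
class Hypervisor:
    """Toy model of IDTR virtualization. Writes at CPL 3 trap and are
    redirected to a per-guest shadow register; reads do not trap and
    return the physical IDTR, which is the virtualization hole."""

    def __init__(self):
        self.physical_idtr = 0x1000   # host OS value (hypothetical)
        self.shadow_idtr = {}         # per-guest shadow registers

    def guest_lidt(self, guest_id, value, cpl):
        if cpl != 0:
            # Privilege violation: the hypervisor handles the trap by
            # emulating the write into the guest's shadow register.
            self.shadow_idtr[guest_id] = value
        else:
            self.physical_idtr = value

    def guest_sidt(self, guest_id, cpl):
        # sidt does not trap at CPL 3: the guest reads the host's IDTR,
        # not its shadow copy, so the read is not virtualizable.
        return self.physical_idtr

hv = Hypervisor()
hv.guest_lidt(guest_id=1, value=0x2000, cpl=3)   # trapped and shadowed
assert hv.shadow_idtr[1] == 0x2000
assert hv.guest_sidt(guest_id=1, cpl=3) == 0x1000  # host value leaks
```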

Of the hundreds of instructions in the Pentium ISA, 17 were found to be unsafe but non-trapping. In other words, these instructions are nonvirtualizable. For the Pentium x86 architecture, implementing a pure trap-and-emulate virtualization approach is therefore not possible.

The unsafe, but non-trapping, instructions are used frequently in operating systems and device drivers, but are rarely found in application code. The hypervisor must implement a mechanism to detect the presence of unsafe, non-trapping instructions in the code and handle them.

The approach settled on by several popular virtualization engines has been to combine trap-and-emulate virtualization, where possible, with the binary translation of unsafe instructions into functionally equivalent code sequences suitable for the virtualized environment.

Most guest user applications do not attempt to use unsafe instructions at all. This allows them to run at full speed, once the hypervisor has scanned the code to ensure no unsafe instructions are present. Guest kernel code, however, may contain numerous, frequently encountered unsafe instructions. To achieve reasonable performance from binary-translated code, it is necessary to cache the modified code the first time it executes and reuse the cached version on future execution passes.
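The caching behavior can be sketched as follows. The instruction names and the `emulate_` helper calls are purely illustrative, but the structure shows the key idea: translate a code block once, then reuse the cached result on every subsequent execution pass:

```python
# A few of the 17 unsafe, non-trapping Pentium instructions.
UNSAFE = {"sidt", "sgdt", "smsw"}

translation_cache = {}

def translate_block(addr, instructions):
    """Binary-translate a guest code block once; reuse it thereafter."""
    if addr in translation_cache:
        return translation_cache[addr]
    # Replace each unsafe instruction with a (hypothetical) call into
    # hypervisor emulation code; safe instructions pass through as-is.
    translated = [f"call emulate_{op}" if op in UNSAFE else op
                  for op in instructions]
    translation_cache[addr] = translated
    return translated

block = ["mov", "sidt", "add"]
first = translate_block(0x4000, block)
assert first == ["mov", "call emulate_sidt", "add"]
assert translate_block(0x4000, block) is first  # cached copy reused
```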

x86 hardware virtualization

Between 2005 and 2006, Intel and AMD released versions of their x86 processors containing hardware extensions supporting virtualization. These extensions resolved the problems caused by the unsafe but non-trapping instructions, enabling full system virtualization under the Popek and Goldberg criteria. The extensions were named AMD-V in AMD processors and VT-x in Intel processors. The virtualization extensions in modern Intel processors are referred to as VT.

The initial implementations of these hardware virtualization technologies removed the requirement for the binary translation of unsafe instructions, but overall virtual machine performance did not improve substantially, because page table shadowing was still needed, and page table shadowing had been the cause of most of the performance degradation observed during virtual machine execution.

Later versions of hardware virtualization technology removed many of the performance barriers in virtual machine execution, leading to the widespread adoption of x86 virtualization across a variety of domains. Today, multiple tools and frameworks are available for implementing x86 virtualization solutions within a standalone workstation, with options available to scale up to a fully managed data center with potentially thousands of servers, each capable of running several virtual machines simultaneously.

ARM processor virtualization

The ARMv8-A architecture supports virtualization in both the 32-bit and 64-bit (AArch64) execution states. Hardware support for virtualization includes the following:

  • Full trap-and-emulate virtualization
  • A dedicated exception category for hypervisor use
  • Additional registers supporting hypervisor exceptions and stack pointers

The ARMv8-A architecture provides hardware support for the translation of guest memory access requests to physical system addresses.

Systems running ARM processors offer a comprehensive capability for virtual machine execution using either a type 1 or type 2 hypervisor. 64-bit ARM processor performance is comparable to x64 servers with similar specifications. For many applications, such as large data center deployments, the choice between x64 and ARM as the server processor may revolve around factors unrelated to processor performance, such as system power consumption and cooling requirements.

RISC-V processor virtualization

Unlike the other ISAs discussed in this chapter, the architects of the RISC-V ISA included comprehensive virtualization support as a baseline requirement from the beginning of the ISA design. Although not yet a finalized standard, the proposal for the RISC-V hypervisor extension provides a full set of capabilities to support the efficient implementation of type 1 and type 2 hypervisors.

The RISC-V hypervisor extension fully implements the trap-and-emulate virtualization method and provides hardware support for the translation of guest operating system physical addresses to host physical addresses. RISC-V implements the concept of foreground and background control and status registers, which allows the rapid swapping of supervisor registers in and out of operation as virtual machines transition into and out of the running state.

Each hardware thread in RISC-V runs at one of three privilege levels:

  • User (U): This is the same as user privilege in a traditional operating system.
  • Supervisor (S): This is the same as supervisor or kernel mode in a traditional operating system.
  • Machine (M): This is the highest privilege level, with access to all system features.

Individual processor designs may implement all three of these modes, or they may implement the M and S mode pair, or M mode alone. Other combinations are not allowed.

In a RISC-V processor supporting the hypervisor extension, an additional configuration bit, the V bit, controls the virtualization mode. The V bit is set to 1 for hardware threads executing in a virtualized guest. Both the user and supervisor privilege levels can execute with the V bit set to 1. These are named the virtual user (VU) and virtual supervisor (VS) modes. In the RISC-V hypervisor context, supervisor mode with V = 0 is renamed the hypervisor-extended supervisor mode (HS). This name indicates HS is the mode in which the hypervisor itself, regardless of whether it is type 1 or type 2, runs. The remaining privilege level, M mode, only functions in a non-virtualized manner (with V = 0).
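The resulting mode matrix, a (privilege level, V bit) pair, can be summarized in a short sketch. The lookup table below simply restates the combinations described in the text:

```python
# RISC-V privilege/virtualization mode matrix under the hypervisor
# extension: each execution mode is a (privilege, V-bit) pair.
MODES = {
    ("U", 0): "U  (user)",
    ("S", 0): "HS (hypervisor-extended supervisor)",
    ("U", 1): "VU (virtual user)",
    ("S", 1): "VS (virtual supervisor)",
    ("M", 0): "M  (machine)",
}

def mode_name(privilege, v_bit):
    """Map a (privilege, V) pair to its mode name; M mode never
    executes virtualized, so ("M", 1) is rejected."""
    if (privilege, v_bit) not in MODES:
        raise ValueError("invalid privilege/V combination")
    return MODES[(privilege, v_bit)]

assert mode_name("S", 0).startswith("HS")   # the hypervisor runs here
assert mode_name("S", 1).startswith("VS")   # guest kernel
assert mode_name("U", 1).startswith("VU")   # guest application
```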

In both VU and VS modes, RISC-V implements a two-level address translation scheme that converts each guest virtual address first to a guest physical address and then to a supervisor physical address. This procedure efficiently performs the translation from virtual addresses in applications running in guest operating systems to physical addresses in system memory.
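A minimal sketch of this two-stage walk, using hypothetical page numbers and 4 KiB pages, shows the guest virtual address passing through both tables before reaching system memory:

```python
# Two-stage address translation: guest virtual -> guest physical via
# the guest's page table, then guest physical -> host physical via the
# hypervisor's table. All page numbers here are hypothetical.
PAGE = 4096

guest_page_table = {0x10: 0x80}   # guest virtual page -> guest physical page
host_page_table  = {0x80: 0x300}  # guest physical page -> host physical page

def translate(guest_virtual):
    """Translate a guest virtual address to a host physical address."""
    page, offset = divmod(guest_virtual, PAGE)
    guest_physical_page = guest_page_table[page]          # stage 1
    host_physical_page = host_page_table[guest_physical_page]  # stage 2
    return host_physical_page * PAGE + offset

assert translate(0x10 * PAGE + 0x123) == 0x300 * PAGE + 0x123
```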

The next section provides overviews of a number of popular tools that are available for processor and operating system virtualization.

Virtualization tools

In this section, we will look at several widely available open source and commercial tools that implement different forms of processor virtualization. This information may be useful as a starting point when initiating a project involving virtualization.

VirtualBox

VirtualBox is a free, open source type 2 hypervisor from Oracle Corporation. Supported host operating systems include Windows and several Linux variants. One or more guest operating systems on a single host can simultaneously run Windows and a variety of Linux distributions.

Guest OS licensing requirements

For organizations and individuals to remain in compliance with copyright laws, operating systems requiring licensing, such as Windows, must be properly licensed even when running as guest operating systems.

Individual virtual machines can be started, stopped, and paused under the control of the interactive VirtualBox management program or from the command line. VirtualBox has the ability to capture snapshots of executing virtual machines and save them to disk. At a later time, a snapshot can resume execution from the precise point at which it was taken.

VirtualBox requires hardware-assisted virtualization provided by platforms with the AMD-V or Intel VT extensions. A number of mechanisms are provided enabling virtual machines to communicate with the host OS and with each other. A shared clipboard supports copy-and-paste between host and guest machines or from guest to guest. An internal network can be configured within VirtualBox that allows guests to interact with each other as if they were connected on an isolated local area network.

VMware Workstation

VMware Workstation, first released in 1999, is a type 2 hypervisor that runs on 64-bit versions of Windows and Linux. VMware products are offered commercially and require the purchase of licenses by some users. A version of Workstation called VMware Workstation Player is available at no cost with the provision that it can only be used for non-commercial purposes.

VMware Workstation supports the execution of potentially multiple copies of Windows and Linux operating systems within the host Linux or Windows operating system. Like VirtualBox, Workstation can capture snapshots of the virtual machine state, save that information to disk, and later resume execution from the captured state. Workstation also supports host-to-guest and guest-to-guest communication features, such as a shared clipboard and local network emulation.

VMware ESXi

ESXi is a type 1 hypervisor intended for enterprise-class deployments in data centers and cloud server farms. As a type 1 hypervisor, ESXi runs on the bare metal of the host computer system. It has interfaces with the computer system hardware, each guest operating system, and a management interface called the service console.

From the service console, administrators can oversee and manage the operation of a large-scale data center, bringing up virtual machines and assigning them tasks (referred to as workloads). ESXi provides additional features necessary for large-scale deployments, such as performance monitoring and fault detection. In the event of hardware failure or to enable system maintenance, virtual machine workloads can be transitioned seamlessly to different host computers.

KVM

The kernel-based virtual machine (KVM) is an open source type 1 hypervisor, initially released in 2007, that runs within the Linux kernel. KVM supports full virtualization for guest operating systems. When used with x86 or x64 hosts, the system hardware must include the AMD-V or Intel VT virtualization extensions. The KVM hypervisor kernel module is included in the main Linux development line.

KVM supports the execution of one or more virtualized instances of Linux and Windows on a host system without any modification of the guest operating systems.

Although originally developed for the 32-bit x86 architecture, KVM has been ported to x64, ARM, and PowerPC. KVM supports paravirtualization for Linux and Windows guests using the VirtIO API. In this mode, paravirtualized device drivers are provided for Ethernet, disk I/O, and the graphics display.

Xen

Xen, first released in 2003, is a free and open source type 1 hypervisor. The current version of Xen runs on x86, x64, and ARM processors. Xen supports guest virtual machines running under hardware-supported virtualization (AMD-V or Intel VT) or as paravirtualized operating systems. Xen is implemented in the mainline Linux kernel.

The Xen hypervisor runs one virtual machine at the most privileged level, referred to as domain 0, or dom0. The dom0 virtual machine is typically a Linux variant and has full access to the system hardware. The dom0 machine provides the user interface for managing the hypervisor.

Some of the largest commercial cloud service providers, including Amazon EC2, IBM SoftLayer, and Rackspace Cloud, use Xen as their primary hypervisor platform.

Xen supports live migration, where a virtual machine can be migrated from one host platform to another without downtime.

QEMU

QEMU, an abbreviation for quick emulator, is a free and open source emulator that performs hardware virtualization. QEMU can emulate at the level of a single application or an entire computer system. At the application level, QEMU can run individual Linux or macOS applications that were built for a different ISA than the execution environment.

When performing system emulation, QEMU represents a complete computer system, including peripherals. The guest system can use an ISA that differs from the host system. QEMU supports the execution of multiple guest operating systems on a single host simultaneously. Supported ISAs include x86, MIPS, ARMv7, ARMv8, PowerPC, and RISC-V.

QEMU supports the setup and migration of KVM machines, performing hardware emulation in conjunction with the virtual machine running under KVM. Similarly, QEMU can provide hardware emulation for virtual machines running under Xen.

QEMU is unique among virtualization tools in that it is not necessary for it to run at elevated privilege because it entirely emulates the guest system in software. The downside of this approach is the performance degradation resulting from the software emulation process.

The next section will discuss the synergistic effects resulting from implementing cloud computing using virtualization.

Virtualization and cloud computing

The terms virtualization and cloud computing are often tossed about with vague, sometimes overlapping meanings. Here is an attempt to highlight the difference between them:

  • Virtualization is a technology for abstracting software systems from the environment in which they operate.
  • Cloud computing is a methodology for employing virtualization and other technologies to enable the deployment, monitoring, and control of large-scale data centers.

The use of virtualization in cloud computing environments enables the flexible deployment of application workloads across an array of generic computing hardware in a controlled, coherent manner. By implementing applications such as web servers within virtual machines, it is possible to dynamically scale online computing capacity to match varying load conditions.

Commercial cloud service providers generally offer the use of their systems on a pay-per-capacity-used basis. A website that normally receives a fairly small amount of traffic may spike substantially if, for instance, it receives a mention on a national news program. If the site is deployed in a scalable cloud environment, the management software will detect the increased load and bring up additional instances of the website and potentially of the backend database as well. This increased resource usage will result in a larger bill from the cloud service provider, which most businesses will happily pay if the result is a website that remains operational and responsive to user input even under a heavy traffic load.
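The core scaling decision can be reduced to a simple sketch: size the instance pool to the offered load. The request rates and per-instance capacity below are hypothetical, and real cloud autoscalers add smoothing, cooldown periods, and minimum/maximum pool sizes:

```python
import math

def scale_instances(requests_per_sec, capacity_per_instance=100):
    """Return the number of instances needed for the offered load,
    never dropping below one running instance."""
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    return max(1, needed)

assert scale_instances(150) == 2    # normal traffic
assert scale_instances(950) == 10   # traffic spike: scale out
assert scale_instances(40) == 1     # quiet period: scale in
```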

Cloud management environments, such as VMware ESXi and Xen, provide comprehensive tools for the configuration, deployment, management, and maintenance of large-scale cloud operations. These configurations may be intended for local use by an organization, or they may offer public-facing facilities for online service providers such as Amazon Web Services.

Electrical power consumption

Electrical power consumption is a significant expense for cloud service providers. Each computer in a large-scale server farm consumes power whenever it is running, even if it is not performing any useful work. In a facility containing thousands of computers, it is important to the bottom line that servers consume power only when needed by paying customers.

Virtualization helps substantially with the effective utilization of server systems. Since a single server can potentially host several guest virtual machines, customer workloads can be allocated efficiently across server hardware in a manner that avoids low utilization of a large number of computers. Servers that are not needed at a given time can be powered off completely, thereby reducing energy consumption, which, in turn, reduces costs to the cloud provider and enables more competitive pricing for end users.
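The consolidation idea can be sketched as a simple first-fit packing of virtual machine workloads onto servers, so that unused machines can be powered off. The "load units" and server capacity here are hypothetical, and production schedulers weigh many more factors (memory, I/O, affinity):

```python
def pack_workloads(workloads, server_capacity=16):
    """First-fit packing: place each workload on the first server with
    room, powering on a new server only when none fits. Returns the
    number of servers that must remain powered on."""
    servers = []   # each entry is a server's remaining capacity
    for load in workloads:
        for i, remaining in enumerate(servers):
            if load <= remaining:
                servers[i] -= load
                break
        else:
            servers.append(server_capacity - load)  # power on a server
    return len(servers)

# Eight small VMs consolidate onto two servers instead of eight:
assert pack_workloads([4, 4, 4, 4, 4, 4, 4, 4]) == 2
```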

This section has provided a brief introduction to the use of virtualization in the context of cloud computing. Most organizations and individuals that establish a presence on the Internet make use of virtual servers in a cloud computing environment, whether they know it or not.

Summary

This chapter presented the concepts underlying processor virtualization and explained the many benefits to individual users and large organizations achieved through the effective use of virtualization. We examined the principal virtualization techniques and the open source and commercial tools that implement them.

We also saw the benefits of virtualization in the deployment of real-world software applications in cloud environments.

You should now understand the technology and benefits associated with processor virtualization and how modern processor ISAs support virtualization at the instruction set level. We learned about several open source and commercial tools providing virtualization capabilities. You should now understand how virtualization can be used to build and deploy scalable applications in cloud computing environments.

In the next chapter, we will look at the architecture of some specific application categories, including mobile devices, personal computers, gaming systems, systems that process big data, and neural networks.

Exercises

  1. Download and install the current version of VirtualBox. Download, install, and bring up Ubuntu Linux as a virtual machine within VirtualBox. Connect the guest OS to the Internet using a bridged network adapter. Configure and enable clipboard sharing and file sharing between the Ubuntu guest and your host operating system.
  2. Within the Ubuntu operating system you installed in Exercise 1, install VirtualBox and then install and bring up a virtual machine version of FreeDOS, available from https://www.freedos.org/download/. Verify that DOS commands, such as echo Hello World! and mem, perform properly in the FreeDOS virtual machine. After completing this exercise, you will have implemented an instance of nested virtualization.
  3. Create two separate copies of your Ubuntu guest machine in your host system's VirtualBox environment. Configure both Ubuntu guests to connect to the VirtualBox internal network. Set up the two machines with compatible Internet Protocol addresses. Verify each of the machines can receive a response from the other using the ping command. By completing this exercise, you will have configured a virtual network within your virtualized environment.