Types of Memory

The kernel deals with multiple types of memory, so understanding the difference is key to implementing a successful driver or kernel extension.

The types of memory can be categorized as:

  • CPU physical address
  • bus physical address
  • user and kernel virtual addresses

In addition to these three types of memory addresses, the amount of addressable memory differs between architectures, with address widths of either 32 or 64 bits. Byte ordering also depends on the architecture and can be either little-endian or big-endian.

The following sections will discuss the importance and usage of each type of memory as it applies to kernel programming.

CPU Physical Address

A physical address refers to the addressing system used by the CPU to access physical memory. Typically, physical addresses are hidden behind the Memory Management Unit (MMU) of the CPU. The MMU translates the virtual addresses normally used by the kernel and user space into physical addresses. The physical address space is linear and runs from 0 to 0xffffffff (2^32 - 1) on 32-bit systems and from 0 to 0xffffffffffffffff (2^64 - 1) on 64-bit systems. Accesses to physical memory are cached in smaller memory buffers, such as the L1 and L2 caches typically contained on the CPU die.

It is generally unnecessary to deal directly with physical addresses, even when writing drivers.

Bus Physical Addresses

The introduction of 64-bit computing presented a challenge, as legacy I/O buses such as PCI and PCI-X were unable to access memory addresses above the 32-bit limit. To work around this, PowerPC G5-based Macs had an additional MMU on their north bridge, used for remapping memory from 64-bit addresses into 32-bit addresses that a device can read from. This MMU is referred to as the Device Address Resolution Table (DART). The DART presents the translated memory to the device as physical addresses; however, these addresses are translated and are not the same physical addresses that the CPU uses. Intel-based computers have a similar capability known as the I/O memory management unit (IOMMU), part of Intel's Virtualization Technology for Directed I/O (VT-d).

A bus physical address appears to be a physical address to a hardware device, though in reality it is a virtual address translated by the DART. If you are confused, don't worry; you rarely have to deal with these addresses. In fact, I/O Kit performs all the required translations for you automatically if you use IOMemoryDescriptor, which is discussed later in this chapter. Drivers can use the IOPhysicalAddress type to handle physical addresses. The size of the type depends on the underlying architecture. Because of Physical Address Extension (PAE), it may be 64 bits wide even on 32-bit systems.

User and Kernel Virtual Addresses

Virtual addresses are linear addresses that are translated into physical addresses by a special chip on the CPU called the Memory Management Unit (MMU). Each user space process has its own memory address space, and for all intents and purposes it looks like a process owns all physical memory. It may use any memory location in its address space, even on addresses located beyond the amount of physical memory. The virtual address space appears linear to a process, although the memory that backs it may be fragmented.

In Mac OS X, the entire virtual address space is available for a process to use. On a 32-bit system this includes memory addresses from 0–4 GB. Operating systems such as Microsoft Windows or Linux use a split model, where the kernel is mapped into the virtual address space of each process. For example, on Windows (32-bit), user space virtual memory occupies addresses from 0 to 0x7FFFFFFF, whereas memory addresses reserved for the kernel go from 0x80000000 to 0xFFFFFFFF. Because the kernel is already mapped, the CPU doesn't have to change the page tables when a process context switches into kernel mode (an already expensive operation). The downside of this approach is that the kernel and user space processes have less address space available, and hence in the case of Windows, only 2 GB can be accessed at any given time by either the kernel or a user space process. On Linux, the split is typically 3 GB/1 GB, with only 1 GB available to the kernel (though everything in Linux is configurable and other configurations are also available). If the system has a GPU enabled, this typically comes with onboard memory of up to 1 GB, which has to be mapped into virtual address space and may result in some physical memory being unable to be used as the GPU's large frame buffer shadows it.

To avoid the shadowing problem, Mac OS X has completely separate address spaces for the kernel (4 GB) and for user space processes (4 GB). As mentioned, the downside is more expensive context switching, because the page tables must be changed on every transition into the kernel.

The 64-bit kernel introduced in Mac OS X 10.6 Snow Leopard solved the problem of limited address space once and for all. In 64-bit kernels, the kernel address space is always mapped in. Mac OS X splits the address space so that the upper 128 terabytes (!) are reserved for the kernel, while the lower 128 terabytes belong to the currently running user space task. Though the address space is shared with user space, tasks are not able to access kernel memory because of page protection flags.
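The split described above can be sketched with a little address arithmetic. The following is an illustration only, not the kernel's actual definitions: it assumes each 128-terabyte half spans 2^47 bytes and that the kernel half sits at the very top of the 64-bit range. All macro and function names (HALF_SPAN, is_user_address, and so on) are hypothetical. The 32-bit helper mirrors the 2 GB/2 GB split described earlier for 32-bit Windows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 128 terabytes expressed in bytes: 2^47 (assumed size of each half). */
#define HALF_SPAN  (UINT64_C(1) << 47)

/* Lower half: 0 .. 2^47 - 1, used by the current user space task. */
#define USER_MAX   (HALF_SPAN - 1)

/* Upper half: the top 2^47 bytes of the 64-bit range (assumed layout). */
#define KERNEL_MIN (UINT64_MAX - HALF_SPAN + 1)

static_assert(KERNEL_MIN == UINT64_C(0xffff800000000000),
              "the upper half starts 2^47 bytes below the top of the range");

/* True if addr falls in the lower (user) half. */
bool is_user_address(uint64_t addr)
{
    return addr <= USER_MAX;
}

/* True if addr falls in the upper (kernel) half. */
bool is_kernel_address(uint64_t addr)
{
    return addr >= KERNEL_MIN;
}

/* 32-bit split model: kernel addresses start at 0x80000000. */
bool is_kernel_address_split32(uint32_t addr)
{
    return addr >= UINT32_C(0x80000000);
}
```

Note that under these assumptions the addresses between the two halves belong to neither range, so both 64-bit predicates return false for them.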

A virtual memory address may not always be backed by a physical memory location, as memory may have been migrated to an external backing store, such as a hard drive, because it was infrequently used or because a running process required more memory than was available. If the CPU accesses an address and the memory for the address is not resident, it will result in a page fault exception. The pager, a component of the OS, will attempt to fetch the page containing the given memory address.

The first page (0–4 KB) of the virtual address space is inaccessible to a process and an exception will be generated if access is attempted.

The architecture agnostic type IOVirtualAddress can be used to handle virtual addresses in I/O Kit code. This type is, again, the alias of mach_vm_address_t, the type for virtual memory addresses in the Mach layer.

Tip: For a more detailed discussion about virtual memory, see Chapter 1, or for details about the OS X and iOS implementation, see Chapter 2.

Memory Ordering: Big vs. Little Endian

Endianness refers to the ordering of the bytes of a multi-byte binary word in memory. The ordering will be either little-endian or big-endian, depending on the CPU architecture that is used. The effects of this can be illustrated with a simple C program, as shown in Listing 6-1.

Listing 6-1. Print the Byte Order of a 32-bit Word

#include <stdio.h>
#include <stdint.h>

int main(int argc, char *argv[])
{
        uint32_t word = 0xaabbccdd;
        /* View the word one byte at a time, in increasing address order. */
        uint8_t *ptr = (uint8_t *)&word;
        printf("%02x %02x %02x %02x\n", ptr[0], ptr[1], ptr[2], ptr[3]);
        return 0;
}

On a little-endian system, the program prints:

dd cc bb aa

On a big-endian system, it prints:

aa bb cc dd

As you can see, the ordering is reversed on little-endian systems. All current-generation Macs are little-endian, as the Intel x86/x86_64 processors are little-endian; so too are ARM-based iOS devices. The older PowerPC-based Macs were big-endian. Why should you care about big-endian then? Well, some hardware architectures or network protocols, such as TCP/IP, use big-endian; additionally, your driver or kernel extensions may have to be compatible with older Macs that are based on the PowerPC architecture. Furthermore, OS X has support for Rosetta, which emulates PowerPC applications on Intel-based Macs. It is possible your driver will be accessed by a Rosetta client task. Some user space APIs, such as the Carbon File Manager, also work with big-endian data structures.
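As a minimal sketch of handling big-endian data on a little-endian host, the following user space example uses the POSIX htonl() and ntohl() functions to convert between host and network (big-endian) byte order. The helper names put_be32 and get_be32 are hypothetical. Kernel code would typically rely on the byte-swapping macros in libkern/OSByteOrder.h, such as OSSwapHostToBigInt32(), rather than on these user space calls.

```c
#include <arpa/inet.h>  /* htonl(), ntohl() */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit host value into a buffer in big-endian (network) order. */
void put_be32(uint8_t out[4], uint32_t host_value)
{
    assert(out != NULL);
    uint32_t wire = htonl(host_value);  /* no-op on big-endian hosts */
    memcpy(out, &wire, sizeof wire);
}

/* Read a 32-bit big-endian value from a buffer into host byte order. */
uint32_t get_be32(const uint8_t in[4])
{
    assert(in != NULL);
    uint32_t wire;
    memcpy(&wire, in, sizeof wire);
    return ntohl(wire);
}
```

Because the conversion functions do nothing on a big-endian host, code written this way produces the same byte layout in the buffer regardless of the CPU it runs on.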

The C pre-processor macros __LITTLE_ENDIAN__ and __BIG_ENDIAN__ are defined by the compiler and can be used to determine the byte order at compile time.
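A sketch of how these macros might be used is shown below. The HOST_IS_LITTLE_ENDIAN name is hypothetical, and the fallback to __BYTE_ORDER__ is an assumption added so that the example also builds with compilers that do not define the two macros named above. The runtime function repeats the test from Listing 6-1 as a cross-check.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__LITTLE_ENDIAN__)
#define HOST_IS_LITTLE_ENDIAN 1
#elif defined(__BIG_ENDIAN__)
#define HOST_IS_LITTLE_ENDIAN 0
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
/* Fallback (assumption): macros provided by GCC and Clang on other platforms. */
#define HOST_IS_LITTLE_ENDIAN (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#else
#error "Cannot determine the byte order at compile time"
#endif

/* Runtime check: does the lowest address hold the least significant byte? */
int runtime_is_little_endian(void)
{
    const uint32_t word = 0xaabbccdd;
    const uint8_t *ptr = (const uint8_t *)&word;
    return ptr[0] == 0xdd;
}

/* Sanity check that the compile-time and runtime answers agree. */
void check_byte_order(void)
{
    assert(runtime_is_little_endian() == HOST_IS_LITTLE_ENDIAN);
}
```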

32-bit vs. 64-bit Memory Addressing

Modern Mac OS X systems are now 64-bit. By 64-bit, we mean the CPU's ability to work with addresses of a 64-bit width, including general-purpose registers, and the ability to use a 64-bit data bus and 64-bit virtual memory addressing.

Table 6-1 shows the supported addressing modes and native pointer sizes of architectures supported by OS X and iOS.

[Table 6-1. Supported addressing modes and native pointer sizes]

Because it is possible for the kernel to be running in 32-bit mode while an application runs in 64-bit mode, great care must be taken when a 64-bit process exchanges data with the kernel, for example, through an ioctl() or an IOUserClient method. The same is true when running a 64-bit kernel and communicating with a 32-bit application. The problem is that 32-bit and 64-bit compilers may define data types differently. For example, the C data type long is 4 bytes wide in 32-bit programs and 8 bytes in a program compiled for a 64-bit instruction set.
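One common way to sidestep the problem is to build any structure that crosses the user/kernel boundary from fixed-width types only. The sketch below assumes a hypothetical request structure, example_request_t, and a hypothetical helper, make_request; neither is part of any real driver interface. The pointer is carried as a 64-bit integer so that the layout is the same whether the code is compiled for a 32-bit or a 64-bit target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request shared between a user space client and a driver. */
typedef struct {
    uint64_t buffer_address;  /* user pointer carried as a 64-bit integer */
    uint32_t buffer_length;   /* length of the buffer in bytes */
    uint32_t flags;           /* option bits, unused in this sketch */
} example_request_t;

/* The layout must not depend on the word size of the compiler target. */
static_assert(sizeof(example_request_t) == 16, "unexpected structure size");
static_assert(offsetof(example_request_t, buffer_length) == 8,
              "unexpected field offset");

/* Fill in a request; the pointer is widened through uintptr_t. */
example_request_t make_request(const void *buffer, uint32_t length)
{
    example_request_t request;
    request.buffer_address = (uint64_t)(uintptr_t)buffer;
    request.buffer_length = length;
    request.flags = 0;
    return request;
}
```

Placing the 64-bit field first and following it with two 32-bit fields keeps every member naturally aligned, so no compiler-dependent padding is introduced. A field declared as long or as a raw pointer would change size between 32-bit and 64-bit builds and break the static assertions.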
