vmalloc and Friends

The next memory allocation function that I’ll show you is vmalloc, which allocates a contiguous memory region in the virtual address space. Although the pages are not necessarily consecutive in physical memory (each page is retrieved with a separate call to __get_free_page), the kernel sees them as a contiguous range of addresses. The allocated space is mapped only to the kernel segments and is not visible from user space, just like memory obtained with the other allocation techniques. vmalloc returns 0 (the NULL address) if an error occurs; otherwise, it returns a pointer to a linear memory area of size size.

The prototypes of the function, and its relatives, are the following:

void * vmalloc(unsigned long size);
void vfree(void * addr);
void * vremap(unsigned long offset, unsigned long size);

Note that vremap was renamed ioremap in version 2.1. Moreover, Linux 2.1 introduced a new header, <linux/vmalloc.h>, that must be included if you use vmalloc.

vmalloc is different from the other memory allocation functions because it returns “high” addresses--addresses that are higher than the top of physical memory. The processor is able to access the returned memory range because vmalloc arranged the processor’s page tables to access the allocated pages through consecutive “high” addresses. Kernel code can use addresses returned by vmalloc just like any other address, but the address used by the program is not the same as the one that appears on the electrical data bus.
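To make the difference between the two kinds of address concrete, here is a toy user-space model; it is not kernel code and does not show how vmalloc is really implemented. The names toy_mapping and toy_translate, the three-page size, and the 4KB page size are all assumptions made for the illustration. The sketch only shows the idea that consecutive virtual pages can point at scattered physical pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u  /* assumed 4KB pages, as on the x86 */
#define TOY_NPAGES    3u     /* arbitrary size for this sketch */

/* Hypothetical toy page table: consecutive virtual pages,
 * each backed by a physical page that may live anywhere. */
struct toy_mapping {
    uintptr_t virt_base;          /* first virtual address of the area */
    uintptr_t frame[TOY_NPAGES];  /* physical address of each page */
};

/* Translate a virtual address inside the area to the physical
 * address that would appear on the bus; returns 0 if vaddr is
 * outside the mapped area. */
uintptr_t toy_translate(const struct toy_mapping *m, uintptr_t vaddr)
{
    uintptr_t off;

    assert(m != NULL);
    if (vaddr < m->virt_base)
        return 0;
    off = vaddr - m->virt_base;
    if (off >= (uintptr_t)TOY_NPAGES * TOY_PAGE_SIZE)
        return 0;
    return m->frame[off / TOY_PAGE_SIZE] + off % TOY_PAGE_SIZE;
}
```

In this model, walking the virtual range one byte at a time stays contiguous, while the translated address jumps from one physical page to an unrelated one at every page boundary. That jump is why such an address is meaningless to a device sitting on the bus.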

Addresses allocated by vmalloc can’t be used outside of the microprocessor, because they make sense only on top of the processor’s paging unit. When a driver needs a real physical address (such as a DMA address, used by peripheral hardware to drive the system’s bus), it can’t use vmalloc. The right time to call vmalloc is when you are allocating memory for a large sequential buffer that exists only in software. It’s important to note that vmalloc has more overhead than __get_free_pages because it must both retrieve the memory and build the page tables. Therefore, it doesn’t make sense to call vmalloc to allocate just one page.

An example of a function that uses vmalloc is the create_module system call, which uses vmalloc to get space for the module being created. The module itself is later copied to the allocated space using memcpy_fromfs, after insmod has relocated the code.

Memory allocated with vmalloc is released by vfree, in the same way that kfree releases memory allocated by kmalloc.

Like vmalloc, vremap (or ioremap) builds new page tables, but unlike vmalloc, it doesn’t actually allocate any memory. The return value of vremap is a virtual address that can be used to access the specified physical address range; the virtual address obtained is eventually released by calling vfree.

vremap is most useful for mapping a high-memory PCI buffer into the kernel virtual address space. For example, if the frame buffer on the VGA device has been mapped to the address 0xf0000000 (a typical value), vremap can be used to build the correct tables for the processor to access it. System initialization builds page tables only to access memory from address 0 up to the top of physical memory. System initialization does not probe for PCI buffers, but leaves each driver responsible for managing buffers on its own device; PCI issues are explained in more detail in Section 15.1, in Chapter 15. On the other hand, you don’t need to remap the ISA hole below 1MB, because this memory is accessed by other means, described in Section 8.3, in Chapter 8.

If your driver is meant to be portable across different platforms, however, you must be careful when using vremap. Some platforms, such as the Alpha, are unable to directly map PCI memory regions to the processor address space. In this case you can’t access remapped regions like conventional memory, and you need to use readb and the other I/O functions (see Section 8.3.1 in Chapter 8). This set of functions is portable across platforms.

There is almost no limit to how much memory vmalloc can allocate or vremap can map, although vmalloc refuses to allocate more memory than the amount of physical RAM, in order to detect common errors or typos made by programmers. You should remember, however, that requesting too much memory with vmalloc leads to the same problems as it does with kmalloc.

Both vremap and vmalloc are page-oriented (they work by modifying the page tables); thus the relocated or allocated size is rounded up to the nearest page boundary. In addition, vremap won’t even consider remapping a physical address that doesn’t start at a page boundary.
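The rounding and alignment rules just described boil down to a little bit arithmetic. The following is a user-space sketch of that arithmetic only, not the kernel’s own code; the helper names page_round_up and is_page_aligned are made up for this example, and the page size is passed in explicitly because it differs between platforms.

```c
#include <assert.h>

/* Round size up to the next multiple of page_size.
 * page_size must be a power of two (4096 on the x86, for instance). */
unsigned long page_round_up(unsigned long size, unsigned long page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (size + page_size - 1) & ~(page_size - 1);
}

/* Nonzero if addr sits exactly on a page boundary. */
int is_page_aligned(unsigned long addr, unsigned long page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (addr & (page_size - 1)) == 0;
}
```

With 4KB pages, a request for 4097 bytes therefore occupies two full pages, and an offset such as 0xf0000010 would not pass the alignment check, while 0xf0000000 would.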

One minor drawback of vmalloc is that it can’t be used at interrupt time because internally it uses kmalloc(GFP_KERNEL) to acquire storage for the page tables. This shouldn’t be a problem--if the use of __get_free_page isn’t good enough for an interrupt handler, then the software design needs some cleaning up.

A scull Using Virtual Addresses: scullv

Sample code using vmalloc is provided in the scullv module. Like scullp, this module is a stripped-down version of scull that uses a different allocation function to obtain space for the device to store data.

The module allocates memory 16 pages at a time (128KB on the Alpha, 64KB on the x86). The allocation is done in large chunks to achieve better performance than scullp and to show something that takes too long with other allocation techniques to be feasible. Allocating more than one page with __get_free_pages is failure-prone, and even when it succeeds, it can be slow. As we saw earlier, vmalloc is faster than other functions in allocating several pages, but somewhat slower when retrieving a single page, due to the overhead of page-table building. scullv is designed exactly like scullp. The order variable specifies the “order” of each allocation and defaults to 4. The only difference between scullv and scullp is in the following code:

/* Allocate a quantum using virtual addresses */
if (!dptr->data[s_pos]) {
    dptr->data[s_pos] = (void *)vmalloc(PAGE_SIZE << order);
    if (!dptr->data[s_pos])
        return -ENOMEM;
}

/* Release the quantum-set */
for (i = 0; i < qset; i++)
    if (dptr->data[i])
        vfree(dptr->data[i]);
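The expression PAGE_SIZE << order in the allocation is where the "16 pages at a time" figure comes from. As a quick check, here is a user-space sketch of the same arithmetic; quantum_bytes is a hypothetical helper written for this example, and the page sizes passed to it (4096 for the x86, 8192 for the Alpha) are the values implied by the sizes quoted above.

```c
#include <assert.h>

/* Bytes in one quantum: 2^order pages of page_size bytes each. */
unsigned long quantum_bytes(unsigned long page_size, unsigned int order)
{
    assert(order < 16);  /* arbitrary sanity bound for this sketch */
    return page_size << order;
}
```

With the default order of 4, quantum_bytes(4096, 4) gives 65536 (64KB) and quantum_bytes(8192, 4) gives 131072 (128KB). With order 0 the quantum is a single page.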

If you compile both modules with debugging enabled, you can look at their data allocation by reading the files they create in /proc. The following snapshot was taken on my home computer, whose physical addresses go from 0 to 0x1800000 (24MB):

morgana.root# cp /bin/cp /dev/scullp0
morgana.root# cat /proc/scullpmem
Device 0: qset 500, order 0, sz 19652
  item at 0063e598, qset at 006eb018
       0: 150e000
       1:  de6000
       2: 10ca000
       3:  e19000
       4:  bd1000

morgana.root# cp /zImage.last /dev/scullv0
morgana.root# cat /proc/scullvmem

Device 0: qset 500, order 4, sz 289840
  item at 0063ec98, qset at 00b3e810
       0: 2034000
       1: 2045000
       2: 2056000
       3: 2067000
       4: 2078000

It’s apparent from the values shown that scullp allocates physical addresses (within 0x1800000), while scullv uses virtual addresses (but note that the actual values are different with Linux 2.1, as the organization of the virtual address space was changed; see Section 17.9 in Chapter 17).
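The numbers in the snapshot can be checked by hand. The sketch below does so in plain user-space C. The helpers above_physical_top and quanta_needed are made up for this example, and the 4KB page size is an assumption based on the x86 figures quoted earlier.

```c
#include <assert.h>

/* Nonzero if addr lies at or above the top of physical memory. */
int above_physical_top(unsigned long addr, unsigned long phys_top)
{
    return addr >= phys_top;
}

/* Number of quanta needed to hold size bytes, rounded up. */
unsigned long quanta_needed(unsigned long size, unsigned long quantum)
{
    assert(quantum != 0);
    return (size + quantum - 1) / quantum;
}
```

With phys_top set to 0x1800000, every scullp address in the listing (0x150e000, for instance) falls below the top of physical memory, and every scullv address (0x2034000 and up) falls above it. Both files need five quanta: quanta_needed(19652, 4096) and quanta_needed(289840, 65536) each give 5, which matches the five entries in each listing. One further observation: consecutive scullv addresses are 0x11000 bytes apart, which is 17 pages, one more than the 16-page quantum. The text here doesn’t explain the extra page, so take that only as something the numbers show.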
