Kernel memory is allocated at different levels, depending on the desired allocation characteristics. At the lowest level is the page allocator, which allocates unmapped pages from the free lists so the pages can then be mapped into the kernel's address space for use by the kernel.
Allocating memory in pages works well for memory allocations that require page-sized chunks, but there are many places where we need memory allocations smaller than one page; for example, an in-kernel inode requires only a few hundred bytes per inode, and allocating one whole page (8 Kbytes) would be wasteful. For this reason, Solaris has an object-level kernel memory allocator in addition to the page-level allocator to allocate arbitrarily sized requests, stacked on top of the page-level allocator. The kernel also needs to manage where pages are mapped, a function that is provided by the resource map allocator. The high-level interaction between the allocators is shown in Figure 6.3.
We access memory in the kernel by acquiring a section of the kernel's virtual address space and then mapping physical pages to that address. We can acquire the physical pages one at a time from the page allocator by calling page_create_va(), but to use these pages, we first need to map them. A section of the kernel's address space, known as the kernel map, is set aside for general-purpose mappings. (See Figure 6.1 for the location of the sun4u kernelmap; see also Appendix B, “Kernel Virtual Address Maps” for kernel maps on other platforms.)
The kernel map is a separate kernel memory segment containing a large area of virtual address space that is available to kernel consumers who require virtual address space for their mappings. Each time a consumer uses a piece of the kernel map, we must record some information about which parts of the kernel map are free and which parts are allocated, so that we know where to satisfy new requests. To record the information, we use a general-purpose allocator to keep track of the start and length of the mappings that are allocated from the kernel map area. The allocator we use is the resource map allocator, which is used almost exclusively for managing the kernel map virtual address space.
The kernel map area is large, up to 8 Gbytes on 64-bit sun4u systems, and can quickly become fragmented if it accommodates many consumers with different-sized requests. It is up to the resource map allocator to try to keep the kernel map area as unfragmented as possible.
Solaris uses the resource map allocator to manage the kernel map. To keep track of the areas of free memory within the map, the resource map allocator uses a simple algorithm to keep a list of start/length pairs that point to each area of free memory within the map. The map entries are sorted in ascending order to make it quicker to find entries, allowing faster allocation. The map entries are shown in the following map structure, which can be found in the <sys/map.h> header file.
struct map {
        size_t  m_size;   /* size of this segment of the map */
        ulong_t m_addr;   /* resource-space addr of start of segment */
};
The area managed by the resource map allocator is initially described by just one map entry representing the whole area as one contiguous free chunk. As more allocations are made from the area, more map entries are used to describe the area, and as a result, the map becomes even more fragmented over time.
The resource map allocator uses a first-fit algorithm to find space in the map to satisfy new requests, which means that it takes the first available slot in the map that is large enough for the request. The first-fit algorithm provides fast allocation at the expense of map fragmentation over time. For this reason, it is important to ensure that kernel subsystems do not perform an excessive amount of map allocation and freeing. The kernel slab allocator (discussed next) should be used for these types of requests.
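The first-fit behavior described above can be sketched in user space. The following is a minimal illustration only, not the kernel's rmalloc() implementation: the names toy_map, toy_rmalloc(), and toy_rmfree() and the fixed table size are hypothetical, and the free segments are kept as start/length pairs sorted by ascending address, as the text describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAP_ENTRIES 16  /* hypothetical fixed capacity for this sketch */

/* One free segment, mirroring the m_size/m_addr pair of struct map. */
struct toy_seg {
    size_t        size;  /* length of the free segment */
    unsigned long addr;  /* start of the free segment */
};

struct toy_map {
    struct toy_seg seg[TOY_MAP_ENTRIES];  /* free segments, ascending addr */
    int            nseg;
};

/* Start with one entry describing the whole area as free (addr 0 is reserved). */
static void toy_map_init(struct toy_map *m, unsigned long addr, size_t size)
{
    assert(addr != 0 && size > 0);
    m->seg[0].addr = addr;
    m->seg[0].size = size;
    m->nseg = 1;
}

/* Remove entry i, shifting later entries down. */
static void toy_map_delete(struct toy_map *m, int i)
{
    memmove(&m->seg[i], &m->seg[i + 1],
        (size_t)(m->nseg - i - 1) * sizeof (m->seg[0]));
    m->nseg--;
}

/* First fit: carve the request from the first segment that is big enough. */
static unsigned long toy_rmalloc(struct toy_map *m, size_t size)
{
    assert(size > 0);
    for (int i = 0; i < m->nseg; i++) {
        if (m->seg[i].size < size)
            continue;
        unsigned long addr = m->seg[i].addr;
        m->seg[i].addr += size;
        m->seg[i].size -= size;
        if (m->seg[i].size == 0)
            toy_map_delete(m, i);
        return addr;
    }
    return 0;  /* no segment fits */
}

/* Return a range to the map, merging with adjacent free segments. */
static int toy_rmfree(struct toy_map *m, size_t size, unsigned long addr)
{
    int i = 0;
    while (i < m->nseg && m->seg[i].addr < addr)
        i++;  /* i is the sorted insertion point */
    int prev = i > 0 && m->seg[i - 1].addr + m->seg[i - 1].size == addr;
    int next = i < m->nseg && addr + size == m->seg[i].addr;
    if (prev && next) {
        m->seg[i - 1].size += size + m->seg[i].size;
        toy_map_delete(m, i);
    } else if (prev) {
        m->seg[i - 1].size += size;
    } else if (next) {
        m->seg[i].addr = addr;
        m->seg[i].size += size;
    } else {
        if (m->nseg == TOY_MAP_ENTRIES)
            return -1;  /* map too fragmented to record another hole */
        memmove(&m->seg[i + 1], &m->seg[i],
            (size_t)(m->nseg - i) * sizeof (m->seg[0]));
        m->seg[i].addr = addr;
        m->seg[i].size = size;
        m->nseg++;
    }
    return 0;
}
```

Freeing a range that does not touch a neighboring free segment adds a map entry, which is the fragmentation the text warns about; in this sketch, running out of entries makes toy_rmfree() fail.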
Map resource requests are made with the rmalloc() call, and resources are returned to the map by rmfree(). Resource maps are created with the rmallocmap() function and destroyed with the rmfreemap() function. The functions that implement the resource map allocator are shown in Table 6-4.
The segkmem segment driver performs two major functions: It manages the creation of general-purpose memory segments in the kernel address space, and it also provides functions that implement a page-level memory allocator by using one of those segments—the kernel map segment.
The segkmem segment driver implements the segment driver methods described in Section 5.4, “Memory Segments,” to create general-purpose, nonpageable memory segments in the kernel address space. The segment driver does little more than implement the segkmem_create method to simply link segments into the kernel's address space. It also implements protection manipulation methods, which load the correct protection modes via the HAT layer for segkmem segments. The set of methods implemented by the segkmem driver is shown in Table 6-5.
The second function of the segkmem driver is to implement a page-level memory allocator by combined use of the resource map allocator and page allocator. The page-level memory allocator within the segkmem driver is implemented with the function kmem_getpages(). The kmem_getpages() function is the kernel's central allocator for wired-down, page-sized memory requests. Its main client is the second-level memory allocator, the slab allocator, which uses large memory areas allocated from the page-level allocator to allocate arbitrarily sized memory objects. We'll cover more on the slab allocator later in this chapter.
The kmem_getpages() function allocates page-sized chunks of virtual address space from the kernelmap segment. The kernelmap segment is only one of many segments created by the segkmem driver, but it is the only one from which the segkmem driver allocates memory.
The resource map allocator allocates portions of virtual address space within the kernelmap segment but on its own does not allocate any physical memory resources. It is used together with the page allocator, page_create_va(), and the hat_memload() functions to allocate physical mapped memory. The resource map allocator allocates some virtual address space, the page allocator allocates pages, and the hat_memload() function maps those pages into the virtual address space provided by the resource map. A client of the segkmem memory allocator can acquire pages with kmem_getpages and then return them to the map with kmem_freepages, as shown in Table 6-6.
Pages allocated through kmem_getpages are not pageable and are one of the few exceptions in the Solaris environment where a mapped page has no logically associated vnode. To accommodate that case, a special vnode, kvp, is used. All pages created through the segkmem segment have kvp as the vnode in their identity— this allows the kernel to identify wired-down kernel pages.
In this section, we introduce the general-purpose memory allocator, known as the slab allocator. We begin with a quick walk-through of the slab allocator features, then look at how the allocator implements object caching, and follow up with a more detailed discussion on the internal implementation.
Solaris provides a general-purpose memory allocator that provides arbitrarily sized memory allocations. We refer to this allocator as the slab allocator because it consumes large slabs of memory and then allocates smaller requests with portions of each slab. We use the slab allocator for memory requests that are:
Smaller than a page size
Not an even multiple of a page size
Frequently going to be allocated and freed, so would otherwise fragment the kernel map
The slab allocator was introduced in Solaris 2.4, replacing the buddy allocator that was part of the original SVR4 UNIX. The reasons for introducing the slab allocator were as follows:
The SVR4 allocator was slow to satisfy allocation requests.
Significant fragmentation problems arose with use of the SVR4 allocator.
The allocator footprint was large, wasting a lot of memory.
With no clean interfaces for memory allocation, code was duplicated in many places.
The slab allocator solves those problems and dramatically reduces overall system complexity. In fact, when the slab allocator was integrated into Solaris, it resulted in a net reduction of 3,000 lines of code because we could centralize a great deal of the memory allocation code and could remove a lot of the duplicated memory allocator functions from the clients of the memory allocator.
The slab allocator is significantly faster than the SVR4 allocator it replaced. Table 6-7 shows some of the performance measurements that were made when the slab allocator was first introduced.
The slab allocator provides substantial additional functionality, including the following:
General-purpose, variable-sized memory object allocation
A central interface for memory allocation, which simplifies clients of the allocator and reduces duplicated allocation code
Very fast allocation/deallocation of objects
Low fragmentation / small allocator footprint
Full debugging and auditing capability
Coloring to optimize use of CPU caches
Per-processor caching of memory objects to reduce contention
A configurable back-end memory allocator to allocate objects other than regular wired-down memory
The slab allocator uses the term object to describe a single memory allocation unit, cache to refer to a pool of like objects, and slab to refer to a group of objects that reside within the cache. Each object type has one cache, which is constructed from one or more slabs. Figure 6.4 shows the relationship between objects, slabs, and the cache. The example shows 3-Kbyte memory objects within a cache, backed by 8-Kbyte pages.
The slab allocator solves many of the fragmentation issues by grouping different-sized memory objects into separate caches, where each object cache has its own object size and characteristics. Grouping the memory objects into caches of similar size allows the allocator to minimize the amount of free space within each cache by neatly packing objects into slabs, where each slab in the cache represents a contiguous group of pages. Since we have one cache per object type, we would expect to see many caches active at once in the Solaris kernel. For example, we should expect to see one cache with 440 byte objects for UFS inodes, another cache of 56 byte objects for file structures, another cache of 872 bytes for LWP structures, and several other caches.
The allocator has a logical front end and back end. Objects are allocated from the front end, and slabs are allocated from pages supplied by the back-end page allocator. This approach allows the slab allocator to be used for more than regular wired-down memory; in fact, the allocator can allocate almost any type of memory object. The allocator is, however, primarily used to allocate memory objects from physical pages by using kmem_getpages as the back-end allocator.
Caches are created with kmem_cache_create(), once for each type of memory object. Caches are generally created during subsystem initialization, for example, in the init routine of a loadable driver. Similarly, caches are destroyed with the kmem_cache_destroy() function. Caches are named by a string provided as an argument, to allow friendlier statistics and tags for debugging. Once a cache is created, objects can be created within the cache with kmem_cache_alloc(), which creates one object of the size associated with the cache from which the object is created. Objects are returned to the cache with kmem_cache_free().
The slab allocator makes use of the fact that most of the time objects are heavily allocated and deallocated, and many of the slab allocator's benefits arise from resolving the issues surrounding allocation and deallocation. The allocator tries to defer most of the real work associated with allocation and deallocation until it is really necessary, by keeping the objects alive until memory needs to be returned to the back end. The client makes this possible by telling the allocator what the object is being used for, so that the allocator remains in control of the object's true state.
So, what do we really mean by keeping the object alive? If we look at what a subsystem uses memory objects for, we find that a memory object typically consists of two common components: the header or description of what resides within the object and associated locks; and the actual payload that resides within the object. A subsystem typically allocates memory for the object, constructs the object in some way (writes a header inside the object or adds it to a list), and then creates any locks required to synchronize access to the object. The subsystem then uses the object. When finished with the object, the subsystem must deconstruct the object, release locks, and then return the memory to the allocator. In short, a subsystem typically allocates, constructs, uses, deconstructs, and then frees the object.
If the object is being created and destroyed often, then a great deal of work is expended constructing and deconstructing the object. The slab allocator does away with this extra work by caching the object in its constructed form. When the client asks for a new object, the allocator simply creates or finds an available constructed object. When the client returns an object, the allocator does nothing other than mark the object as free, leaving all of the constructed data (header information and locks) intact. The object can be reused by the client subsystem without the allocator needing to construct or deconstruct—the construction and deconstruction is only done when the cache needs to grow or shrink. Deconstruction is deferred until the allocator needs to free memory back to the back-end allocator.
To allow the slab allocator to take ownership of constructing and deconstructing objects, the client subsystem must provide a constructor and destructor method. This service allows the allocator to construct new objects as required and then to deconstruct objects later, asynchronously to the client's memory requests. The kmem_cache_create() interface supports this feature by providing a constructor and destructor function as part of the create request.
The slab allocator also allows slab caches to be created with no constructor or destructor, to allow simple allocation and deallocation of simple raw memory objects.
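The effect of caching objects in their constructed form can be illustrated with a small user-space sketch. This is not the kernel's implementation: toy_cache, toy_cache_alloc(), toy_cache_free(), toy_cache_reap(), and the toy_inode example object are all hypothetical names, and malloc() stands in for the back-end allocator. The point it shows is that the constructor runs only when the cache grows and the destructor only when memory is handed back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header in front of every object; the union keeps the payload aligned. */
union toy_hdr {
    union toy_hdr *next;
    max_align_t    align;
};

struct toy_cache {
    size_t         objsize;
    void         (*ctor)(void *);  /* may be NULL for raw memory objects */
    void         (*dtor)(void *);
    union toy_hdr *freelist;       /* constructed objects not in use */
};

static void toy_cache_init(struct toy_cache *c, size_t objsize,
    void (*ctor)(void *), void (*dtor)(void *))
{
    assert(objsize > 0);
    c->objsize = objsize;
    c->ctor = ctor;
    c->dtor = dtor;
    c->freelist = NULL;
}

/* Reuse a constructed object if one is free; construct only on growth. */
static void *toy_cache_alloc(struct toy_cache *c)
{
    union toy_hdr *h = c->freelist;
    if (h != NULL) {
        c->freelist = h->next;
        return h + 1;
    }
    h = malloc(sizeof (*h) + c->objsize);
    if (h == NULL)
        return NULL;
    if (c->ctor != NULL)
        c->ctor(h + 1);
    return h + 1;
}

/* Freeing only marks the object free; its constructed state stays intact. */
static void toy_cache_free(struct toy_cache *c, void *obj)
{
    union toy_hdr *h = (union toy_hdr *)obj - 1;
    h->next = c->freelist;
    c->freelist = h;
}

/* Shrink the cache: destruct free objects and return their memory. */
static int toy_cache_reap(struct toy_cache *c)
{
    int n = 0;
    while (c->freelist != NULL) {
        union toy_hdr *h = c->freelist;
        c->freelist = h->next;
        if (c->dtor != NULL)
            c->dtor(h + 1);
        free(h);
        n++;
    }
    return n;
}

/* Hypothetical client object with call counters for demonstration. */
struct toy_inode { int lock_ready; int number; };
static int toy_ctor_calls, toy_dtor_calls;

static void toy_inode_ctor(void *obj)
{
    struct toy_inode *ip = obj;
    ip->lock_ready = 1;  /* stands in for lock initialization */
    ip->number = 0;
    toy_ctor_calls++;
}

static void toy_inode_dtor(void *obj)
{
    struct toy_inode *ip = obj;
    ip->lock_ready = 0;
    toy_dtor_calls++;
}
```

Allocating, freeing, and allocating again from such a cache returns the same object with its constructed state untouched, and the constructor counter stays at one until the cache has to grow.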
The slab allocator moves a lot of the complexity out of the clients and centralizes memory allocation and deallocation policies. At some points, the allocator may need to shrink a cache as a result of being notified of a memory shortage by the VM system. At this time, the allocator can free all unused objects by calling the destructor for each object that is marked free and then returning unused slabs to the back-end allocator. A further callback interface is provided in each cache so that the allocator can let the client subsystem know about the memory pressure. This callback is optionally supplied when the cache is created and is simply a function that the client implements to return, by means of kmem_cache_free(), as many objects to the cache as possible.
A good example is a file system, which uses objects to store the inodes. The slab allocator manages inode objects; the cache management, construction, and deconstruction of inodes are handed over to the slab allocator. The file system simply asks the slab allocator for a “new inode” each time it requires one. For example, a file system could call the slab allocator to create a slab cache, as shown below.
inode_cache = kmem_cache_create("inode_cache",
        sizeof (struct inode), 0, inode_cache_constructor,
        inode_cache_destructor, inode_cache_reclaim,
        NULL, NULL, 0);

struct inode *inode = kmem_cache_alloc(inode_cache, 0);
The example shows that we create a cache named inode_cache, with objects of the size of an inode, no alignment enforcement, a constructor and a destructor function, and a reclaim function. The back-end memory allocator is specified as NULL, which by default allocates physical pages from the segkmem page allocator.
We can see from the statistics exported by the slab allocator that the UFS file system uses a similar mechanism to allocate its inodes. We use the netstat -k command to dump the statistics. (We discuss allocator statistics in more detail in “Slab Allocator Statistics”.)
# netstat -k ufs_inode_cache
ufs_inode_cache:
buf_size 440 align 8 chunk_size 440 slab_size 8192 alloc 20248589
alloc_fail 0 free 20770500 depot_alloc 657344 depot_free 678433
depot_contention 85 global_alloc 602986 global_free 578089
buf_constructed 0 buf_avail 7971 buf_inuse 24897 buf_total 32868
buf_max 41076 slab_create 2802 slab_destroy 976 memory_class 0
hash_size 0 hash_lookup_depth 0 hash_rescale 0 full_magazines 0
empty_magazines 0 magazine_size 31 alloc_from_cpu0 9583811
free_to_cpu0 10344474 buf_avail_cpu0 0 alloc_from_cpu1 9404448
free_to_cpu1 9169504 buf_avail_cpu1 0
The allocator interfaces are shown in Table 6-8.
Caches are created with the kmem_cache_create() function, which can optionally supply callbacks for construction, destruction, and cache reclaim notifications. The callback functions are described in Table 6-9.
In addition to object-based memory allocation, the slab allocator provides backward-compatible, general-purpose memory allocation routines. These routines allocate arbitrary-length memory through a malloc()-style interface, kmem_alloc(). The slab allocator maintains a list of caches of various object sizes to accommodate kmem_alloc() requests and simply converts the kmem_alloc() request into a request for an object from the nearest-sized cache. The caches used for kmem_alloc() are named kmem_alloc_n, where n is the size of the objects within the cache (see Section 6.2.4.9, “Slab Allocator Statistics”). The functions are shown in Table 6-10.
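The conversion of a kmem_alloc() request into a request against a fixed-size cache can be sketched as a simple table lookup. The size list below is an assumption for illustration: only some of these sizes (8, 16, 24, 12288, 16384) appear in the kmastat output in this chapter, and the real list is longer. The function names are hypothetical, and the sketch assumes that "nearest-sized" means the smallest cache whose objects are at least as large as the request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative, abbreviated list of kmem_alloc_n object sizes (assumed). */
static const size_t toy_alloc_sizes[] = {
    8, 16, 24, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 12288, 16384
};
#define TOY_NSIZES (sizeof (toy_alloc_sizes) / sizeof (toy_alloc_sizes[0]))

/* Return the object size of the smallest cache that fits, or 0 if none does. */
static size_t toy_alloc_cache_size(size_t request)
{
    for (size_t i = 0; i < TOY_NSIZES; i++) {
        if (request <= toy_alloc_sizes[i])
            return toy_alloc_sizes[i];
    }
    return 0;  /* larger than every cache in this sketch */
}

/* Format the name of the cache that would serve the request. */
static int toy_alloc_cache_name(size_t request, char *buf, size_t len)
{
    size_t n = toy_alloc_cache_size(request);
    assert(buf != NULL && len > 0);
    if (n == 0)
        return snprintf(buf, len, "oversize");
    return snprintf(buf, len, "kmem_alloc_%zu", n);
}
```

Under these assumptions a 13-byte request is served from kmem_alloc_16, so three bytes of each such object go unused; that rounding is the price of reusing the object caches for arbitrary-length requests.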
The slab allocator implements the allocation and management of objects to the front-end clients, using memory provided by the back-end allocator. In our introduction to the slab allocator, we discussed in some detail the virtual allocation units: the object and the slab. The slab allocator implements several internal layers to provide efficient allocation of objects from slabs. The extra internal layers reduce the amount of contention between allocation requests from multiple threads, which ultimately allows the allocator to provide good scalability on large SMP systems.
Figure 6.5 shows the internal layers of the slab allocator. The additional layers provide a cache of allocated objects for each CPU, so a thread can allocate an object from a local per-CPU object cache without having to hold a lock on the global slab cache. For example, if two threads both want to allocate an inode object from the inode cache, then the first thread's allocation request would hold a lock on the inode cache and would block the second thread until the first thread has its object allocated. The per-CPU cache layers overcome this blocking with an object cache per CPU to try to avoid the contention between two concurrent requests. Each CPU has its own short-term cache of objects, which reduces the amount of time that each request needs to go down into the global slab cache.
The layers shown in Figure 6.5 are separated into the slab layer, the depot layer, and the CPU layer. The upper two layers (which together are known as the magazine layer) are caches of allocated groups of objects and use a military analogy of allocating rifle rounds from magazines. Each per-CPU cache has magazines of allocated objects and can allocate objects (rounds) from its own magazines without having to bother the lower layers. The CPU layer needs to allocate objects from the lower (depot) layer only when its magazines are empty. The depot layer refills magazines from the slab layer by assembling objects, which may reside in many different slabs, into full magazines.
The CPU layer caches groups of objects to minimize the number of times that an allocation will need to go down to the lower layers. This means that we can satisfy the majority of allocation requests without having to hold any global locks, thus dramatically improving the scalability of the allocator.
Continuing the military analogy: Three magazines of objects are kept in the CPU layer to satisfy allocation and deallocation requests—a full, a half-allocated, and an empty magazine are on hand. Objects are allocated from the half-empty magazine, and until the magazine is empty, all allocations are simply satisfied from the magazine. When the magazine empties, an empty magazine is returned to the magazine layer, and objects are allocated from the full magazine that was already available at the CPU layer. The CPU layer keeps the empty and full magazine on hand to prevent the magazine layer from having to construct and deconstruct magazines when on a full or empty magazine boundary. If a client rapidly allocates and deallocates objects when the magazine is on a boundary, then the CPU layer can simply use its full and empty magazines to service the requests, rather than having the magazine layer deconstruct and reconstruct new magazines at each request. The magazine model allows the allocator to guarantee that it can satisfy at least a magazine size of rounds without having to go to the depot layer.
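The boundary behavior described above can be sketched with a simplified per-CPU cache. Note the assumptions: this sketch keeps only two magazines per CPU (a loaded one and a spare) rather than the three the text describes, uses a fixed magazine size, and models a trip to the depot layer as a failed call that bumps a counter. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAG_ROUNDS 3  /* hypothetical magazine size */

/* A magazine is a stack of pointers to objects that live in slabs. */
struct toy_magazine {
    void *round[TOY_MAG_ROUNDS];
    int   rounds;
};

struct toy_cpu_cache {
    struct toy_magazine *loaded;      /* magazine currently in use */
    struct toy_magazine *spare;       /* second magazine kept on hand */
    unsigned long        depot_trips; /* times the lower layer was needed */
};

static void toy_cpu_swap(struct toy_cpu_cache *cc)
{
    struct toy_magazine *tmp = cc->loaded;
    cc->loaded = cc->spare;
    cc->spare = tmp;
}

/* Pop a round; NULL means both magazines are empty (go to the depot). */
static void *toy_cpu_alloc(struct toy_cpu_cache *cc)
{
    assert(cc->loaded != NULL && cc->spare != NULL);
    if (cc->loaded->rounds == 0 && cc->spare->rounds > 0)
        toy_cpu_swap(cc);
    if (cc->loaded->rounds == 0) {
        cc->depot_trips++;
        return NULL;
    }
    return cc->loaded->round[--cc->loaded->rounds];
}

/* Push a round; -1 means both magazines are full (go to the depot). */
static int toy_cpu_free(struct toy_cpu_cache *cc, void *obj)
{
    assert(cc->loaded != NULL && cc->spare != NULL);
    if (cc->loaded->rounds == TOY_MAG_ROUNDS &&
        cc->spare->rounds < TOY_MAG_ROUNDS)
        toy_cpu_swap(cc);
    if (cc->loaded->rounds == TOY_MAG_ROUNDS) {
        cc->depot_trips++;
        return -1;
    }
    cc->loaded->round[cc->loaded->rounds++] = obj;
    return 0;
}
```

With one full and one empty magazine on hand, a client that alternates allocations and frees right at a magazine boundary is served entirely by swapping the two magazines, and the depot counter does not move.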
The depot layer assembles groups of objects into magazines. Unlike a slab, a magazine's objects are not necessarily allocated from contiguous memory; rather, a magazine contains a series of pointers to objects within slabs.
The number of rounds per magazine for each cache changes dynamically, depending on the amount of contention that occurs at the depot layer. The more rounds per magazine, the lower the depot contention, but more memory is consumed. Each range of object sizes has an upper and lower magazine size. Table 6-11 shows the magazine size range for each object size.
Object Size Range (bytes) | Minimum Magazine Size (rounds) | Maximum Magazine Size (rounds) |
---|---|---|
0–63 | 15 | 143 |
64–127 | 7 | 95 |
128–255 | 3 | 47 |
256–511 | 1 | 31 |
512–1023 | 1 | 15 |
1024–2047 | 1 | 7 |
2048–16383 | 1 | 3 |
16384– | 1 | 1 |
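The ranges in Table 6-11 can be expressed as a small lookup table. The sketch below copies the table's bounds as given; the structure and function names are hypothetical, and the clamping helper is only an illustration of keeping a requested magazine size within a cache's permitted range, not the kernel's resizing logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row of Table 6-11: largest object size covered, and the bounds. */
struct toy_mag_range {
    size_t max_objsize;  /* inclusive upper end of the object size range */
    int    min_rounds;   /* minimum magazine size */
    int    max_rounds;   /* maximum magazine size */
};

static const struct toy_mag_range toy_mag_table[] = {
    {       63, 15, 143 },
    {      127,  7,  95 },
    {      255,  3,  47 },
    {      511,  1,  31 },
    {     1023,  1,  15 },
    {     2047,  1,   7 },
    {    16383,  1,   3 },
    { SIZE_MAX,  1,   1 },  /* 16384 bytes and larger */
};

/* Find the table row that covers an object size. */
static const struct toy_mag_range *toy_mag_lookup(size_t objsize)
{
    size_t i = 0;
    while (objsize > toy_mag_table[i].max_objsize)
        i++;  /* terminates: the last row covers every size */
    return &toy_mag_table[i];
}

/* Clamp a proposed magazine size to the range allowed for the object size. */
static int toy_mag_clamp(size_t objsize, int wanted)
{
    const struct toy_mag_range *r = toy_mag_lookup(objsize);
    assert(wanted >= 0);
    if (wanted < r->min_rounds)
        return r->min_rounds;
    if (wanted > r->max_rounds)
        return r->max_rounds;
    return wanted;
}
```

For the 440-byte UFS inode objects the lookup gives a range of 1 to 31 rounds, which agrees with the magazine_size of 31 reported for ufs_inode_cache in the netstat -k output shown in this chapter.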
A slab allocator maintenance thread is scheduled every 15 seconds (controlled by the tunable kmem_update_interval) to recalculate the magazine sizes. If significant contention has occurred at the depot level, then the magazine size is bumped up. Refer to Table 6-12 for the parameters that control magazine resizing.
The global slab layer allocates slabs of objects from contiguous pages of physical memory and hands them up to the magazine layer for allocation. The global slab layer is used only when the upper layers need to allocate or deallocate entire slabs of objects to refill their magazines.
The slab is the primary unit of allocation in the slab layer. When the allocator needs to grow a cache, it acquires an entire slab of objects. When the allocator wants to shrink a cache, it returns unused memory to the back end by deallocating a complete slab. A slab consists of one or more pages of virtually contiguous memory carved up into equal-sized chunks, with a reference count indicating how many of those chunks have been allocated.
The contents of each slab are managed by a kmem_slab data structure that maintains the slab's linkage in the cache, its reference count, and its list of free buffers. In turn, each buffer in the slab is managed by a kmem_bufctl structure that holds the freelist linkage, the buffer address, and a back-pointer to the controlling slab.
For objects smaller than 1/8th of a page, the slab allocator builds a slab by allocating a page, placing the slab data at the end, and dividing the rest into equal-sized buffers. Each buffer serves as its own kmem_bufctl while on the freelist. Only the linkage is actually needed, since everything else is computable. These are essential optimizations for small buffers; otherwise, we would end up allocating almost as much memory for kmem_bufctl as for the buffers themselves. The free-list linkage resides at the end of the buffer, rather than the beginning, to facilitate debugging. This location is driven by the empirical observation that the beginning of a data structure is typically more active than the end. If a buffer is modified after being freed, the problem is easier to diagnose if the heap structure (free-list linkage) is still intact. The allocator reserves an additional word for constructed objects so that the linkage does not overwrite any constructed state.
For objects greater than 1/8th of a page, a different scheme is used. Allocating objects from within a page-sized slab is efficient for small objects but not for large ones. The reason for the inefficiency of large-object allocation is that we could fit only one 4-Kbyte buffer on an 8-Kbyte page—the embedded slab control data takes up a few bytes, and two 4-Kbyte buffers would need just over 8 Kbytes. For large objects, we allocate a separate slab management structure from a separate pool of memory (another slab allocator cache, the kmem_slab_cache). We also allocate a buffer control structure for each page in the cache from another cache, the kmem_bufctl_cache. The slab/bufctl/buffer structures are shown in the slab layer in Figure 6.5.
The slab layer solves another common memory allocation problem by implementing slab coloring. If memory objects all start at a common offset (e.g., at 512-byte boundaries), then accessing data at the start of each object could result in the same cache line being used for all of the objects. The issues are similar to those discussed in “The Page Scanner”. To overcome the cache line problem, the allocator applies an offset to the start of each slab, so that buffers within the slab start at a different offset. This approach is also shown in Figure 6.5 by the color offset segment that resides at the start of each memory allocation unit before the actual buffer. Slab coloring results in much better cache utilization and more evenly balanced memory loading.
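The arithmetic behind small-object slabs and coloring can be sketched as follows. Several details here are assumptions rather than statements from the text: the embedded slab data is taken to be 56 bytes (the object size of kmem_slab_cache in the kmastat output), the color is assumed to advance in steps of the cache alignment, and it is assumed to wrap once it would exceed the unused space left in the slab. The names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct toy_slab_layout {
    size_t nbufs;  /* buffers that fit in one slab */
    size_t slack;  /* bytes left over, available for coloring */
};

/* Small objects are those under 1/8th of the slab (page) size. */
static int toy_is_small_object(size_t slab_size, size_t chunk)
{
    return chunk < slab_size / 8;
}

/* Lay out a slab whose control data is embedded at the end of the page. */
static struct toy_slab_layout toy_slab_compute(size_t slab_size,
    size_t slabdata, size_t chunk)
{
    struct toy_slab_layout l;
    assert(chunk > 0 && slabdata < slab_size);
    l.nbufs = (slab_size - slabdata) / chunk;
    l.slack = slab_size - slabdata - l.nbufs * chunk;
    return l;
}

/* Color (starting offset) for the nth slab created in a cache. */
static size_t toy_slab_color(size_t nth, size_t slack, size_t align)
{
    assert(align > 0);
    size_t ncolors = slack / align + 1;  /* offsets 0, align, 2*align, ... */
    return (nth % ncolors) * align;
}
```

For the ufs_inode_cache figures shown in this chapter (buf_size 440, align 8, slab_size 8192), the sketch yields 18 buffers per slab, which is consistent with buf_total 32868 spread over slab_create minus slab_destroy, or 1826, slabs. By the same arithmetic, two 4-Kbyte buffers cannot share an 8-Kbyte slab once the embedded slab data is accounted for, which is why large objects keep their control structures elsewhere.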
The slab allocator parameters are shown in Table 6-12 for reference only. We recommend that none of these values be changed.
Two forms of slab allocator statistics are available: global statistics and per-cache statistics. The global statistics are available through the crash utility and display a summary of the entire cache list managed by the allocator.
# crash
dumpfile = /dev/mem, namelist = /dev/ksyms, outfile = stdout
> kmastat
                           buf   buf   buf   memory    #allocations
cache name                size avail total   in use    succeed fail
----------               ----- ----- ----- --------    ------- ----
kmem_magazine_1             16   483   508     8192       6664    0
kmem_magazine_3             32  1123  1270    40960      55225    0
kmem_magazine_7             64   584   762    49152      62794    0
kmem_magazine_15           128   709   945   122880     194764    0
kmem_magazine_31           256    58    62    16384      24915    0
kmem_magazine_47           384     0     0        0          0    0
kmem_magazine_63           512     0     0        0          0    0
kmem_magazine_95           768     0     0        0          0    0
kmem_magazine_143         1152     0     0        0          0    0
kmem_slab_cache             56   308  2159   139264      22146    0
kmem_bufctl_cache           32  2129  6096   196608      54870    0
kmem_bufctl_audit_cache    184    24 16464  3211264      16440    0
kmem_pagectl_cache          32   102   254     8192     406134    0
kmem_alloc_8                 8  9888 31527   253952  115432346    0
kmem_alloc_16               16  7642 18288   294912  374733170    0
kmem_alloc_24               24  4432 11187   270336   30957233    0
.
.
kmem_alloc_12288         12288     2     4    49152        660    0
kmem_alloc_16384         16384     0    42   688128       1845    0
.
.
streams_mblk                64  3988  5969   385024   31405446    0
streams_dblk_32            128   795  1134   147456   72553829    0
streams_dblk_64            160   716  1650   270336  196660790    0
.
.
streams_dblk_8096         8192    17    17   139264  356266482    0
streams_dblk_12192       12288     8     8    98304   14848223    0
streams_dblk_esb            96     0     0        0     406326    0
stream_head_cache          328    68   648   221184     492256    0
queue_cache                456   109  1513   729088    1237000    0
syncq_cache                120    48    67     8192        373    0
qband_cache                 64   125   635    40960       1303    0
linkinfo_cache              48   156   169     8192         90    0
strevent_cache              48   153   169     8192    5442622    0
as_cache                   120    45   201    24576     158778    0
seg_skiplist_cache          32   540  1524    49152    1151455    0
anon_cache                  48  1055 71825  3481600    7926946    0
anonmap_cache               48   551  4563   221184    5805027    0
segvn_cache                 88   686  6992   622592    9969087    0
flk_edges                   48     0     0        0          1    0
physio_buf_cache           224     0     0        0   98535107    0
snode_cache                240    39   594   147456    1457746    0
ufs_inode_cache            440  8304 32868 14958592   20249920    0
.
.
----------               ----- ----- ----- --------    ------- ----
permanent                    -     -     -    98304        501    0
oversize                     -     -     -  9904128     406024    0
----------               ----- ----- ----- --------    ------- ----
Total                        -     -     - 58753024 2753193059    0
The kmastat command shows summary information for each cache and a systemwide summary at the end. The columns are described in Table 6-13.
Parameter | Description |
---|---|
Cache name | The name of the cache, as supplied during kmem_cache_create(). |
buf_size | The size of each object within the cache in bytes. |
buf_avail | The number of free objects in the cache. |
buf_total | The total number of objects in the cache. |
Memory in use | The amount of physical memory consumed by the cache in bytes. |
Allocations succeeded | The number of allocations that succeeded. |
Allocations failed | The number of allocations that failed. These are likely to be allocations that specified KM_NOSLEEP during memory pressure. |
A more detailed version of the per-cache statistics is exported by the kstat mechanism. You can use the netstat -k command to display the cache statistics, which are described in Table 6-14.
# netstat -k ufs_inode_cache
ufs_inode_cache:
buf_size 440 align 8 chunk_size 440 slab_size 8192 alloc 20248589
alloc_fail 0 free 20770500 depot_alloc 657344 depot_free 678433
depot_contention 85 global_alloc 602986 global_free 578089
buf_constructed 0 buf_avail 7971 buf_inuse 24897 buf_total 32868
buf_max 41076 slab_create 2802 slab_destroy 976 memory_class 0
hash_size 0 hash_lookup_depth 0 hash_rescale 0 full_magazines 0
empty_magazines 0 magazine_size 31 alloc_from_cpu0 9583811
free_to_cpu0 10344474 buf_avail_cpu0 0 alloc_from_cpu1 9404448
free_to_cpu1 9169504 buf_avail_cpu1 0
The slab allocator includes a general-purpose allocation tracing facility that tracks the allocation history of objects. The facility is switched off by default and can be enabled by setting the system variable kmem_flags. The tracing facility captures the stack and history of allocations into a slab cache whose name is the name of the cache being traced with .DEBUG appended. Audit tracing can be enabled by the following:
Setting kmem_flags to indicate the type of tracing desired, usually 0x1F to indicate all tracing
Booting the system with kadb -d and setting kmem_flags before startup
The following simple example shows how to trace a cache that is created on a large system, after the flags have been set. To enable tracing on all caches, the system must be booted with kadb and the kmem_flags variable set. The steps for such booting are shown below.
ok boot kadb -d
Resetting ...
Sun Ultra 1 UPA/SBus (UltraSPARC 167MHz), No Keyboard
OpenBoot 3.1, 128 MB memory installed, Serial #8788108.
Ethernet address 8:0:20:86:18:8c, Host ID: 8086188c.
Rebooting with command: boot kadb -d
Boot device: /sbus/SUNW,fas@e,8800000/sd@0,0 File and args: kadb -d
kadb: <return>
kadb[0]: kmem_flags/D
kmem_flags:
kmem_flags: 0
kadb[0]: kmem_flags/W 0x1f
kmem_flags: 0x0 = 0x1f
kadb[0]: :c
SunOS Release 5.7 Version Generic 64-bit
Copyright 1983-2000 Sun Microsystems, Inc. All rights reserved.
Note that the total number of allocations traced will be limited by the size of the audit cache parameters, shown in Table 6-12. Table 6-15 shows the parameters that control kernel memory debugging.