Much of the power of pointers stems from their ability to track dynamically allocated memory. The management of this memory through pointers forms the basis for many operations, including those used to manipulate complex data structures. To be able to fully exploit these capabilities, we need to understand how dynamic memory management occurs in C.
A C program executes within a runtime system. This is typically the environment provided by an operating system. The runtime system supports the stack and heap along with other program behavior.
Memory management is central to all programs. Sometimes memory is managed by the runtime system implicitly, such as when memory is allocated for automatic variables. In this case, variables are allocated to the enclosing function’s stack frame. In the case of static and global variables, memory is placed in the application’s data segment, where it is zeroed out. This is a separate area from executable code and other data managed by the runtime system.
The ability to allocate and then deallocate memory allows an application to manage its memory more efficiently and with greater flexibility. Instead of having to allocate memory to accommodate the largest possible size for a data structure, only the actual amount required needs to be allocated.
For example, arrays are fixed size in versions of C prior to C99. If we need to hold a variable number of elements, such as employee records, we would be forced to declare an array large enough to hold the maximum number of employees we believe would be needed. If we underestimate the size, we are forced to either recompile the application with a larger size or to take other approaches. If we overestimate the size, then we will waste space. The ability to dynamically allocate memory also helps when dealing with data structures using a variable number of elements, such as a linked list or a queue.
C99 introduced Variable Length Arrays (VLAs). The array’s size is determined at runtime and not at compile time. However, once created, arrays still do not change size.
Languages such as C also support dynamic memory management, where objects are allocated memory from the heap. This is done manually, using functions to allocate and deallocate memory.
We start this chapter with a quick overview of how memory is allocated and freed. Next, we present basic allocation functions such as malloc and realloc. The free function is discussed, including the use of NULL along with such problems as double free.
Dangling pointers are a common problem. We will present examples to illustrate when dangling pointers occur and techniques to handle the problem. The last section presents alternate techniques for managing memory. Improper use of pointers can result in unpredictable behavior. By this we mean the program can produce invalid results, corrupt data, or terminate abnormally.
The basic steps used for dynamic memory allocation in C are:
Use a malloc type function to allocate memory
Use this memory to support the application
Deallocate the memory using the free function
While there are some minor variations to this approach, this is the most common technique. In the following example, we allocate memory for an integer using the malloc function. The value 5 is assigned to the allocated memory through the pointer, and then the memory is released using the free function:
int *pi = (int*) malloc(sizeof(int));
*pi = 5;
printf("*pi: %d\n", *pi);
free(pi);
When this sequence is executed, it will display the number 5. Figure 2-1 illustrates how memory is allocated right before the free function is executed. For the purposes of this chapter, we will assume that the example code is found in the main function unless otherwise noted.
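The three steps can be sketched as a small function with the required headers included. This is only an illustration: the name store_and_print and its return convention are assumptions made here and do not come from the text.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocates an int, stores value in it, prints it,
   and frees it. Returns 0 on success or -1 if malloc fails. */
int store_and_print(int value) {
    int *pi = (int*) malloc(sizeof(int));   /* Step 1: allocate */
    if (pi == NULL) {
        return -1;                          /* Allocation failed */
    }
    *pi = value;                            /* Step 2: use the memory */
    printf("*pi: %d\n", *pi);
    free(pi);                               /* Step 3: deallocate */
    return 0;
}
```

Calling store_and_print(5) from main follows the same sequence as the example above, with a check of the pointer returned by malloc added.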
The malloc function’s single argument specifies the number of bytes to allocate. If successful, it returns a pointer to memory allocated from the heap. If it fails, it returns a null pointer. Testing the validity of an allocated pointer is discussed in Using the malloc Function.
The sizeof operator makes the application more portable and determines the correct number of bytes to allocate for the host system.
In this example, we are trying to allocate enough memory for an integer. If we assume its size is 4, we can use:
int *pi = (int*) malloc(4);
However, the size of an integer can vary, depending on the memory model used. A portable approach is to use the sizeof operator. This will return the correct size regardless of where the program is executing.
A common error involving the dereference operator is demonstrated below:
int *pi;
*pi = (int*) malloc(sizeof(int));
The problem is with the lefthand side of the assignment operation. We are dereferencing the pointer. This attempts to store the address returned by malloc at the location whose address is held in pi, rather than in pi itself. If this is the first time an assignment is made to the pointer, then the address contained in the pointer is probably invalid. The correct approach is shown below:
pi = (int*) malloc(sizeof(int));
The dereference operator should not be used in this situation.
The free function, also discussed in more detail later, works in conjunction with malloc to deallocate the memory when it is no longer needed.
Each time the malloc function (or similar function) is called, a corresponding call to the free function must be made when the application is done with the memory to avoid memory leaks.
Once memory has been freed, it should not be accessed again. Normally, you would not intentionally access it after it had been deallocated. However, this can occur accidentally, as illustrated in the section Dangling Pointers. The behavior of the program is undefined when this happens. A common practice is to always assign NULL to a freed pointer, as discussed in Assigning NULL to a Freed Pointer.
When memory is allocated, additional information is stored as part of a data structure maintained by the heap manager. This information includes, among other things, the block’s size, and is typically placed immediately adjacent to the allocated block. If the application writes outside of this block of memory, then the data structure can be corrupted. This can lead to strange program behavior or corruption of the heap, as we will see in Chapter 7.
Consider the following code sequence. Memory is allocated for a string, allowing it to hold up to five characters plus the byte for the NUL termination character. The for loop writes zeros to each location but does not stop after writing six bytes. The for statement’s terminal condition requires that it write eight bytes. The zeros being written are binary zeros and not the ASCII value for the character zero:
char *pc = (char*) malloc(6);
for(int i=0; i<8; i++) {
    pc[i] = 0;    // Writes past the end of the block when i is 6 or 7
}
In Figure 2-2, extra memory has been allocated at the end of the six-byte string. This represents the extra memory used by the heap manager to keep track of the memory allocation. If we write past the end of the string, this extra memory will be corrupted. The extra memory is shown following the string in this example. However, its actual placement and its original content depend on the heap manager’s implementation.
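One way to avoid this kind of overrun is to drive both the allocation and the loop from a single size constant. The following is a minimal sketch; NAME_SIZE and make_cleared_name are names invented here for illustration.

```c
#include <stdlib.h>

#define NAME_SIZE 6   /* Five characters plus the NUL terminator */

/* Hypothetical helper: returns a block of NAME_SIZE bytes filled with
   binary zeros, or NULL if allocation fails. The caller frees it. */
char *make_cleared_name(void) {
    char *pc = (char*) malloc(NAME_SIZE);
    if (pc == NULL) {
        return NULL;
    }
    /* The loop bound matches the allocation size, so no byte outside
       the block is written */
    for (size_t i = 0; i < NAME_SIZE; i++) {
        pc[i] = 0;
    }
    return pc;
}
```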
A memory leak occurs when allocated memory is never used again but is not freed. This can happen when:
The memory’s address is lost
The free function is never invoked though it should be (sometimes called a hidden leak)
A problem with memory leaks is that the memory cannot be reclaimed and used later. The amount of memory available to the heap manager is decreased. If memory is repeatedly allocated and then lost, then the program may terminate when more memory is needed but malloc cannot allocate it because it ran out of memory. In extreme cases, the operating system may crash.
This is illustrated in the following simple example:
char *chunk;
while (1) {
    chunk = (char*) malloc(1000000);
    printf("Allocating\n");
}
The variable chunk is assigned memory from the heap. However, this memory is not freed before another block of memory is assigned to it. Eventually, the application will run out of memory and terminate abnormally. At minimum, memory is not being used efficiently.
An example of losing the address of memory is illustrated in the following code sequence where pi is reassigned a new address. The address of the first allocation of memory is lost when pi is allocated memory a second time.
int *pi = (int*) malloc(sizeof(int));
*pi = 5;
...
pi = (int*) malloc(sizeof(int));
This is illustrated in Figure 2-3 where the before and after images refer to the program’s state before and after the second malloc’s execution. The memory at address 500 has not been released, and the program no longer holds this address anywhere.
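The leak disappears if the first block is released while its address is still held. The sketch below shows one possible arrangement; the function replace_int and its return convention are assumptions for illustration only.

```c
#include <stdlib.h>

/* Hypothetical helper: gives *ppi a fresh block holding value and frees
   the block it previously referenced. Returns 0 on success, or -1 if
   malloc fails, in which case the old block is left untouched. */
int replace_int(int **ppi, int value) {
    int *fresh = (int*) malloc(sizeof(int));
    if (fresh == NULL) {
        return -1;
    }
    *fresh = value;
    free(*ppi);     /* Release the old block before its address is lost */
    *ppi = fresh;
    return 0;
}
```

Allocating the new block before freeing the old one means the caller still has valid memory if the second allocation fails.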
Another example allocates memory for a string, initializes it, and then displays the string character by character:
char *name = (char*) malloc(strlen("Susan")+1);
strcpy(name, "Susan");
while(*name != 0) {
    printf("%c", *name);
    name++;
}
However, the loop increments name by one with each iteration. At the end, name is left pointing to the string’s NUL termination character, as illustrated in Figure 2-4. The allocated memory’s starting address has been lost.
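A common way to keep the starting address is to walk the string with a second pointer and leave name untouched. The following sketch assumes a helper named print_chars, which is not part of the original example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies text to the heap, prints it one character
   at a time, and frees the copy. Returns the number of characters
   printed, or -1 if allocation fails. */
int print_chars(const char *text) {
    char *name = (char*) malloc(strlen(text) + 1);
    if (name == NULL) {
        return -1;
    }
    strcpy(name, text);

    int count = 0;
    /* cursor moves through the string; name keeps the block's address */
    for (char *cursor = name; *cursor != 0; cursor++) {
        printf("%c", *cursor);
        count++;
    }
    printf("\n");

    free(name);   /* Valid because name was never modified */
    return count;
}
```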
Memory leaks can also occur when the program should release memory but does not. A hidden memory leak occurs when an object is kept in the heap even though the object is no longer needed. This is frequently the result of programmer oversight. The primary problem with this type of leak is that the object is using memory that is no longer needed and should be returned to the heap. In the worst case, the heap manager may not be able to allocate memory when requested, possibly forcing the program to terminate. At best, we are holding unneeded memory.
Memory leaks can also occur when freeing structures created using the struct keyword. If the structure contains pointers to dynamically allocated memory, then these pointers may need to be freed before the structure is freed. An example of this is found in Chapter 6.
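As a brief preview of the idea, the sketch below frees a structure’s dynamically allocated member before freeing the structure itself. The Person type and the person_create and person_destroy functions are hypothetical and are not the example used in Chapter 6.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure holding a pointer to heap memory */
typedef struct {
    char *name;   /* Dynamically allocated string */
    int age;
} Person;

/* Allocates a Person and a private copy of name; NULL on failure */
Person *person_create(const char *name, int age) {
    Person *p = (Person*) malloc(sizeof(Person));
    if (p == NULL) {
        return NULL;
    }
    p->name = (char*) malloc(strlen(name) + 1);
    if (p->name == NULL) {
        free(p);
        return NULL;
    }
    strcpy(p->name, name);
    p->age = age;
    return p;
}

/* Frees the member first, then the structure that holds it */
void person_destroy(Person *p) {
    if (p == NULL) {
        return;
    }
    free(p->name);
    free(p);
}
```

Freeing only the structure would lose the only copy of the address stored in name, leaking that block.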
Several memory allocation functions are available to manage dynamic memory. While what is available may be system dependent, the following functions are found on most systems in the stdlib.h header file:
malloc
realloc
calloc
free
The functions are summarized in Table 2-1.
Dynamic memory is allocated from the heap. With successive memory allocation calls, there is no guarantee regarding the order of the memory or the continuity of memory allocated. However, the memory allocated will be suitably aligned for any object type, so it can safely hold the data type the pointer refers to. For example, a four-byte integer would be allocated on an address boundary evenly divisible by four. The address returned by the heap manager will contain the lowest byte’s address.
In Figure 2-3, the malloc function allocates four bytes at address 500. The second use of the malloc function allocates memory at address 600. They both are on four-byte address boundaries, and they did not allocate memory from consecutive memory locations.
The function malloc allocates a block of memory from the heap. The number of bytes allocated is specified by its single argument. Its return type is a pointer to void. If memory is not available, NULL is returned. The function does not clear or otherwise modify the memory, thus the contents of memory should be treated as if it contained garbage. The function’s prototype follows:
void* malloc(size_t);
The function possesses a single argument of type size_t. This type is discussed in Chapter 1. You need to be careful when passing variables to this function, as problems can arise if the argument is a negative number. Because size_t is an unsigned type, a negative value is converted to a very large positive one. On some systems, a NULL value is returned if the argument is negative.
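The effect of a negative argument can be seen by performing the same conversion that occurs at the call. This is a small sketch; as_malloc_argument is a name made up here to show the conversion and is not a library function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the value malloc would receive if it
   were passed count. Negative values wrap around to very large sizes
   because size_t is unsigned. */
size_t as_malloc_argument(int count) {
    return (size_t) count;
}
```

For example, as_malloc_argument(-1) yields SIZE_MAX, the largest value a size_t can hold, which is a request that malloc cannot satisfy. Checking that a signed count is positive before the call avoids the problem.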
When malloc is used with an argument of zero, its behavior is implementation-specific. It may return a null pointer or it may return a pointer to a region with zero bytes allocated. If the malloc function is used with a NULL argument, then it will normally generate a warning and execute returning zero bytes.
The following shows a typical use of the malloc function:
int *pi = (int*) malloc(sizeof(int));
The following steps are performed when the malloc function is executed:
Memory is allocated from the heap
The memory is not modified or otherwise cleared
The first byte’s address is returned
Since the malloc function may return a NULL value if it is unable to allocate memory, it is a good practice to check for a NULL value before using the pointer as follows:
int *pi = (int*) malloc(sizeof(int));
if(pi != NULL) {
    // Pointer should be good
} else {
    // Bad pointer
}
Before the pointer to void was introduced to C, explicit casts were required with malloc to stop the generation of warnings when assignments were made between incompatible pointer types. Since a pointer to void can be assigned to any other pointer type, explicit casting is no longer required. Some developers consider explicit casts to be a good practice because:
They document the intention of the malloc function
They make the code compatible with C++ (or earlier C compilers), which require explicit casts
Using casts can hide a problem if you fail to include the header file for malloc. In versions of C before C99, an undeclared function is assumed to return an integer. Without a prototype for malloc, the compiler will complain when you try to assign that integer to a pointer, but an explicit cast can suppress the warning.
If you declare a pointer but fail to assign it the address of allocated memory before using it, the pointer will usually contain garbage, typically resulting in an invalid memory reference. Consider the following code sequence:
int *pi;
...
printf("%d\n", *pi);
The allocation of memory is shown in Figure 2-5. This issue is covered in more detail in Chapter 7.
When executed, this can result in a runtime exception. This type of problem is common with strings, as shown below:
char *name;
printf("Enter a name: ");
scanf("%s", name);
While it may seem like this would execute correctly, we are using memory referenced by name. However, this memory has not been allocated. This problem can be illustrated graphically by changing the variable, pi, in Figure 2-5 to name.
The malloc function allocates the number of bytes specified by its argument. You need to be careful when using the function to allocate the correct number of bytes. For example, if we want to allocate space for 10 doubles, then we need to allocate 80 bytes (assuming eight-byte doubles). This is achieved as shown below:
double *pd = (double*) malloc(NUMBER_OF_DOUBLES * sizeof(double));
Use the sizeof operator when specifying the number of bytes to allocate for data types whenever possible.
In the following example, an attempt is made to allocate memory for 10 doubles:
const int NUMBER_OF_DOUBLES = 10;
double *pd = (double*) malloc(NUMBER_OF_DOUBLES);
However, the code only allocated 10 bytes.
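Wrapping the size calculation in a small function is one way to make this mistake harder to repeat. The sketch below is an illustration only; alloc_doubles is a hypothetical name, and the overflow check is an addition that the text does not discuss.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates room for count doubles. Returns NULL
   if count is zero, if the byte count would not fit in a size_t, or if
   malloc fails. The caller frees the result. */
double *alloc_doubles(size_t count) {
    if (count == 0 || count > SIZE_MAX / sizeof(double)) {
        return NULL;
    }
    /* Element count multiplied by element size gives the byte count */
    return (double*) malloc(count * sizeof(double));
}
```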
There is no standard way to determine the total amount of memory allocated by the heap. However, some compilers provide extensions for this purpose. In addition, there is no standard way of determining the size of a memory block allocated by the heap manager.
For example, if we allocate 64 bytes for a string, the heap manager will allocate additional memory to manage this block. The total amount of memory consumed is the sum of the 64 bytes requested and the amount used by the heap manager. This was illustrated in Figure 2-2.
The maximum size that can be allocated with malloc is system dependent. It would seem like this size should be limited by size_t. However, limitations can be imposed by the amount of physical memory present and other operating system constraints.
When malloc executes, it is supposed to allocate the amount of memory requested and then return the memory’s address. What happens if the underlying operating system uses “lazy initialization” where it does not actually allocate the memory until it is accessed? A problem can arise at this point if there is not enough memory available to allocate. The answer depends on the runtime and operating systems. A typical developer normally would not need to deal with this question, although some systems, such as Linux with memory overcommit enabled, do defer allocation in this way.
You cannot use a function call when initializing a static or global variable. In the following code sequence, we declare a static variable and then attempt to initialize it using malloc:
static int *pi = malloc(sizeof(int));
This will generate a compile-time error message. The same thing happens with global variables but can be avoided for static variables by using a separate statement to allocate memory to the variable as follows. We cannot use a separate assignment statement with global variables because global variables are declared outside of a function and executable code, such as the assignment statement, must be inside of a function:
static int *pi;
pi = malloc(sizeof(int));
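When the static variable lives inside a function, the separate assignment is often guarded so that the allocation happens only once. The following is a minimal sketch of that arrangement; shared_counter is a hypothetical function used only for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the same heap-allocated int on every
   call, allocating it on the first call. Returns NULL if the
   allocation fails. */
int *shared_counter(void) {
    static int *pi = NULL;   /* A constant initializer is permitted */
    if (pi == NULL) {
        pi = (int*) malloc(sizeof(int));   /* Executed at runtime */
        if (pi != NULL) {
            *pi = 0;
        }
    }
    return pi;
}
```

The block allocated here is intended to last until the program ends, so the sketch never frees it.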
The calloc function will allocate and clear memory at the same time. Its prototype follows:
void *calloc(size_t numElements, size_t elementSize);
To clear memory means its contents are set to all binary zeros. The function will allocate memory determined by the product of the numElements and elementSize parameters. A pointer is returned to the first byte of memory. If the function is unable to allocate memory, NULL is returned. Originally, this function was used to aid in the allocation of memory for arrays.
If either numElements or elementSize is zero, then a null pointer may be returned. If calloc is unable to allocate memory, a null pointer is returned and the global variable, errno, is set to ENOMEM (out of memory). This is a POSIX error code and may not be available on all systems.
Consider the following example where pi is allocated a total of 20 bytes, all containing zeros:
int *pi = calloc(5, sizeof(int));
Instead of using calloc, the malloc function along with the memset function can be used to achieve the same results, as shown below:
int *pi = malloc(5 * sizeof(int));
memset(pi, 0, 5 * sizeof(int));
The memset function will fill a block with a value. The first argument is a pointer to the buffer to fill. The second is the value used to fill the buffer, and the last argument is the number of bytes to be set.
Use calloc when memory needs to be zeroed out. However, the execution of calloc may take longer than using malloc.
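The equivalence of the two approaches can be checked by comparing the resulting blocks byte by byte. The sketch below does this with memcmp; the function zeroed_blocks_match is a hypothetical name introduced for the comparison.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocates count ints with calloc and again with
   malloc plus memset, then compares the two blocks. Returns 1 if they
   are identical, 0 if they differ, or -1 if an allocation fails. */
int zeroed_blocks_match(size_t count) {
    int *a = (int*) calloc(count, sizeof(int));
    int *b = (int*) malloc(count * sizeof(int));
    if (a == NULL || b == NULL) {
        free(a);   /* free accepts a null pointer */
        free(b);
        return -1;
    }
    memset(b, 0, count * sizeof(int));
    int same = (memcmp(a, b, count * sizeof(int)) == 0);
    free(a);
    free(b);
    return same;
}
```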
Periodically, it may be necessary to increase or decrease the amount of memory allocated to a pointer. This is particularly useful when a variable size array is needed, as will be demonstrated in Chapter 4. The realloc function will reallocate memory. Its prototype follows:
void *realloc(void *ptr, size_t size);
The function realloc takes two arguments. The first is a pointer to the original block, and the second is the requested size. The reallocated block’s size will normally be different from the size of the block referenced by the first argument. The return value is a pointer to the reallocated memory.
The requested size may be smaller or larger than the currently allocated amount. If the size is less than what is currently allocated, then the excess memory is returned to the heap. There is no guarantee that the excess memory will be cleared. If the size is greater than what is currently allocated, then if possible, the memory will be allocated from the region immediately following the current allocation. Otherwise, memory is allocated from a different region of the heap and the old memory is copied to the new region.
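If the size is zero and the pointer is not null, then the block it references has traditionally been freed, although newer C standards no longer guarantee this (the behavior is implementation-defined in C17 and undefined in C23). If space cannot be allocated, then the original block of memory is retained and is not changed. However, the pointer returned is a null pointer and, on POSIX systems, errno is set to ENOMEM.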
The function’s behavior is summarized in Table 2-2.
First Parameter | Second Parameter | Behavior |
null | NA | Same as malloc |
Not null | 0 | Original block is freed |
Not null | Less than the original block’s size | A smaller block is allocated using the current block |
Not null | Larger than the original block’s size | A larger block is allocated either from the current location or another region of the heap |
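Because realloc returns a null pointer on failure while leaving the original block intact, its result is usually stored in a temporary pointer first. The following is a minimal sketch of that pattern; resize_block is a hypothetical helper, and rejecting a size of zero is a choice made here to sidestep the zero-size case.

```c
#include <stdlib.h>

/* Hypothetical helper: resizes the block referenced by *pblock to
   new_size bytes. On success, *pblock is updated and 0 is returned.
   On failure, *pblock still references the original block and -1 is
   returned. */
int resize_block(char **pblock, size_t new_size) {
    if (new_size == 0) {
        return -1;   /* Avoid the zero-size case entirely */
    }
    char *tmp = (char*) realloc(*pblock, new_size);
    if (tmp == NULL) {
        return -1;   /* Original block is retained and unchanged */
    }
    *pblock = tmp;   /* May or may not be the same address as before */
    return 0;
}
```

Assigning the result of realloc directly to the original pointer would overwrite the only copy of the block’s address with a null pointer if the call failed, creating a memory leak.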
In the following example, we use two variables to allocate memory for a string. Initially, we allocate 16 bytes but only use the first 13 bytes (12 hexadecimal digits and the null termination character (0)):
char *string1;
char *string2;
string1 = (char*) malloc(16);
strcpy(string1, "0123456789AB");
Next, we use the realloc function to specify a smaller region of memory. The address and contents of these two variables are then displayed:
string2 = realloc(string1, 8);
printf("string1 Value: %p [%s]\n", string1, string1);
printf("string2 Value: %p [%s]\n", string2, string2);
The output follows:
string1 Value: 0x500 [0123456789AB]
string2 Value: 0x500 [0123456789AB]
The allocation of memory is illustrated in Figure 2-6.
The heap manager was able to reuse the original block, and it did not modify its contents. However, the program continued to use more than the eight bytes requested. That is, we did not change the string to fit into the eight-byte block. In this example, we should have adjusted the length of the string so that it fits into the eight reallocated bytes. The simplest way of doing this is to assign a NUL character to address 507. Using more space than allocated is not a good practice and should be avoided, as detailed in Chapter 7.
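The adjustment described above can be sketched as follows. The helper shrink_string is hypothetical, and it assumes that new_size is greater than zero and no larger than the block’s current size.

```c
#include <stdlib.h>

/* Hypothetical helper: truncates the string in *pstr so that it fits
   in new_size bytes, then shrinks the block to that size. Returns 0 on
   success or -1 on failure. Assumes 0 < new_size <= current size. */
int shrink_string(char **pstr, size_t new_size) {
    if (new_size == 0) {
        return -1;
    }
    /* Terminate the string in the last byte that will be kept. With a
       block at address 500 and new_size 8, this is address 507. */
    (*pstr)[new_size - 1] = '\0';
    char *tmp = (char*) realloc(*pstr, new_size);
    if (tmp == NULL) {
        return -1;   /* Original block remains valid, though truncated */
    }
    *pstr = tmp;
    return 0;
}
```

Applied to the string "0123456789AB" with a new size of 8, the block is left holding the seven characters "0123456" followed by the NUL character.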
In this next example, we will reallocate additional memory:
string1 = (char*) malloc(16);
strcpy(string1, "0123456789AB");
string2 = realloc(string1, 64);
printf("string1 Value: %p [%s]\n", string1, string1);
printf("string2 Value: %p [%s]\n", string2, string2);
When executed, you may get results similar to the following:
string1 Value: 0x500 [0123456789AB]
string2 Value: 0x600 [0123456789AB]
In this example, realloc had to allocate a new block of memory. Figure 2-7 illustrates the allocation of memory.
The alloca function (Microsoft’s _malloca) allocates memory by placing it in the stack frame for the function. When the function returns, the memory is automatically freed. This function can be difficult to implement if the underlying runtime system is not stack-based. As a result, this function is nonstandard and should be avoided if the application needs to be portable.
In C99, Variable Length Arrays (VLAs) were introduced, allowing the declaration and creation of an array within a function whose size is based on a variable. In the following example, an array of char is allocated for use in a function:
void compute(int size) {
    char buffer[size];
    ...
}
This means the allocation of memory is done at runtime and memory is allocated as part of the stack frame. Also, when the sizeof operator is used with the array, it will be executed at runtime rather than compile time.
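The runtime evaluation of sizeof can be observed with a function that reports the size of its own VLA. This is a small sketch; vla_bytes is a hypothetical name, and the code assumes that size is greater than zero and that the compiler supports VLAs, which became an optional feature in C11.

```c
#include <stddef.h>

/* Hypothetical helper: reports the number of bytes occupied by a VLA
   of size chars. Assumes size > 0 and a compiler with VLA support. */
size_t vla_bytes(int size) {
    char buffer[size];       /* Size is determined when this executes */
    return sizeof(buffer);   /* Evaluated at runtime for a VLA */
}
```

Since sizeof(char) is 1, the value returned equals the size passed in, which differs from call to call.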
A small runtime penalty will be imposed. Also, when the function exits, the memory is effectively deallocated. Since we did not use a malloc type function to create it, we should not use the free function to deallocate it. The function should not return a pointer to this memory either. This issue is addressed in Chapter 5.
VLAs do not change size. Their size is fixed once they are allocated. If you need an array whose size actually changes, then an approach such as using the realloc function, as discussed in the section Using the realloc Function, is needed.
With dynamic memory allocation, the programmer is able to return memory when it is no longer being used, thus freeing it up for other uses. This is normally performed using the free function, whose prototype is shown below:
void free(void *ptr);
The pointer argument should contain the address of memory allocated by a malloc type function. This memory is returned to the heap. While the pointer may still point to the region, always assume it points to garbage. This region may be reallocated later and populated with different data.
In the simple example below, pi is allocated memory and is eventually freed:
int *pi = (int*) malloc(sizeof(int));
...
free(pi);
Figure 2-8 illustrates the allocation of memory immediately before and right after the free function executes. The dashed box at address 500 indicates that the memory has been freed but may still contain its value. The variable pi still contains the address 500. This is called a dangling pointer and is discussed in detail in the section Dangling Pointers.
If the free function is passed a null pointer, then it does nothing. If the pointer passed was not allocated by a malloc type function, then the function’s behavior is undefined. In the following example, pi is assigned the address of num. However, this is not a valid heap address:
int num;
int *pi = &num;
free(pi);    // Undefined behavior
Manage memory allocation/deallocation at the same level. For example, if a pointer is allocated within a function, deallocate it in the same function.
Pointers can cause problems even after they have been freed. If we try to dereference a freed pointer, its behavior is undefined. As a result, some programmers will explicitly assign NULL to a pointer to designate the pointer as invalid. Subsequent attempts to dereference such a pointer will typically result in a runtime exception.
An example of this approach follows:
int *pi = (int*) malloc(sizeof(int));
...
free(pi);
pi = NULL;
The allocation of memory is illustrated in Figure 2-9.
This technique attempts to address problems like dangling pointers. However, it is better to spend time addressing the conditions that caused the problems rather than crudely catching them with a null pointer. In addition, you cannot assign NULL to a constant pointer except when it is initialized.
The term double free refers to an attempt to free a block of memory twice. A simple example follows:
int *pi = (int*) malloc(sizeof(int));
*pi = 5;
free(pi);
...
free(pi);
The execution of the second free function results in undefined behavior and will typically cause a runtime exception. A less obvious example involves the use of two pointers, both pointing to the same block of memory. As shown below, the same runtime exception will result when we accidentally try to free the same memory a second time:
int *p1 = (int*) malloc(sizeof(int));
int *p2 = p1;
free(p1);
...
free(p2);
This allocation of memory is illustrated in Figure 2-10.
When two pointers reference the same location, it is referred to as aliasing. This concept is discussed in Chapter 8.
Unfortunately, heap managers have a difficult time determining whether a block has already been deallocated. Thus, they don’t attempt to detect the same memory being freed twice. This normally results in a corrupt heap and program termination. Even if the program does not terminate, it represents questionable program logic. There is no reason to free the same memory twice.
It has been suggested that the free function should assign a NULL or some other special value to its argument when it returns. However, since pointers are passed by value, the free function is unable to explicitly assign NULL to the pointer. This is explained in more detail in the section Passing a Pointer to a Pointer.
The heap typically uses operating system functions to manage its memory. The heap’s size may be fixed when the program is created, or it may be allowed to grow. However, the heap manager does not necessarily return memory to the operating system when the free function is called. The deallocated memory is simply made available for subsequent use by the application. Thus, when a program allocates and then frees up memory, the deallocation of memory is not normally reflected in the application’s memory usage as seen from the operating system perspective.
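Since free itself cannot modify the caller’s pointer, one workaround is a macro that frees the memory and then clears the variable named in its argument. The sketch below shows this idea; FREE_AND_NULL is a name invented here, and it is a different technique from the pointer-to-pointer approach referred to above.

```c
#include <stdlib.h>

/* Hypothetical macro: frees the block referenced by p and then assigns
   NULL to p itself. The argument must be a modifiable pointer variable
   and should not have side effects, since it is evaluated twice. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)
```

Because a macro expands in the caller’s scope, the assignment affects the caller’s variable. Using the macro a second time on the same variable is harmless, as free does nothing when passed a null pointer. It does not help when other copies of the pointer exist.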
The operating system is responsible for maintaining the resources of an application, including its memory. When an application terminates, it is the operating system’s responsibility to reallocate this memory for other applications. The state of the terminated application’s memory, corrupted or uncorrupted, is not an issue. In fact, one of the reasons an application may terminate is because its memory is corrupted. With an abnormal program termination, cleanup may not be possible. Thus, there is no reason to free allocated memory before the application terminates.
With this said, there may be other reasons why this memory should be freed. The conscientious programmer may want to free memory as a quality issue. It is always a good habit to free memory after it is no longer needed, even if the application is terminating. If you use a tool to detect memory leaks or similar problems, then deallocating memory will clean up the output of such tools. In some less complex operating systems, the operating system may not reclaim memory automatically, and it may be the program’s responsibility to reclaim memory before terminating. Also, a later version of the application could add code toward the end of the program. If the previous memory has not been freed, problems could arise.
Thus, ensuring that all memory is freed before program termination:
May be more trouble than it’s worth
Can be time consuming and complicated for the deallocation of complex structures
Can add to the application’s size
Results in longer running time
Introduces the opportunity for more programming errors
Whether memory should be deallocated prior to program termination is application-specific.
If a pointer still references the original memory after it has been freed, it is called a dangling pointer. The pointer does not point to a valid object. This is sometimes referred to as a premature free.
The use of dangling pointers can result in a number of different types of problems, including:
Unpredictable behavior if the memory is accessed
Segmentation faults when the memory is no longer accessible
Potential security risks
These types of problems can result when:
Memory is accessed after it has been freed
A pointer is returned to an automatic variable in a previous function call (discussed in the section Pointers to Local Data)
Below is a simple example where we allocate memory for an integer using the malloc function. Next, the memory is released using the free function:
int *pi = (int*) malloc(sizeof(int));
*pi = 5;
printf("*pi: %d\n", *pi);
free(pi);
The variable pi will still hold the integer’s address. However, this memory may be reused by the heap manager and may hold data other than an integer. Figure 2-11 illustrates the program’s state immediately before and after the free function is executed. The pi variable is assumed to be part of the main function and is located at address 100. The memory allocated using malloc is found at address 500.
When the free function is executed, the memory at address 500 has been deallocated and should not be used. However, most runtime systems will not prevent subsequent access or modification. We may still attempt to write to the location as shown below. The result of this action is unpredictable.
free(pi);
*pi = 10;
A more insidious example occurs when more than one pointer references the same area of memory and one of them is freed. As shown below, p1 and p2 both refer to the same area of memory, which is called pointer aliasing. However, p1 is freed:
int *p1 = (int*) malloc(sizeof(int));
*p1 = 5;
...
int *p2;
p2 = p1;
...
free(p1);
...
*p2 = 10;    // Dangling pointer
Figure 2-12 illustrates the allocation of memory where the dotted box represents freed memory.
A subtle problem can occur when using block statements, as shown below. Here pi is assigned the address of tmp. The variable pi may be a global variable or a local variable. However, when tmp’s enclosing block is popped off of the program stack, the address is no longer valid:
int *pi;
...
{
    int tmp = 5;
    pi = &tmp;
}
// pi is now a dangling pointer
foo();
Most compilers will treat a block statement as a stack frame. The variable tmp was allocated on the stack frame and subsequently popped off the stack when the block statement was exited. The pointer pi is now left pointing to a region of memory that may eventually be overwritten by a different activation record, such as that of the function foo. This condition is illustrated in Figure 2-13.
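One way to avoid this kind of dangling pointer is to copy the value out while the block's variable is still alive, rather than keeping its address. The following is a minimal sketch of that idea; the function name valueFromBlock is hypothetical and does not come from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the value of a block-local variable
   without letting its address escape the block. */
int valueFromBlock(void) {
    int saved = 0;
    int *pi = NULL;
    {
        int tmp = 5;
        pi = &tmp;
        saved = *pi;   /* copy the value while tmp is still alive */
        pi = NULL;     /* drop the address before the block ends */
    }
    assert(pi == NULL);  /* no pointer to tmp outlives the block */
    return saved;
}
```

Calling valueFromBlock() yields the value that was stored in tmp, and no pointer to tmp remains once the block has been exited.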
Pointer-induced problems can be difficult to debug. Several approaches exist for dealing with dangling pointers, including:
Setting a pointer to NULL after freeing it. A subsequent dereference will typically terminate the application. However, problems can still persist if multiple copies of the pointer exist. This is because the assignment will only affect one of the copies, as illustrated in the section Double Free.
Writing special functions to replace the free function (see Writing your own free function).
Relying on a runtime or debugger that overwrites data when it is freed with a recognizable pattern, such as 0xDEADBEEF. Visual Studio uses 0xCC, 0xCD, or 0xDD, depending on what is freed. While no exceptions are thrown, a programmer who sees these values in memory where they are not expected knows that the program may be accessing freed memory.
Using third-party tools to detect dangling pointers and other problems.
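The first approach, and its limitation with multiple copies of a pointer, can be sketched as follows. The helper releaseInt and the function nullAfterFreeDemo are hypothetical names used only for illustration; this is not the replacement free function the text refers to.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees the integer and sets the caller's pointer to NULL. */
static void releaseInt(int **pp) {
    if (pp != NULL && *pp != NULL) {
        free(*pp);
        *pp = NULL;
    }
}

/* Returns 1 when both copies of the pointer end up NULL, -1 if malloc fails. */
int nullAfterFreeDemo(void) {
    int *p1 = malloc(sizeof *p1);
    if (p1 == NULL) {
        return -1;
    }
    *p1 = 5;
    int *p2 = p1;        /* second copy of the same address (an alias) */

    releaseInt(&p1);     /* frees the memory, but only p1 is set to NULL */
    assert(p1 == NULL);
    p2 = NULL;           /* the alias is not touched by releaseInt and must be cleared by hand */

    releaseInt(&p1);     /* a second call is harmless because p1 is already NULL */
    return (p1 == NULL && p2 == NULL) ? 1 : 0;
}
```

The helper takes the address of the pointer so that it can modify the caller's variable. It has no way of finding p2, which is why every copy still needs to be cleared explicitly.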
Displaying pointer values can be helpful in debugging dangling pointers, but you need to be careful how they are displayed. We have already discussed how to display pointer values in Displaying Pointer Values. Make sure you display them consistently to avoid confusion when comparing pointer values. The assert macro can also be useful, as demonstrated in Dealing with Uninitialized Pointers.
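As a small sketch of displaying pointers consistently, the code below routes every pointer through a single formatting helper that uses the %p specifier with a cast to void*. The names formatPointer and aliasesPrintAlike are assumptions made for this example; the exact text produced by %p is implementation-defined.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats any object pointer the same way every time. */
static void formatPointer(char *buffer, size_t size, const void *ptr) {
    snprintf(buffer, size, "%p", (void *)ptr);   /* %p expects a void pointer */
}

/* Returns 1 when two aliases produce identical text, 0 otherwise. */
int aliasesPrintAlike(void) {
    int value = 5;
    int *p1 = &value;
    int *p2 = p1;
    char text1[64];
    char text2[64];

    formatPointer(text1, sizeof text1, p1);
    formatPointer(text2, sizeof text2, p2);

    assert(p1 == p2);    /* assert documents the expectation that both refer to value */
    return strcmp(text1, text2) == 0;
}
```

Using one helper for every pointer avoids mixing formats such as %p, %x, and %o, which would make equal addresses look different in the output.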
Microsoft provides techniques for addressing overwriting of dynamically allocated memory and memory leaks. This approach uses special memory management techniques in debug versions of a program to:
Check the heap’s integrity
Check for memory leaks
Simulate low heap memory situations
Microsoft does this by using a special data structure to manage memory allocation. This structure maintains debug information, such as the filename and line number where malloc is called. In addition, buffers are allocated before and after the actual memory allocation to detect overwriting of the actual memory. More information about this technique can be found at Microsoft Developer Network.
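The general idea of recording the call site and surrounding an allocation with guard buffers can be sketched in portable C. This is not Microsoft's implementation. The names DebugHeader, debugMalloc, debugCheck, debugFree, and DEBUG_MALLOC, the fill value, and the memory layout are all assumptions made for illustration, and the sketch assumes a C11 compiler for max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTE 0xFD                  /* arbitrary fill pattern for this sketch */
#define GUARD_SIZE sizeof(max_align_t)   /* keeps the user block suitably aligned */

/* Hypothetical bookkeeping stored in front of every allocation. */
typedef union {
    struct {
        const char *file;   /* source file of the request */
        int line;           /* source line of the request */
        size_t size;        /* bytes requested by the caller */
    } info;
    max_align_t align;      /* pads the header to the strictest alignment */
} DebugHeader;

/* Layout of one allocation: [header][guard][user memory][guard] */
static void *debugMalloc(size_t size, const char *file, int line) {
    unsigned char *raw = malloc(sizeof(DebugHeader) + 2 * GUARD_SIZE + size);
    if (raw == NULL) {
        return NULL;
    }
    DebugHeader *header = (DebugHeader *)raw;
    header->info.file = file;
    header->info.line = line;
    header->info.size = size;
    unsigned char *user = raw + sizeof(DebugHeader) + GUARD_SIZE;
    memset(user - GUARD_SIZE, GUARD_BYTE, GUARD_SIZE);   /* front guard */
    memset(user + size, GUARD_BYTE, GUARD_SIZE);         /* back guard */
    return user;
}

#define DEBUG_MALLOC(size) debugMalloc((size), __FILE__, __LINE__)

/* Returns 1 if both guard regions are intact, 0 if either was overwritten. */
static int debugCheck(const void *ptr) {
    const unsigned char *user = ptr;
    const DebugHeader *header =
        (const DebugHeader *)(user - GUARD_SIZE - sizeof(DebugHeader));
    const unsigned char *front = user - GUARD_SIZE;
    const unsigned char *back = user + header->info.size;
    for (size_t i = 0; i < GUARD_SIZE; i++) {
        if (front[i] != GUARD_BYTE || back[i] != GUARD_BYTE) {
            return 0;
        }
    }
    return 1;
}

static void debugFree(void *ptr) {
    if (ptr != NULL) {
        free((unsigned char *)ptr - GUARD_SIZE - sizeof(DebugHeader));
    }
}

/* Returns 1 when the overrun is detected, 0 otherwise, -1 if allocation fails. */
int guardDemo(void) {
    char *buffer = DEBUG_MALLOC(4);
    if (buffer == NULL) {
        return -1;
    }
    memcpy(buffer, "abc", 4);             /* fits exactly, so the guards stay intact */
    int intactBefore = debugCheck(buffer);
    buffer[4] = 'X';                      /* one byte too far: lands in the back guard */
    int intactAfter = debugCheck(buffer);
    debugFree(buffer);
    assert(intactBefore == 1);
    return (intactBefore == 1 && intactAfter == 0) ? 1 : 0;
}
```

Because the overrun lands in memory the allocator itself reserved, the sketch can detect it without corrupting the real heap. A production debug heap would also report the recorded file and line when a check fails.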
The Mudflap Libraries provide a similar capability for the GCC compiler. Its runtime library supports the detection of memory leaks, among other things. This detection is accomplished by instrumenting the pointer dereferencing operations.
So far, we have talked about the heap manager’s allocating and deallocating memory. However, the implementation of this technology can vary by compiler. Most heap managers use a heap or data segment as the source for memory. However, this approach is subject to fragmentation and may collide with the program stack. Nevertheless, it is the most common way of implementing the heap.
Heap managers need to address many issues, such as whether heaps are allocated on a per process and/or per thread basis and how to protect the heap from security breaches.
There are a number of heap managers, including OpenBSD’s malloc, Hoard’s malloc, and TCMalloc developed by Google. The GNU C library allocator is based on the general-purpose allocator dlmalloc. It provides facilities for debugging and can help in tracking memory leaks. The dlmalloc logging feature tracks memory usage and memory transactions, among other actions.
A manual technique for managing the memory used for structures is presented in Avoiding malloc/free Overhead.
The malloc and free functions provide a way of manually allocating and deallocating memory. However, there are numerous issues regarding the use of manual memory management in C, such as performance, achieving good locality of reference, threading problems, and cleaning up memory gracefully.
Several nonstandard techniques can be used to address some of these issues, and this section explores some of them. A key feature of these techniques is the automatic deallocation of memory. When memory is no longer needed, it is collected and made available for use later in the program. The deallocated memory is referred to as garbage. Hence, the term garbage collection denotes the processing of this memory.
Garbage collection is useful for a number of reasons, including:
Freeing the programmer from having to decide when to deallocate memory
Allowing the programmer to focus on the application’s problem
One alternative to manual memory management is the Boehm-Weiser Collector. However, this is not part of the language.
Resource Acquisition Is Initialization (RAII) is a technique invented by Bjarne Stroustrup. It addresses the allocation and deallocation of resources in C++. The technique is useful for guaranteeing the allocation and subsequent deallocation of a resource in the presence of exceptions. Allocated resources will eventually be released.
There have been several approaches for using RAII in C. The GNU compiler provides a nonstandard extension to support this. We will illustrate this extension by showing how memory can be allocated and then freed within a function. When the variable goes out of scope, the deallocation process occurs automatically.
The example uses a macro called RAII_VARIABLE, which is built on GCC’s cleanup variable attribute. It declares a variable and associates the following with it:
A type
A function to execute when the variable is created
A function to execute when the variable goes out of scope
The macro is shown below:
#define RAII_VARIABLE(vartype,varname,initval,dtor) \
    void _dtor_ ## varname (vartype * v) { dtor(*v); } \
    vartype varname __attribute__((cleanup(_dtor_ ## varname))) = (initval)
In the following example, we declare a variable called name as a pointer to char. When it is created, the malloc function is executed, allocating 32 bytes to it. When the function is terminated, name goes out of scope and the free function is executed:
void raiiExample() {
    RAII_VARIABLE(char*, name, (char*)malloc(32), free);
    strcpy(name, "RAII Example");
    printf("%s\n", name);
}
When this function is executed, the string “RAII Example” will be displayed.
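The macro above defines its destructor wrapper inside the function that uses it, which relies on GCC's nested function extension. A variation that avoids nested functions is to apply the cleanup attribute directly and place the cleanup function at file scope. The sketch below assumes a compiler that supports the cleanup attribute, such as GCC or Clang; the names freeCharPointer, cleanupCalls, and cleanupDemo are hypothetical, and the counter exists only to make the cleanup observable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int cleanupCalls = 0;   /* hypothetical counter, only for illustration */

/* The cleanup function receives a pointer to the variable going out of scope. */
static void freeCharPointer(char **p) {
    free(*p);                  /* free(NULL) is a no-op, so a failed malloc is safe */
    cleanupCalls++;
}

/* Returns the number of cleanup calls observed after the inner block ends. */
int cleanupDemo(void) {
    size_t length = 0;
    cleanupCalls = 0;
    {
        char *name __attribute__((cleanup(freeCharPointer))) = malloc(32);
        if (name != NULL) {
            strcpy(name, "RAII Example");
            length = strlen(name);
        }
    }   /* name goes out of scope here and freeCharPointer(&name) runs */
    assert(length == 0 || length == 12);
    return cleanupCalls;
}
```

The cleanup function runs once when the block is exited, whether or not the allocation succeeded.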
Similar results can be achieved without using the GNU extension.
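One common way to get a comparable guarantee in standard C is to route every exit path through a single cleanup label. The following is a minimal sketch of that idiom, not a technique described in the text; the function copyLabel and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical function: copies a label into out.
   Returns 0 on success and -1 on failure. */
int copyLabel(char *out, size_t outSize) {
    assert(out != NULL);
    int status = -1;
    char *name = malloc(32);
    if (name == NULL) {
        goto cleanup;            /* allocation failed */
    }
    strcpy(name, "RAII Example");
    if (strlen(name) + 1 > outSize) {
        goto cleanup;            /* destination too small */
    }
    strcpy(out, name);
    status = 0;

cleanup:
    free(name);                  /* runs on every path; free(NULL) is a no-op */
    return status;
}
```

Unlike the GNU extension, nothing here is automatic: the guarantee holds only as long as every early exit uses the cleanup label instead of returning directly.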
Another approach to deal with the deallocation of memory is to use exception handling. While exception handling is not a standard part of C, it can be useful if available and possible portability issues are not a concern. The following illustrates the approach using the Microsoft Visual Studio version of the C language.
Here the try block encloses any statements that might cause an exception to be thrown at runtime. The finally block will be executed regardless of whether an exception is thrown. The free function is guaranteed to be executed.
void exceptionExample() {
    int *pi = NULL;
    __try {
        pi = (int*)malloc(sizeof(int));
        *pi = 5;
        printf("%d\n", *pi);
    }
    __finally {
        free(pi);
    }
}
You can implement exception handling in C using several other approaches.
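One such approach uses the standard setjmp and longjmp functions to imitate the try and finally structure. The sketch below is an illustration under simplifying assumptions: it uses a single global jump buffer, so it is neither reentrant nor thread safe, and the names handler, mayFail, and setjmpExample are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf handler;   /* hypothetical single handler for this sketch */

/* Hypothetical operation that "throws" by jumping back to the handler. */
static void mayFail(int fail) {
    if (fail) {
        longjmp(handler, 1);
    }
}

/* Returns 0 when no error was raised and 1 when one was.
   The memory is freed on both paths. */
int setjmpExample(int fail) {
    int *volatile pi = NULL;   /* volatile: modified between setjmp and longjmp */
    int status = 0;

    if (setjmp(handler) == 0) {          /* plays the role of the try block */
        pi = malloc(sizeof *pi);
        if (pi != NULL) {
            *pi = 5;
            mayFail(fail);
        }
    } else {                             /* reached only through longjmp */
        status = 1;
    }

    assert(status == 0 || pi != NULL);   /* an error can only follow a successful malloc */
    free(pi);                            /* plays the role of the finally block */
    return status;
}
```

The pointer is declared volatile because the value of a non-volatile local that changes between the setjmp call and the longjmp is indeterminate afterward.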
Dynamic memory allocation is a significant C language feature. In this chapter, we focused on the manual allocation of memory using the malloc and free functions. We addressed a number of common problems involving these functions, including the failure to allocate memory and dangling pointers.
There are other nonstandard techniques for managing dynamic memory in C. We touched on a few of these, including garbage collection, RAII, and exception handling.