V. Sarcar, Simple and Efficient Programming with C#, https://doi.org/10.1007/978-1-4842-8737-8_12

12. Memory Management

Vaskaran Sarcar, Kolkata, West Bengal, India

Memory management is an important concern for a developer, and it is a very big topic. This chapter will touch on the important points in a simplified manner to help you understand memory management in programming.

Following some design guidelines while making an application is not enough; that is only one part of the equation. An application is truly efficient only when it has no memory leaks. If a program runs for a long time but fails to release memory resources that are no longer needed, you start to see the impact of the leaks. Here are some common symptoms:

A machine becomes slow over time.
A specific operation in an application takes longer to execute.
The worst case is that an application/system can crash.

But before we discuss memory leaks, it’ll be helpful if you can clarify your understanding of memory allocations and deallocations. A novice C# programmer often believes that the garbage collector (GC) can take care of memory management in every possible scenario. This is not true, and unfortunately, it is a common mistake. This chapter will cover this and also help you understand the cause of memory leaks, which we’ll discuss in Chapter 13.

Overview

In a programming language like C++, you deallocate the memory once the intended job is completed to avoid memory leaks. But .NET always tries to make your programming life easier. It has a garbage collector that clears the objects that do not have any use after a particular point. In programming, they are called dirty objects or unreferenced objects.

How does the garbage collector clear the dirty objects? In C#, the heap memory is managed. This means the CLR takes care of this responsibility. In managed code, the CLR's garbage collector does this job for you, and you do not have to deallocate the managed memory. It removes the unused objects on the heap and reclaims the memory for further use. The garbage collector keeps track of the dirty objects for you, and the .NET runtime invokes it from time to time (the triggering conditions are listed later in this chapter) to remove unreferenced or dirty objects from memory. Depending on the configuration, part of this work can run on a dedicated background thread. At a given point in time, if an object has no reference, the garbage collector treats it as no longer needed and reclaims the memory it occupies.

Note

In theory, an object referenced only by a local variable becomes eligible for garbage collection at the earliest point at which the variable is no longer used. If you disable optimizations (as in a typical debug build), the lifetime of the object extends to the end of the block. Even then, garbage collection may not reclaim the memory immediately. Various factors affect this, such as available memory and the time since the last collection. This means an orphaned object can be released immediately, or after a delay that varies.

However, there is a catch. Some objects require special code to release resources. Here are some common examples: you open a file, perform some reading or writing, but forget to close the file. A similar kind of attention is needed when you deal with unmanaged objects, locking mechanisms, the operating system (OS) handles in your programs, and so forth. Programmers explicitly need to release those resources. These are the cases where you need to put in special attention to prevent memory leaks. In general, when programmers themselves clean up (or release) the memory, you say that they dispose of the objects, but when CLR automatically releases the resources, you say that the garbage collector performs its job. The garbage collector uses the finalizers (or, destructors) of the class instance to perform the final cleanup. We’ll discuss them shortly.

POINTS TO REMEMBER

Programmers can release resources by explicitly disposing of the objects, or the CLR automatically releases resources through a garbage collection mechanism. We often refer to them as the disposing and finalizing techniques, respectively.
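As a minimal sketch of the disposing technique, the following program writes to a temporary file and then reads it back. The using statements call Dispose() when each block exits, which closes the underlying OS file handle even if an exception is thrown. The FileDemo helper and its method names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.IO;

string path = Path.GetTempFileName();
try
{
    FileDemo.WriteAndClose(path, "hello");
    // The writer was disposed, so the file can be reopened immediately.
    Console.WriteLine(FileDemo.ReadAndClose(path)); // prints "hello"
}
finally
{
    File.Delete(path);
}

// Hypothetical helper used only for this illustration.
static class FileDemo
{
    public static void WriteAndClose(string path, string text)
    {
        // 'using' calls Dispose() at the end of the block,
        // which flushes the data and closes the file handle.
        using (StreamWriter writer = new StreamWriter(path))
        {
            writer.Write(text);
        }
    }

    public static string ReadAndClose(string path)
    {
        using (StreamReader reader = new StreamReader(path))
        {
            return reader.ReadToEnd();
        }
    }
}
```

If you removed the using statement from WriteAndClose and never called Dispose() or Close(), the file handle would stay open until a finalizer eventually ran, which is exactly the kind of delay the disposing technique avoids.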

Stack Memory vs. Heap Memory

To understand the upcoming discussion, it’s important to understand the difference between stack memory and heap memory. If you know the difference, you can skip this section. Otherwise, continue reading.

To execute a program, the operating system gives you a pile of memory. The program splits this into several portions for various uses. There are two major parts; one is stack memory, and the other one is heap memory.
These two kinds of memories store different kinds of data.
For example, the stack is used for local variables and to keep track of the current state of the program. What are local variables? They are the variables that are declared in a method.
By contrast, the instance variables for reference types are stored on the heap. The static variables are stored on the heap too.
For the reference type variable, the variable itself will be stored on the stack, but the contents are stored on the heap.
For example, when you see the line A obA=new A();, you understand that the reference variable obA is stored on the stack, but the object/content is stored on the heap.
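One way to observe this difference from code is to compare how assignments behave. In the following sketch, assigning a struct copies the value itself, whereas assigning a class variable copies only the reference, so both variables point to the same object on the heap. The PointStruct and PointClass types are hypothetical.

```csharp
using System;

PointStruct s1 = new PointStruct { X = 1 };
PointStruct s2 = s1;   // copies the value itself
s2.X = 99;

PointClass c1 = new PointClass { X = 1 };
PointClass c2 = c1;    // copies only the reference; both refer to one heap object
c2.X = 99;

Console.WriteLine($"Struct: s1.X={s1.X}, s2.X={s2.X}"); // Struct: s1.X=1, s2.X=99
Console.WriteLine($"Class:  c1.X={c1.X}, c2.X={c2.X}"); // Class:  c1.X=99, c2.X=99

// Hypothetical types used only for this illustration.
struct PointStruct { public int X; }
class PointClass { public int X; }
```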

The stack follows the last in, first out (LIFO) mechanism. It works like a stack of frames, where one frame is placed on top of another frame. You can also think of it as a set of boxes, where one box is placed on top of another box. All local variables of a particular method can go into a single frame. At a particular moment, you can access the top frame of the stack, but you cannot access the lower frames.

Once the top frame is removed from a stack and discarded, the immediate lower frame can be accessed as it becomes the top frame. The process can continue until the stack is empty. But, in between, the stack size can further increase or decrease during the program execution.

But the most important point is that the stack-allocated memory blocks are discarded when a method finishes its execution.

To help you visualize this with simple diagrams, let’s consider the following code segment:

// The previous code skipped
public void SomeMethod()
{
    int a = 1;      // Line-1
    double b = 2.5; // Line-2
    int c = 3;      // Line-3
    // Some other code, if any
}

Figure 12-1 shows you four different stages in a single snapshot.

Assume that the control entered into the method called SomeMethod. The top three lines of this method have been executed, but it does not reach the end of the method body. You can see that the stack is growing in this stage in the first part of this diagram.
The next parts of the diagram show that the cleaning up of the stack is in progress. It is true that when the control leaves the method body, all the variables a, b, and c are deleted. But following the LIFO structure, I have shown you the intermediate deletions one by one.

Figure 12-1
The different statuses of the stack memory when a program runs

In short, for a stack allocation, you know that once you return from a method, the allocated frame is discarded, and you can use the space immediately.
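The LIFO behavior also applies to method calls: the most recently entered method is the first one to return. The following sketch records the order in which three nested methods are entered and left. The CallDemo class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

List<string> log = new List<string>();
CallDemo.First(log);
Console.WriteLine(string.Join(", ", log));
// enter First, enter Second, enter Third, exit Third, exit Second, exit First

// Hypothetical helper used only for this illustration.
static class CallDemo
{
    public static void First(List<string> log)
    {
        log.Add("enter First");
        Second(log);            // a new frame is placed on top of First's frame
        log.Add("exit First");
    }

    public static void Second(List<string> log)
    {
        log.Add("enter Second");
        Third(log);
        log.Add("exit Second");
    }

    public static void Third(List<string> log)
    {
        log.Add("enter Third");
        log.Add("exit Third");  // Third's frame is discarded first
    }
}
```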

On the other hand, heap memory is used for object/reference types. Here the tracking of a program state is not the concern. Instead, it focuses on storing the data. A program can easily allocate some space in the heap and start using the space to store the information.

Note

In Visual Studio, in debug mode, you can see the call stack and analyze the stack trace. In addition, once you learn multithreaded programming, you’ll see that each thread can have its own stack, but they share the same heap space among them.

For a heap, you can add or remove allocated space in any order. See Figure 12-2.

Figure 12-2
A sample figure that represents a heap memory with different allocations

In this case, you need to remember the allocation, and before you reuse the space, someone needs to clear the old allocation. But what happens if you forget to delete the space? Or what happens if you use an already created reference to point to a different object in the heap, but later you make it null? These kinds of allocated memory spaces will keep increasing (which becomes garbage), and you’ll see the impact of the memory leaks. This is the point where the garbage collector (GC) in C# helps you. Periodically, the GC checks the status and tries to help you by freeing unused spaces.
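To watch an object become garbage, you can track it through a WeakReference, which does not keep the object alive. In this sketch, the only strong reference is a local variable inside the hypothetical CreateAndDrop method, so nothing references the array once the method returns. The printed value is usually False after the forced collection, but the exact timing is up to the runtime, so treat the output as an observation and not a guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;

WeakReference weak = LeakDemo.CreateAndDrop();
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine($"Object still alive after collection: {weak.IsAlive}");

// Hypothetical helper used only for this illustration.
static class LeakDemo
{
    // NoInlining keeps the local variable inside this method's own frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateAndDrop()
    {
        byte[] data = new byte[1024];
        // The weak reference lets you observe the object without keeping it alive.
        return new WeakReference(data);
        // 'data' goes out of scope on return, so no strong reference remains.
    }
}
```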

Each time you create an object, the CLR allocates memory in the managed heap. It can keep allocating memory as long as address space is available in the managed heap. The GC has an optimizing engine to determine when to reclaim unused memory.

Q&A Session

12.1 What is a managed heap?

Answer:

The managed code is the code that is managed by a runtime, e.g., the Common Language Runtime (CLR). This CLR provides many services, and automatic memory management is one of them. When you initialize a process, the runtime reserves a contiguous address space for it. This reserved space is called the managed heap.

This managed heap has a pointer that points to the address where the next object will be allocated. As you can guess, the allocation of the first object starts at the managed heap's base address. The next object is allocated at the address that immediately follows the previous object. The runtime repeats this process as long as address space is available for use.

12.2 I have a solution in my mind. I can allocate memory on the heap, and once my job is done, I’ll delete it immediately. This way I can prevent the garbage from growing. Is my understanding correct?

Answer:

Yes, the proposed solution can work and help you prevent leaks. But this is not that easy. There are situations where the objects need to stay alive for a while. Consider an example: using an advanced printer, you simultaneously send multiple emails and faxes to different recipients. At the same time, you start printing some large documents. It is very unlikely that all the recipients receive the data at the same time or a document with a big number of pages is printed instantly. So, an immediate deletion is not a clever solution in these scenarios.

12.3 Let us assume there is a class, called Test. I understand that for the line Test testObj=new Test(); the space for the object will be allocated in the heap memory. But what about the reference variable?

Answer:

The reference variable will stay in the stack memory. See Figure 12-3.

Figure 12-3
An object reference on the stack points to the actual memory in the heap

12.4 In many discussions, people say that the struct is on a heap. But my understanding is that the content of a struct should be in the stack. Am I missing something?

Answer:

This is interesting. You have to understand the context. Instance variables of a value type are stored in the same context as the variable that declares the value type. So, a struct variable that is declared as a local variable in a method is typically stored on the stack (unless, for example, it is captured by a lambda expression), whereas a struct variable that is an instance field of a class is stored on the heap along with the containing object.

12.5 Sometimes I wonder about these references. Are they similar to the pointers in C/C++?

Answer:

The concept is similar, but not the same. Before I answer your question, let me explain something for a better understanding. I already mentioned that the GC manages the heap memory for you. How does it manage this stuff?

First, it frees up the garbage/unused space for you so that you can reuse it.
Second, it can apply the compaction technique, which means it can move all allocated space to one side of the memory and all the free space to the other side. This results in contiguous free space that helps you allocate a large block of memory.

The first point is important and covered in this chapter. The second point is also important because the heap may contain scattered objects (see Figure 12-2). In many situations, you may need a big chunk of contiguous memory that may not be available at a particular time, even though there is enough total free space in the heap. In these scenarios, compaction helps to get enough space. The references are maintained by the garbage collector, and when this kind of shuffling is done, you are not aware of it.

Note

Actually, there are two different types of heap: the large object heap (LOH) and the small object heap (SOH). Objects of 85,000 bytes and above are placed in the large object heap. Usually, these are array objects. To keep the discussion simple, I use the word heap without categorizing it. The SOH is used for three different generations, which you'll read about in the following section.
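You can get a rough view of this threshold with GC.GetGeneration(). This is only a sketch: the array sizes are arbitrary choices on either side of 85,000 bytes. A freshly allocated small array is normally reported in generation 0, whereas an object on the LOH is reported as belonging to the highest generation (GC.MaxGeneration), because the LOH is collected together with generation 2.

```csharp
using System;

byte[] small = new byte[1_000];    // well below the 85,000-byte threshold
byte[] large = new byte[100_000];  // above the threshold, so it goes to the LOH

Console.WriteLine($"Small array generation: {GC.GetGeneration(small)}");
Console.WriteLine($"Large array generation: {GC.GetGeneration(large)}");
Console.WriteLine($"Maximum generation: {GC.MaxGeneration}");
```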

To elaborate on these with simple figures, let’s assume we have a heap. After the garbage collector’s cleanup operation, it may look like Figure 12-4 (white blocks are represented as free/available blocks).

Figure 12-4
Scattered allocations in the memory before the compaction

You can see that if you need to allocate five contiguous memory blocks in our heap, you cannot allocate them now, although collectively there are enough spaces. To deal with a similar situation, the garbage collector can apply the compaction technique, which moves all remaining objects (live objects) to one end to form one continuous block of memory. So, after compaction, it may look like Figure 12-5.

Figure 12-5
Revised allocations in the memory after the compaction

Now you can easily allocate five contiguous blocks of memory in the heap. What is the benefit? A new object can be allocated at the end of the contiguous allocation. In programming, you can do this by adding a value to the heap pointer. As a result, you do not need to iterate through a linked list of addresses to find spaces for the new object. In this way, a managed heap is different from an unmanaged heap.
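The idea of allocating by adding a value to the heap pointer can be sketched with a toy model. The ToyHeap class below is hypothetical and works with plain integer offsets; the real CLR allocator is far more involved. It only shows that, on a compacted heap, finding space for a new object is a single addition and no search is needed.

```csharp
using System;

ToyHeap heap = new ToyHeap(capacity: 100);
int a = heap.Allocate(16);
int b = heap.Allocate(24);
int c = heap.Allocate(8);
Console.WriteLine($"a={a}, b={b}, c={c}, next={heap.NextFree}");
// a=0, b=16, c=40, next=48

// Hypothetical toy model; not how the CLR is actually implemented.
class ToyHeap
{
    private readonly int _capacity;

    // Offset at which the next object will be placed.
    public int NextFree { get; private set; }

    public ToyHeap(int capacity) => _capacity = capacity;

    // Returns the start offset of the new block, or -1 if it does not fit.
    public int Allocate(int size)
    {
        if (size <= 0 || NextFree + size > _capacity) return -1;
        int start = NextFree;
        NextFree += size;   // "bump" the pointer past the new object
        return start;
    }
}
```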

What do I mean by an unmanaged heap? Consider a case in which you manage the heap yourself and are responsible for allocating and deallocating space. For example, when native code calls the malloc() function in C/C++ to allocate space, you get a raw pointer. By contrast, when an object is allocated in a managed heap, you do not get the actual pointer; you get a reference that acts like a “handle,” an indirection to a memory address. This is helpful because the actual memory location can change after the GC’s compaction.

After the compaction, objects generally stay in the same area, so accessing them also becomes easier and faster (because page swapping happens less). The compaction technique is costly, but the overall gain can be greater. The Microsoft documentation says the following:

Memory is compacted only if a collection discovers a significant number of unreachable objects. If all the objects in the managed heap survive a collection, then there is no need for memory compaction.
To improve performance, the runtime allocates memory for large objects in a separate heap. The garbage collector automatically releases the memory for large objects. However, to avoid moving large objects in memory, this memory is usually not compacted.

Note

If you are interested in more details, I encourage you to read the following .NET blog article: https://devblogs.microsoft.com/dotnet/large-object-heap-uncovered-from-an-old-msdn-article/

Now I return to the original question. It is important how you interpret the word pointer. In C/C++, using a pointer, you point to an address that is nothing but a number slot in the memory. But the problem is, if you point to an invalid address, you encounter surprises! So, a pointer in an “unsafe” context is tricky.

On the other hand, a reference in C# points to a valid address in the managed heap, or it is null. This is the kind of assurance you receive from C#. In addition, references are useful because when the data moves around the memory, you can access that data using these references.

The Garbage Collector in Action

A generational garbage collector (GC) is used to collect short-lived objects more frequently than longer-lived objects. We have three generations here: 0, 1, and 2. Short-lived objects (for example, temporary variables) are stored in generation 0. The longer-lived objects are pushed into the higher generations—either 1 or 2. The garbage collector works more frequently in the lower generations than in the higher ones.

Once you create an object, it resides in generation 0. When generation 0 is filled up, the garbage collector is invoked. The objects that survive generation 0 garbage collection are transferred to the next higher generation—generation 1. The objects that survive garbage collection in generation 1 enter the highest generation—generation 2. The objects that survive the generation 2 garbage collection stay in the same generation.

POINTS TO NOTE

Sometimes you create a very large object. This kind of object directly goes to the large object heap (LOH). It is often referred to as generation 3. Generation 3 is a physical generation that’s logically collected as part of generation 2. In this context, I encourage you to read the online Microsoft documentation at https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/large-object-heap that says the following:

In the future, .NET may decide to compact the LOH automatically. This means that, if you allocate large objects and want to make sure that they don't move, you should still pin them.
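One common way to pin an object is the GCHandle type in System.Runtime.InteropServices. The following is a minimal sketch, assuming you want a stable address for a large byte array (for example, to pass it to native code). The array size is an arbitrary choice.

```csharp
using System;
using System.Runtime.InteropServices;

byte[] buffer = new byte[100_000]; // large enough to land on the LOH

// Pin the array so the GC will not move it while the handle is allocated.
GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
try
{
    IntPtr address = handle.AddrOfPinnedObject();
    Console.WriteLine($"Pinned: {handle.IsAllocated}, address is nonzero: {address != IntPtr.Zero}");
}
finally
{
    // Always free the handle; a forgotten pinned handle keeps the object alive.
    handle.Free();
}
Console.WriteLine($"Still allocated after Free(): {handle.IsAllocated}");
```

Pinning has a cost: pinned objects get in the way of compaction, so keep the pinned period as short as you can.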

I suggest you use the 3-3 rule to remember the different phases of a garbage collection and the different ways to invoke the GC.

Different Phases of Garbage Collection

The following are the three phases of garbage collection:

Phase 1: This is the marking phase, in which the live objects are marked or identified.
Phase 2: This is the relocating phase, in which it updates the references of the objects that will be compacted in phase 3.
Phase 3: This is the compacting phase, which reclaims memory from dead (or unreferenced) objects, and the compaction operation is performed on the live objects. It moves the live objects (that survived until this point) to the older end of the segment.
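The three phases can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: ToyObject and ToyGc are hypothetical, "addresses" are plain integers, and because the toy objects link to each other through ordinary .NET references, the relocating phase only computes new addresses and has no raw pointers to patch. The real garbage collector works on raw memory and is much more sophisticated.

```csharp
using System;
using System.Collections.Generic;

ToyObject a = new ToyObject("A", 2);
ToyObject b = new ToyObject("B", 3);
ToyObject c = new ToyObject("C", 1);
ToyObject d = new ToyObject("D", 2);
a.References.Add(c);                                // A -> C

List<ToyObject> heap = ToyGc.Layout(a, b, c, d);    // addresses: A=0, B=2, C=5, D=6
List<ToyObject> roots = new List<ToyObject> { a };  // B and D are unreachable

foreach (ToyObject o in ToyGc.Collect(heap, roots))
    Console.WriteLine($"{o.Name} now at address {o.Address}");
// A now at address 0
// C now at address 2

class ToyObject
{
    public string Name { get; }
    public int Size { get; }
    public int Address { get; set; }
    public bool Marked { get; set; }
    public List<ToyObject> References { get; } = new List<ToyObject>();
    public ToyObject(string name, int size) { Name = name; Size = size; }
}

static class ToyGc
{
    // Places objects one after another, as a fresh heap would.
    public static List<ToyObject> Layout(params ToyObject[] objects)
    {
        int next = 0;
        foreach (ToyObject o in objects) { o.Address = next; next += o.Size; }
        return new List<ToyObject>(objects);
    }

    // Expects 'heap' to be in address order; returns the surviving objects.
    public static List<ToyObject> Collect(List<ToyObject> heap, List<ToyObject> roots)
    {
        // Phase 1 (mark): flag everything reachable from the roots.
        foreach (ToyObject o in heap) o.Marked = false;
        Stack<ToyObject> pending = new Stack<ToyObject>(roots);
        while (pending.Count > 0)
        {
            ToyObject current = pending.Pop();
            if (current.Marked) continue;
            current.Marked = true;
            foreach (ToyObject child in current.References) pending.Push(child);
        }

        // Phase 2 (relocate): decide where each live object will go.
        Dictionary<ToyObject, int> newAddress = new Dictionary<ToyObject, int>();
        int next = 0;
        foreach (ToyObject o in heap)
        {
            if (!o.Marked) continue;
            newAddress[o] = next;
            next += o.Size;
        }

        // Phase 3 (compact): reclaim dead objects and move the live ones.
        List<ToyObject> survivors = new List<ToyObject>();
        foreach (ToyObject o in heap)
        {
            if (!o.Marked) continue;    // dead: its space is reclaimed
            o.Address = newAddress[o];  // live: slides toward the start
            survivors.Add(o);
        }
        return survivors;
    }
}
```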

Different Cases of Invoking the Garbage Collector

The following are three different cases of invoking the garbage collector:

Case 1: You have low memory.
Case 2: The allocated objects (in a managed heap) surpass a defined threshold limit.
Case 3: You call the GC.Collect() method. There are many overloaded versions of GC.Collect(). GC is a static class defined in the System namespace.

The following program demonstrates a simple case study. I have used the GetTotalMemory() method in this example. I am using the summary from Visual Studio for your immediate reference. The explanation is clear.

// Summary:
//     Retrieves the number of bytes currently thought to be allocated. A
//     parameter indicates whether this method can wait for a short interval
//     before returning, to allow the system to collect garbage and finalize
//     objects.
// Parameters:
//   forceFullCollection:
//     true to indicate that this method can wait for garbage collection to
//     occur before returning; otherwise, false.
// Returns:
//     A number that is the best available approximation of the number of
//     bytes currently allocated in managed memory.

Similarly, you can see the descriptions of any method from Visual Studio. Here are some brief descriptions of additional methods. I use them in the upcoming example:

GC.Collect(Int32) forces an immediate garbage collection from generation 0 through a specified generation. This means that when you call GC.Collect(0), the garbage collection happens at generation 0. If you call GC.Collect(1), the garbage collection happens at both generation 0 and generation 1, and so forth.
The CollectionCount method returns the number of times garbage collection has occurred for the specified generation of objects.
After I invoke the GC, I invoke the WaitForPendingFinalizers() method. This method definition says that this method suspends the current thread until the thread that is processing the queue of finalizers has emptied that queue.
Starting from C# 9.0, you can use a new syntax for a null check. This is shown here. So, the following block of code does not create any compile-time error:

if (sample is not null) { /* some code */ }
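To see what WaitForPendingFinalizers() waits for, consider the following sketch. The Tracked class has a finalizer that increments a counter, and the hypothetical CreateGarbage method creates an instance that becomes unreachable as soon as the method returns. After the forced collection and the wait, the counter typically shows 1, although the runtime does not strictly guarantee when a finalizer runs.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

FinalizerDemo.CreateGarbage();
GC.Collect();                   // the unreachable object is queued for finalization
GC.WaitForPendingFinalizers();  // block until the finalizer thread empties its queue
Console.WriteLine($"Finalizers run so far: {Tracked.FinalizedCount}");

// Hypothetical type used only for this illustration.
class Tracked
{
    public static int FinalizedCount;

    // The finalizer runs on the finalizer thread, so update the counter atomically.
    ~Tracked() { Interlocked.Increment(ref FinalizedCount); }
}

static class FinalizerDemo
{
    // NoInlining keeps the local variable inside this method's own frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void CreateGarbage()
    {
        Tracked t = new Tracked();
        if (t is not null) // C# 9.0 null-check syntax
        {
            Console.WriteLine("Created a Tracked instance.");
        }
        // 't' goes out of scope on return, so the instance becomes garbage.
    }
}
```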

In this program, you’ll see the following line:

GC.Collect(i, GCCollectionMode.Forced, false, true);

At the time of this writing, there are five overloaded methods for Collect():

public static void Collect();
public static void Collect(int generation);
public static void Collect(int generation, GCCollectionMode mode);
public static void Collect(int generation, GCCollectionMode mode, bool blocking);
public static void Collect(int generation, GCCollectionMode mode, bool blocking, bool compacting);

You can see their definitions easily in Visual Studio. For your immediate reference, I present the descriptions here:

generation: This is the number of the oldest generation to be garbage collected.
mode: This is an enumeration value that specifies whether the garbage collection is forced (System.GCCollectionMode.Default or System.GCCollectionMode.Forced) or optimized (System.GCCollectionMode.Optimized).
blocking: You set this to true to perform a blocking garbage collection; set it to false to perform a background garbage collection where possible.
compacting: You set it to true to compact the small object heap; set it to false to sweep only.

The purpose of this example is as follows:

To show you different generations of garbage collection
To demonstrate that an object that survives a garbage collection can be promoted from one generation to the next

Demonstration 1

Here is the complete demonstration:

Console.WriteLine("***Exploring Garbage Collections.***");
try
{
    Console.WriteLine($"Maximum GC Generation is {GC.MaxGeneration}");
    Sample sample = new();
    GCHelper.CheckObjectStatus(sample);
    for (int i = 0; i < 3; i++)
    {
        Console.WriteLine($" After GC.Collect({i})");
        // Collect generations 0 through i.
        GC.Collect(i, GCCollectionMode.Forced, false, true);
        // Give the GC extra time before reading the counters.
        System.Threading.Thread.Sleep(10000);
        GC.WaitForPendingFinalizers();
        GCHelper.ShowAllocationStatus();
        GCHelper.CheckObjectStatus(sample);
    }
}
catch (Exception ex)
{
    Console.WriteLine("Error:" + ex.Message);
}

class Sample
{
    public Sample()
    {
        // Some code
    }
}

class GCHelper
{
    public static void CheckObjectStatus(Sample sample)
    {
        if (sample is not null) // C# 9.0 onwards OK
        {
            Console.WriteLine($" The {sample} object is in Generation:{GC.GetGeneration(sample)}");
        }
    }

    public static void ShowAllocationStatus()
    {
        Console.WriteLine("---------");
        Console.WriteLine($"Gen-0 collection count:{GC.CollectionCount(0)}");
        Console.WriteLine($"Gen-1 collection count:{GC.CollectionCount(1)}");
        Console.WriteLine($"Gen-2 collection count:{GC.CollectionCount(2)}");
        Console.WriteLine($"Total Memory allocation:{GC.GetTotalMemory(false)}");
        Console.WriteLine("---------");
    }
}

Output

Here is one possible output. I have highlighted some important lines in bold. On your computer, you may see different outputs. Read the “Analysis” section to learn more about this difference.

***Exploring Garbage Collections.***
Maximum GC Generation is 2
 The Sample object is in Generation:0
 After GC.Collect(0)
---------
Gen-0 collection count:1
Gen-1 collection count:0
Gen-2 collection count:0
Total Memory allocation:154960
---------
 The Sample object is in Generation:1
 After GC.Collect(1)
---------
Gen-0 collection count:2
Gen-1 collection count:1
Gen-2 collection count:0
Total Memory allocation:147624
---------
 The Sample object is in Generation:2
 After GC.Collect(2)
---------
Gen-0 collection count:3
Gen-1 collection count:2
Gen-2 collection count:1
Total Memory allocation:146848
---------
 The Sample object is in Generation:2

POINTS TO NOTE

It is possible to see the different counters if additional garbage collection happens in between these calls. In this possible output, you can see that the sample instance was not collected in any of the GC invocation calls. So, it survived and gradually moved to generation 2.

The total memory allocations in this output seem to be logical because, after each GC invocation, you see that the total allocations are reducing. This may not happen in every possible output because you may not allow the GC to complete its job before you show the memory status. So, to get a more consistent result, I also introduced a sleep time, after I invoke the GC, and I also invoke WaitForPendingFinalizers(). This allows the GC to have more time to complete its job. Yes, it causes some performance penalties, but in my system, it produces a more consistent result. Based on your system configuration, you may need to vary the sleep time accordingly.

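Notice that I have used the following overloaded version: GC.Collect(i, GCCollectionMode.Forced, false, true). I set the third parameter (blocking) to false to request a background garbage collection where possible. Keep in mind that, according to the Microsoft documentation for this overload, the runtime performs a blocking collection when the fourth parameter (compacting) is true, so the request for a background collection may not be honored here.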

Another important point to note: before a garbage collection starts, all the managed threads are suspended, except the thread that invokes the GC. So, once the GC finishes its task, other threads can start allocating spaces again. If you know the concept of multithreading, understanding the previous line is easy for you.

One last point: these generations are a logical view of the GC heap. Physically these objects reside on the managed heap, which is a chunk of memory. The GC reserves this from the OS via calling VirtualAlloc. We are not going to discuss it in that detail.

Analysis

This is only sample output that can vary on every run. If needed, you can go through the theory in the previous sections again and then try to understand how the garbage collection happened. Here are some important observations:

There are different generations of the GC.
You can see that once you called GC.Collect(2), the lower generations were collected too. Notice that all three counters have increased. Similarly, when you called GC.Collect(1), both generation 1 and generation 0 were collected.
You can also see the object that I created was originally placed in generation 0.
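The second observation can be checked in isolation with a short sketch. It takes a snapshot of the collection counters, forces a blocking collection of generation 2, and takes another snapshot. The GcCounts helper is hypothetical. Every counter should increase by at least one; the increase can be larger if other collections happen in between.

```csharp
using System;

int[] before = GcCounts.Snapshot();
GC.Collect(2, GCCollectionMode.Forced, blocking: true);
int[] after = GcCounts.Snapshot();

for (int gen = 0; gen <= GC.MaxGeneration; gen++)
{
    Console.WriteLine($"Gen-{gen} collection count: {before[gen]} -> {after[gen]}");
}

// Hypothetical helper used only for this illustration.
static class GcCounts
{
    // Returns the collection count of every generation, indexed by generation.
    public static int[] Snapshot()
    {
        int[] counts = new int[GC.MaxGeneration + 1];
        for (int gen = 0; gen < counts.Length; gen++)
        {
            counts[gen] = GC.CollectionCount(gen);
        }
        return counts;
    }
}
```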