Garbage collection in the .NET Framework: how to optimize memory in your applications
This article provides an overview of garbage collection in the .NET Framework. It describes how the process works, clarifies what application roots are, and explains the difference between the finalization queue and F-reachable queue.
The author of this article is EPAM Lead Software Engineer Prodipto Banerji.
What is garbage collection in the .NET Framework?
Garbage collection is the automatic memory management process that allocates and deallocates memory for the objects an application creates. Whenever an application needs memory for a new object, the garbage collector allocates it from the managed heap.
The .NET Framework includes a runtime environment called the common language runtime (CLR), which executes code and provides services that facilitate development. Within the CLR, the garbage collector (GC) is responsible for automatic memory management.
The above picture shows a managed heap that currently holds two objects (Object A and Object B). The pointer NextObjPtr indicates the location in the heap where the next object will be stored.
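The role of NextObjPtr can be illustrated with a small simulation. The following is a minimal sketch in Python, not the CLR's actual allocator: the class name ManagedHeap, the byte sizes, and the 64-byte capacity are assumptions made up for illustration.

```python
class ManagedHeap:
    """Toy model of a contiguous heap with a bump pointer (not the CLR's real allocator)."""

    def __init__(self, capacity):
        self.capacity = capacity   # total bytes available in this toy heap (assumed value)
        self.next_obj_ptr = 0      # plays the role of NextObjPtr
        self.objects = {}          # name -> (address, size), kept only for inspection

    def allocate(self, name, size):
        """Reserve `size` bytes at the current pointer, then advance the pointer."""
        if self.next_obj_ptr + size > self.capacity:
            # In a real runtime, this is the point where a collection would be attempted.
            raise MemoryError("heap exhausted; a collection would run here")
        address = self.next_obj_ptr
        self.objects[name] = (address, size)
        self.next_obj_ptr += size
        return address


heap = ManagedHeap(capacity=64)
addr_a = heap.allocate("Object A", 16)
addr_b = heap.allocate("Object B", 24)
print(addr_a, addr_b, heap.next_obj_ptr)  # 0 16 40
```

In this model, allocation is only a bounds check and a pointer increment, which is why allocating from a compacted heap is cheap.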
Application roots
Every .NET application has a set of roots. Each root is a storage location that either refers to an object in the managed heap or is set to null. In effect, the roots are the application's entry points into the managed heap.
The application's roots consist of all global and static object pointers within the application. Any local variable or parameter object pointer residing on a thread's stack is also considered a part of the application's roots. Even CPU registers that hold pointers to objects in the heap are included in the application roots. The CLR maintains and manages the list of active roots, which is accessible to the garbage collector's algorithm.
The image above illustrates objects stored in the managed heap, with the application roots referring to the storage locations of the objects the application still uses.
When the managed heap fills up and a new allocation cannot be satisfied, the garbage collector (GC) is triggered to release memory. The GC starts by assuming that every object in the heap is garbage. It then walks the roots and builds a graph of all objects reachable from them: for each root that refers to an object, it adds that object and, recursively, every object that it references. This is repeated for each remaining root until the graph is complete.
Notably, the GC never adds an object to the graph a second time. If it encounters an object that is already in the graph, it stops following that path. This improves performance and prevents infinite loops in cyclic structures such as circular linked lists.
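The graph-building step described above can be sketched as a simple traversal. This is a toy model in Python under stated assumptions, not the CLR's algorithm: the function name mark_reachable, the dictionary that stands in for object references, and the object names are all hypothetical.

```python
def mark_reachable(roots, references):
    """Return the set of object names reachable from the roots.

    roots:      iterable of object names (None models a null root)
    references: dict mapping an object name to the names it points to
    """
    reachable = set()
    pending = [root for root in roots if root is not None]
    while pending:
        obj = pending.pop()
        if obj in reachable:
            continue  # already in the graph: skip it, which also breaks cycles
        reachable.add(obj)
        pending.extend(references.get(obj, []))
    return reachable


# Hypothetical heap: A -> C -> D -> C (a cycle); nothing reachable points to B or E.
references = {"A": ["C"], "C": ["D"], "D": ["C"], "B": [], "E": ["B"]}
roots = ["A", None]

live = mark_reachable(roots, references)
garbage = set(references) - live
print(sorted(live), sorted(garbage))  # ['A', 'C', 'D'] ['B', 'E']
```

Object B is referenced by E, but since E itself is unreachable from any root, both end up as garbage. The cycle between C and D terminates because each object is added to the graph only once.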
Once the graph is complete, every object that is not part of it is garbage. The GC then compacts the heap: it relocates the non-garbage objects so that they sit next to each other, eliminating the gaps left by the garbage. Because the objects have moved, the GC also updates the application roots (and any references held inside other objects) so that the pointers refer to the objects' new locations.
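Compaction and the accompanying root update can be sketched in a few lines. The following Python model is an illustration under assumptions, not the CLR's implementation: the compact function, the tuple layout, the root names, and the sizes are invented for the example, and references stored inside objects are left out for brevity.

```python
def compact(heap, live, roots):
    """Slide live objects toward address 0 and patch the roots.

    heap:  list of (name, address, size) tuples in address order
    live:  set of names that are reachable from the roots
    roots: dict mapping a root name to the address it points at (or None)
    """
    new_heap = []
    forwarding = {}      # old address -> new address
    next_obj_ptr = 0
    for name, old_address, size in heap:
        if name not in live:
            continue     # garbage: its space is simply reused
        forwarding[old_address] = next_obj_ptr
        new_heap.append((name, next_obj_ptr, size))
        next_obj_ptr += size
    # A root only ever points at a live object, so the lookup cannot fail here.
    new_roots = {
        root: (forwarding[address] if address is not None else None)
        for root, address in roots.items()
    }
    return new_heap, new_roots, next_obj_ptr


heap = [("A", 0, 16), ("B", 16, 24), ("C", 40, 8), ("D", 48, 16)]
roots = {"local_a": 0, "static_d": 48, "unused": None}

new_heap, new_roots, next_obj_ptr = compact(heap, {"A", "C", "D"}, roots)
print(new_heap)      # [('A', 0, 16), ('C', 16, 8), ('D', 24, 16)]
print(new_roots)     # {'local_a': 0, 'static_d': 24, 'unused': None}
print(next_obj_ptr)  # 40
```

After compaction, the free space is one contiguous block starting at the new next_obj_ptr, so the next allocation can again be served by advancing a single pointer.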
Finalization queue and F-reachable queue
When an application creates a new object, the garbage collector (GC) allocates memory for it from the heap. If the object's type defines a finalize method, a reference to the object is also placed in the finalization queue. This queue is an internal data structure managed by the GC. Each entry in the queue identifies an object whose finalize method must be invoked before the memory allocated to that object can be reclaimed.
Suppose the garbage collector (GC) identifies Objects G, D and C as garbage. During the collection, it scans the finalization queue for pointers to these objects. When it finds one, it removes the pointer from the finalization queue and moves it to the F-reachable queue.
The F-reachable queue is another internal data structure under the control of the garbage collector.
Each entry in the F-reachable queue identifies an object whose finalize method is ready to be called. Objects B and H have no finalize method, so their memory is reclaimed right away. The memory of Object G, however, is not reclaimed yet, because its finalize method has not been called.
A dedicated runtime thread is assigned to execute the finalize methods. While the F-reachable queue is empty, this thread sleeps. When an entry appears in the F-reachable queue, the thread wakes up, removes the entry from the queue, and executes the object's finalize method. Only then is the object truly garbage, and its memory is reclaimed by a later collection. Because finalize methods run on this separate thread, at a time the application does not control, they should be kept minimal and should not make any assumptions about the thread that executes them.
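The interplay of the two queues can be simulated to show why a finalizable object survives one extra collection. This is a minimal sketch in Python, not the CLR's internals: the ToyCollector class, its method names, and the use of a plain function call in place of a real background thread are assumptions made for illustration.

```python
from collections import deque


class ToyCollector:
    """Toy model of the finalization and F-reachable queues (illustrative only)."""

    def __init__(self):
        self.heap = {}                    # name -> finalizer callable, or None
        self.finalization_queue = []      # objects whose type has a finalizer
        self.f_reachable_queue = deque()  # objects waiting for their finalizer to run

    def allocate(self, name, finalizer=None):
        self.heap[name] = finalizer
        if finalizer is not None:
            self.finalization_queue.append(name)

    def collect(self, live):
        """Reclaim garbage, except objects that still need finalization."""
        # Entries in the F-reachable queue keep their objects alive until finalized.
        keep = set(live) | set(self.f_reachable_queue)
        for name in list(self.heap):
            if name in keep:
                continue
            if name in self.finalization_queue:
                self.finalization_queue.remove(name)
                self.f_reachable_queue.append(name)  # survives this collection
            else:
                del self.heap[name]                  # reclaimed immediately

    def run_finalizer_thread(self):
        """Stand-in for the dedicated thread: drain the F-reachable queue."""
        while self.f_reachable_queue:
            name = self.f_reachable_queue.popleft()
            self.heap[name]()  # invoke the object's finalizer


log = []
collector = ToyCollector()
collector.allocate("B")                                            # no finalizer
collector.allocate("G", finalizer=lambda: log.append("G finalized"))

collector.collect(live=set())        # first collection: B is reclaimed, G is queued
after_first = sorted(collector.heap)
collector.run_finalizer_thread()     # G's finalizer runs
collector.collect(live=set())        # second collection: G is reclaimed
after_second = sorted(collector.heap)

print(after_first, log, after_second)  # ['G'] ['G finalized'] []
```

In this model, Object B disappears in the first collection, while Object G needs two collections and a finalizer run in between. That extra lifetime is one reason finalizers are usually kept short and reserved for objects that really need them.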
Conclusion
The common language runtime initiates garbage collection automatically and uses two built-in queues, the finalization queue and the F-reachable queue, to run the finalize methods of objects before their memory is reclaimed. For developers, it is essential to keep finalize methods minimal and free of assumptions about the thread that executes them.
Additionally, it is recommended that the application release its references to an object promptly after the object has served its intended purpose or completed its task, so that the GC can reclaim it. Following this approach ensures that the managed heap will not become cluttered with abandoned objects.