.Net GC – Finalization Mechanism & the Dispose Pattern




Abstract

This article is a direct extension of the .Net Garbage Collection (GC) article I posted.

Please review it prior to reading this article.


In the .Net Garbage Collection (GC) article I wrote about the .Net Garbage Collector (GC), its characteristics, algorithm, different mechanisms and its significant contribution to .Net applications.

Most of this information comes from a presentation I prepared and presented at my previous workplaces, as part of a learning series I created about .Net CLR Internals.

The aim of this series was to enrich the teams with advanced topics while creating practical and concise presentations.

I hope you'll benefit from this article and, most importantly, enjoy the ride… :)


Remark: Framework Version
In this article I'll describe the general behavior of the .Net Garbage Collector; some information could be modified from one CLR version to another.



Related articles:

          1.      .Net Garbage Collection (GC)
          2.      .Net Type Internals
          3.      SyncBlock Index (SBI) \ Object Header Word
          4.      .Net debugging – Get started




Contents

1.     Finalization
-        Finalization list
-        F-reachable queue
-        Finalizer thread
-        The Finalization mechanism drawback
2.     Dispose pattern
-        Dispose Pattern Implementation
-        The Using Statement
3.     Summary





Finalization

As we know .Net Reference Types consume memory and are stored in the Managed (GC) Heap.
Some of these types need more than just memory; they need to use native (unmanaged) resources.   

Native resources (e.g. a file, network connection) do NOT reside on the managed heap and are handled by the operating system.

The GC doesn't manage the native resources.

Since the GC does not manage native resources, we must write custom code to release them manually; otherwise the resources (e.g. an open file handle) would be leaked.


The finalization mechanism allows developers to 'free' the native resource before the garbage collector frees (reclaims) the object's memory.

The cleanup code is written in the 'Finalize' method.

When an object is no longer needed by the application and the garbage collector decides to collect it, its 'Finalize' method is executed before the object's memory is reclaimed.




Finalize method implementation

In C# we create a Finalize method by adding a tilde (~) before the class name. It looks like a C++ destructor; however, its purpose is different and it doesn't behave like a classic C++ destructor: it doesn't run deterministically when the object goes out of scope, but at some later time chosen by the runtime.


Example:


namespace TestPerfExample
{
    // The 'NativeResourceWrapper' class wraps a native
    // resource, such as: a File, Network Connection, and
    // provides a custom cleanup code in the Finalize method.
    public class NativeResourceWrapper
    {
        // Implementation...


        // Finalize method

        ~NativeResourceWrapper()
        {
            // Implement custom cleanup code to
            // release native (unmanaged) resources.   
        }
    }
}


If we examine the assembly's MSIL code, we'll notice that the compiler generated the following:
-        A method named 'Finalize()',
-        A try block that contains our cleanup code,
-        A finally block that calls the base Finalize method.
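
The generated method can also be observed without a disassembler. The following is a minimal sketch, assuming a hypothetical 'FinalizerInspector' helper and a 'PlainClass' type (neither appears in the original example), that uses reflection to check whether a type declares a 'Finalize' method. The comment inside the class shows roughly what the compiler emits; that shape can't be written directly in C#, because the compiler rejects an explicit override of Object.Finalize.

```csharp
using System;
using System.Reflection;

public class NativeResourceWrapper
{
    // The compiler turns the method below into, roughly:
    //
    //   protected override void Finalize()
    //   {
    //       try     { /* our cleanup code */ }
    //       finally { base.Finalize(); }
    //   }
    ~NativeResourceWrapper()
    {
        // Cleanup code would go here.
    }
}

// Hypothetical type without a finalize method, used for comparison.
public class PlainClass { }

// Hypothetical helper, for illustration only.
public static class FinalizerInspector
{
    // Returns true if the type itself declares a 'Finalize' method
    // (DeclaredOnly excludes the one inherited from System.Object).
    public static bool DeclaresFinalizer(Type type)
    {
        MethodInfo method = type.GetMethod(
            "Finalize",
            BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.DeclaredOnly);

        return method != null;
    }
}

public static class Program
{
    public static void Main()
    {
        // Prints "True": the compiler generated a Finalize method.
        Console.WriteLine(FinalizerInspector.DeclaresFinalizer(typeof(NativeResourceWrapper)));

        // Prints "False": no finalize method was written for this type.
        Console.WriteLine(FinalizerInspector.DeclaresFinalizer(typeof(PlainClass)));
    }
}
```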





To conclude up to now…

-        Sometimes we need to use native resources (such as: DB Connections) in our .Net Reference Types.
-        The native resources do NOT reside on the managed heap and are not handled by the GC.
-        The finalization mechanism provides a proper solution for releasing the native resources.
-        The developers need to write custom cleanup code (finalize method) to manually release the native resources, before the GC could reclaim the objects that used them.
-        We wrote a simple finalize method implementation and reviewed its corresponding MSIL code.

Next, we'll focus on the finalization mechanism and learn how the GC handles objects that use native resources and implement a finalize method.




How does the finalization mechanism work?


1.      The GC uses 2 internal data structures to manage all the objects that implemented a finalization method:
            -        Finalization list – Contains all the objects that implemented a finalize method.
            -        F-reachable queue – Contains all the objects that are ready for finalization.

2.      The CLR uses the finalizer thread to execute the finalize methods after a garbage collection completes.


New 'finalizable' object

When an object that implements a finalize method is created, the GC stores a reference to that object in the finalization-list, before the object's instance constructor is executed.

This means that an object referenced from the finalization-list can't be released by the GC until its finalize method has been executed.
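
One observable consequence of registering the object before its constructor runs is that the finalize method is executed even when the constructor throws. The following is a minimal sketch of that behavior; the 'FailingResource' and 'ConstructorFailureDemo' names are hypothetical, and the forced GC.Collect() call is used for demonstration only. Exactly when finalizers run is up to the runtime, so the printed count is what one would typically expect rather than a guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical type whose constructor always fails.
public class FailingResource
{
    public static int FinalizerRuns;

    public FailingResource()
    {
        // The object is already registered for finalization at this point.
        throw new InvalidOperationException("Constructor failed.");
    }

    ~FailingResource()
    {
        // Runs on the finalizer thread, even though construction failed.
        Interlocked.Increment(ref FinalizerRuns);
    }
}

public static class ConstructorFailureDemo
{
    // Kept in a separate, non-inlined method so that no local
    // variable keeps the object alive when the GC runs.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void TryCreate()
    {
        try
        {
            new FailingResource();
        }
        catch (InvalidOperationException)
        {
            // Construction failed; the partially built object is now garbage.
        }
    }

    public static int Run()
    {
        TryCreate();

        // Demonstration only: force a collection and wait for the finalizer thread.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        return Volatile.Read(ref FailingResource.FinalizerRuns);
    }

    public static void Main()
    {
        Console.WriteLine("Finalizer runs: " + Run());
    }
}
```

For this reason a finalize method should tolerate a partially constructed object, for example fields that were never assigned.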


'Finalizable' object collection

When it's time for garbage collection, the GC examines all the 'garbage' objects (unreachable from the application) and searches the finalization-list for references to those objects.

When the finalization-list holds a reference to an object that is considered garbage, the reference is removed from the finalization-list and stored in the f-reachable queue.

This means that all objects in the f-reachable queue are no longer needed by the application, and their finalize methods can be executed.

However, the GC can't reclaim these objects' memory yet, and they are NOT considered garbage at this point.


The F-reachable queue serves as a GC Root

This means that, even though those objects are neither needed by nor reachable from the application, they are still referenced from the f-reachable queue and can't be released. In other words, the f-reachable queue serves as a GC Root for these objects and for the objects they reference.
(F-reachable => finalization reachable)


Executing the finalize methods

The CLR uses a special, dedicated high-priority thread, the finalizer thread, to execute the finalize methods after the GC has completed its work.

The reason the CLR uses a dedicated high-priority thread is to avoid thread synchronization issues that could affect the reliability of the application.

When the f-reachable queue contains references to 'finalizable' objects, the finalizer thread wakes up, removes these references and executes each object's finalize method.
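
The fact that finalize methods run on a separate thread can be observed by recording the managed thread id inside a finalize method. The following is a minimal sketch; 'ThreadProbe' and 'FinalizerThreadDemo' are hypothetical names, and the forced collection is for demonstration only.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical type that records which thread finalized it.
public class ThreadProbe
{
    // Stays 0 until the finalize method has run.
    public static int FinalizerThreadId;

    ~ThreadProbe()
    {
        Volatile.Write(ref FinalizerThreadId, Thread.CurrentThread.ManagedThreadId);
    }
}

public static class FinalizerThreadDemo
{
    // Separate, non-inlined method so the object is unreachable once it returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateAndForget()
    {
        new ThreadProbe();
    }

    // Returns true when the finalize method ran on a thread other than the caller's.
    public static bool RanOnAnotherThread()
    {
        int callerThreadId = Thread.CurrentThread.ManagedThreadId;

        CreateAndForget();

        // Demonstration only: force a collection and wait for the finalizer thread.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        int finalizerThreadId = Volatile.Read(ref ThreadProbe.FinalizerThreadId);
        return finalizerThreadId != 0 && finalizerThreadId != callerThreadId;
    }

    public static void Main()
    {
        Console.WriteLine("Finalized on another thread: " + RanOnAnotherThread());
    }
}
```

Because the finalize method runs on a different thread from the code that created the object, it shouldn't rely on thread-affine state such as thread-static fields.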


'Finalizable' objects are promoted to an older generation

As stated above, when it's time for garbage collection, the GC examines all the 'garbage' objects and checks whether some of them implement a finalize method, meaning they have a reference in the finalization-list.

Next, the GC transfers these references from the finalization-list to the f-reachable queue and completes its work.

As a result, these objects and the objects they reference survive this collection and are promoted to an older generation as part of the GC Sweep & Compact phase.

For example, if those objects resided in Generation 0 and survived this collection because they are still referenced from the f-reachable queue, they are promoted to Generation 1.

Then, after the GC has completed its work, the finalizer thread wakes up, removes those objects from the f-reachable queue and executes their finalize methods.

The next time a garbage collection of their (now older) generation occurs, these objects are referenced neither by the application nor by the f-reachable queue and their finalize methods have already been executed; thus they are considered garbage and their memory is reclaimed.
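
The promotion described above can be observed by asking the GC for an object's generation right after it is created and again from inside its finalize method. The following is a minimal sketch, with hypothetical 'GenerationProbe' and 'PromotionDemo' names; the exact numbers depend on the runtime and its GC settings, so no specific output is claimed, although one would typically expect the second value to be higher than the first.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical type that records its own generation while being finalized.
public class GenerationProbe
{
    // Stays -1 until the finalize method has run.
    public static int GenerationInFinalizer = -1;

    ~GenerationProbe()
    {
        // By now the object has survived the collection that found it unreachable.
        Volatile.Write(ref GenerationInFinalizer, GC.GetGeneration(this));
    }
}

public static class PromotionDemo
{
    // Separate, non-inlined method so the object is unreachable once it returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static int CreateAndForget()
    {
        var probe = new GenerationProbe();
        return GC.GetGeneration(probe);
    }

    // Returns { generation at creation, generation seen by the finalize method }.
    public static int[] Run()
    {
        int atCreation = CreateAndForget();

        // Demonstration only: force a collection and wait for the finalizer thread.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        int inFinalizer = Volatile.Read(ref GenerationProbe.GenerationInFinalizer);
        return new[] { atCreation, inFinalizer };
    }

    public static void Main()
    {
        int[] generations = Run();
        Console.WriteLine("Generation at creation:  " + generations[0]);
        Console.WriteLine("Generation in finalizer: " + generations[1]);
    }
}
```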



Example:

The following images illustrate the Managed Heap, the Finalization-list and the F-reachable Queue of an application that contains finalizable objects, at the moment a garbage collection starts.







Description:

The managed heap contains the following 6 objects: Object 1 through Object 6.

Object 1, Object 2, Object 4 and Object 6 are not reachable from the application, and are ready for collection.

Object 1, Object 2, Object 5 and Object 6 have implemented a finalize method, thus they have a reference from the Finalization-list.

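Meaning, when it's time for collection, the GC moves the references to the unreachable ones (Object 1, Object 2 and Object 6) from the finalization-list to the f-reachable queue, as illustrated in the following image. Object 5 is still reachable from the application, so its reference stays in the finalization-list.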








Description:

After garbage collection the Managed Heap contains: Object 1, Object 2, Object 3, Object 5 and Object 6.

Since Object 4 wasn't needed by the application anymore, and it didn't implement a finalize method (thus wasn't in the finalization-list), the GC released it, and its memory was reclaimed and compacted.

Object 1, Object 2 and Object 6 were removed from the finalization-list and were stored on the f-reachable queue.

Their memory wasn't reclaimed by the GC since they are still referenced from the f-reachable queue, and the finalizer thread needs to execute their finalize method.


The finalizer thread
When the GC has completed its work, the finalizer thread wakes up, scans the f-reachable queue, executes each object's finalize method and removes its reference from the queue.

Thus, in the next garbage collection, those objects would be considered as garbage, and their memory would be reclaimed by the GC.


Remark: Generations
As indicated above, the objects that were in the finalization-list and were transferred to the f-reachable queue weren't released and are still in the managed heap; moreover, since they have survived a garbage collection, they were promoted to an older generation.

For simplicity, the images above don't illustrate the GC generations.




The Finalization mechanism drawback


Although the finalization mechanism is important when using native resources in our applications, it has one main disadvantage: we can't control, and don't know, when the finalize method will be executed.

This means that when we need to use a native resource for a simple operation, for a short period of time (e.g. writing a few bytes to a file), and we rely only on a finalize method to release it, we can't be sure when the native resource will be released.

This, of course, is not very efficient: as part of the finalization mechanism, the wrapping objects (that use the native resource) create additional overhead in the finalization-list and the f-reachable queue, and must survive at least one extra garbage collection, which naturally affects the application's performance, all just to write a few bytes to a file on the disk.


In order to release native resources deterministically, we can use the Dispose Pattern.
_________________________________________________________________________________________





Dispose Pattern

As we learned, the main disadvantage of the GC Finalization Mechanism is that the developer neither controls nor knows when the finalize method will be executed, and when the native resource will be released.

The dispose pattern comes to our aid…

Using the dispose pattern, developers can deterministically release (dispose/close) native resources when desired, meaning as soon as the native resources are no longer needed.

The dispose pattern is an improved approach to handling native resources, which addresses the main weakness of the finalization mechanism.
However, since users of a type don't always release the native resource properly (or at all), a type that directly holds a native resource should implement both the Finalization Mechanism and the Dispose Pattern.



Dispose Pattern Implementation

In order to be able to control the disposal of native resources, the developer should implement the System.IDisposable interface on the type that is using the native resource.


Remark: The IDisposable interface contains only one method: void Dispose();


using System.Runtime.InteropServices;

namespace System
{
    //
    // Summary:
    //     Provides a mechanism for releasing unmanaged resources.
    //     To browse the .NET Framework source code for this type,
    //     see the Reference Source.
    [ComVisible(true)]
    public interface IDisposable
    {
        //
        // Summary:
        //     Performs application-defined tasks
        //     associated with freeing, releasing,
        //     or resetting unmanaged resources.
        void Dispose();
    }
}


Example:

The following is an example one could use to implement the IDisposable interface.
This NativeResourceWrapper class also implements the finalization mechanism (Finalize method).



   
    // NativeResourceWrapper - A class that wraps a Native
    // Resource and implemented the 'IDisposable' interface.
    public class NativeResourceWrapper : IDisposable
    {
        #region Private variables

        private bool isAlreadyDisposed;

        #endregion Private variables


        #region Public methods

        // The public method of the 'IDisposable' interface.

        // Users of the 'NativeResourceWrapper' class could
        // call this method to dispose the native resource.
        public void Dispose()
        {
            // Invoking the protected 'Dispose(disposing=true)' method and indicating that
            // the user is disposing the native resource, and not the finalizer thread.
            Dispose(true);
        }


        // For some implementations and for convenience, users
        // prefer to call a 'Close()' method, instead of a
        // 'Dispose()' method, thus providing both APIs.
        public void Close()
        {
            Dispose(true);
        }

        #endregion Public methods


        #region Private / Protected methods

        // In this method we are releasing the native resource.
        //      - If the (disposing == true) => the user invoked this method.
        //        (by calling the public 'Dispose()' or 'Close()' methods)
        //
        //      - Otherwise, the finalizer thread executed the finalize method.
        protected virtual void Dispose(Boolean disposing)
        {
            // It's advisable:
            // To verify whether the resource was already released (using a simple
            // flag), since we could get here from the 'Dispose()' and 'Close()' methods.

            if (isAlreadyDisposed) return;


            if (disposing)
            {
                // We could safely access other data members in here,
                // since this code is not executed from the finalize method.
            }

            // Now we could release the native resource.


            // It's important:
            // To Call the GC.SuppressFinalize(this) method,
            // in order to prevent the Finalization when not needed.
            // Meaning when the user manually
            // disposed the native resource.

            // However, we could use this code here even though
            // this code could also be called from the finalize
            // method, since it has no effect in such a case.
            GC.SuppressFinalize(this);

            isAlreadyDisposed = true;
        }

        #endregion Private / Protected methods


        #region The Finalize method

        // After the GC has completed its work, the
        // finalizer thread executes this method.
        ~NativeResourceWrapper()
        {
            // Invoking the protected 'Dispose(disposing=false)' method and indicating
            // that the finalizer thread is disposing the native resource.
            Dispose(false);
        }

        #endregion The Finalize method
    }



Description:

      -        We added the IDisposable Dispose() method.
      -        We added another public method - Close() for convenience.
      -        We added a protected virtual void Dispose(Boolean disposing) method that actually releases the native resource and distinguishes between a user call and a call from the finalization mechanism.
      -        We added a Finalize method.




Suppressing Finalization

We added a call to GC.SuppressFinalize(this) in order to prevent finalization in case we already manually released the native resource.

However, the same call is also harmless when it is reached from the Finalize method, since it has no effect on a finalization that is already in progress.

Calling GC.SuppressFinalize(this) on the current object's instance instructs the CLR NOT to run this object's finalize method: the object no longer has to pass through the f-reachable queue, so it doesn't survive an extra collection, and its memory can be reclaimed by the first garbage collection that finds it unreachable.
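
The effect of suppressing finalization can be observed by counting finalize method executions for a disposed object and for one that was never disposed. The following is a minimal sketch, with a simplified hypothetical 'CountingWrapper' class standing in for the NativeResourceWrapper above; the forced collection is for demonstration only, and the count of 1 is what one would typically expect rather than a guarantee about finalization timing.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Simplified, hypothetical stand-in for a class that wraps a native resource.
public class CountingWrapper : IDisposable
{
    public static int FinalizerRuns;

    public void Dispose()
    {
        // (The native resource would be released here.)

        // The resource is already released, so finalization is not needed.
        GC.SuppressFinalize(this);
    }

    ~CountingWrapper()
    {
        Interlocked.Increment(ref FinalizerRuns);
    }
}

public static class SuppressFinalizeDemo
{
    // Separate, non-inlined method so both objects are unreachable once it returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateTwo()
    {
        // Disposed explicitly: its finalize method is suppressed.
        var disposed = new CountingWrapper();
        disposed.Dispose();

        // Never disposed: its finalize method is left to the finalizer thread.
        new CountingWrapper();
    }

    public static int Run()
    {
        CreateTwo();

        // Demonstration only: force a collection and wait for the finalizer thread.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        return Volatile.Read(ref CountingWrapper.FinalizerRuns);
    }

    public static void Main()
    {
        // Two objects were created, but only the undisposed one is finalized.
        Console.WriteLine("Finalizer runs: " + Run());
    }
}
```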




Implement both the Dispose Pattern & a Finalize method

In the example above we implemented both the dispose pattern and a finalize method (finalization mechanism); however, we don't always need to implement both.

If our type doesn't hold a native resource directly, but only uses an object that implements the dispose pattern, our type should also implement the dispose pattern and call that object's Dispose method from its own.

However, in such a case, we don't need to implement the Finalize method: the inner object is responsible for its own finalization.
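
The case without a Finalize method can be sketched as follows. This is a minimal example, assuming a hypothetical 'LogBuffer' class that owns a MemoryStream; the stream simply stands in for any object that implements the dispose pattern.

```csharp
using System;
using System.IO;

// Hypothetical type that holds no native resource directly;
// it only owns another disposable object.
public sealed class LogBuffer : IDisposable
{
    private readonly MemoryStream stream = new MemoryStream();
    private bool isAlreadyDisposed;

    // Exposed so the effect of Dispose() can be observed.
    public bool CanWrite
    {
        get { return stream.CanWrite; }
    }

    public void Dispose()
    {
        if (isAlreadyDisposed) return;

        // Forward the call to the object we own.
        stream.Dispose();

        isAlreadyDisposed = true;
    }

    // No Finalize method: if the inner object needs finalization,
    // it takes care of that itself.
}

public static class Program
{
    public static void Main()
    {
        var buffer = new LogBuffer();
        Console.WriteLine(buffer.CanWrite); // True

        buffer.Dispose();
        Console.WriteLine(buffer.CanWrite); // False

        buffer.Dispose(); // A second call is harmless.
    }
}
```

The class is sealed and has no Dispose(Boolean) overload, since there is no finalize method that would need the distinction; an inheritable class would usually keep the protected virtual overload.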



Best Practice: the ObjectDisposedException exception

As we saw in the example above the Dispose(Boolean disposing) method verifies whether the native resource was already disposed and in such case, returns without throwing an exception.

It's advisable to NOT throw an exception from this method, since this method could be invoked multiple times from both the public 'Dispose()' and 'Close()' methods!

However, we should throw the ObjectDisposedException exception from other methods that are trying to access the native resource, to indicate that it was already released.
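
This practice can be sketched as follows. It is a minimal example, assuming a hypothetical 'SensorHandle' class with a 'ReadValue()' method that stands in for any operation that needs the native resource; the returned value is a placeholder.

```csharp
using System;

// Hypothetical wrapper; 'ReadValue' stands in for any method that uses the resource.
public sealed class SensorHandle : IDisposable
{
    private bool isAlreadyDisposed;

    public int ReadValue()
    {
        // Methods that need the resource fail loudly once it was released.
        if (isAlreadyDisposed)
            throw new ObjectDisposedException(GetType().Name);

        return 42; // Placeholder for a real read from the resource.
    }

    public void Dispose()
    {
        // Safe to call multiple times: no exception is thrown here.
        if (isAlreadyDisposed) return;

        // (The native resource would be released here.)
        isAlreadyDisposed = true;
    }
}

public static class DisposedAccessDemo
{
    // Returns true when using the object after Dispose() throws ObjectDisposedException.
    public static bool ThrowsAfterDispose()
    {
        var handle = new SensorHandle();
        handle.ReadValue(); // Works while the object is alive.

        handle.Dispose();
        handle.Dispose();   // Repeated calls are silently ignored.

        try
        {
            handle.ReadValue();
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ThrowsAfterDispose()); // True
    }
}
```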




The Using Statement

Objects that implement the dispose pattern are best consumed with the using statement, which calls Dispose() automatically when the block is exited, even if an exception is thrown.


Example:

The following is a simple example of using the 'Using Statement' when creating a file and using the FileStream class that implemented the Dispose Pattern.



// Requires: using System.IO; and using System.Text;
public static void CreateFileExample()
{
    string fileName = "tempFile.txt";
    string data = "This is my file...!";

    byte[] dataBytes = new UTF8Encoding(true).GetBytes(data);


    using (var fileStream = new FileStream(fileName, FileMode.Create))
    {
       fileStream.Write(dataBytes, 0, dataBytes.Length);
    }
}



The using statement is actually syntactic sugar for the following code:


var fileStream = new FileStream(fileName, FileMode.Create);

try
{
   fileStream.Write(dataBytes, 0, dataBytes.Length);
}
finally
{
   if (fileStream != null)
       ((IDisposable)fileStream).Dispose();
}
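
Because the Dispose() call sits in a finally block, it also runs when the body throws. The following is a minimal sketch of that guarantee, using a hypothetical 'TrackedResource' class that only records whether it was disposed.

```csharp
using System;

// Hypothetical resource that only records whether Dispose() was called.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

public static class UsingDemo
{
    // Returns true when Dispose() was called although the using block threw.
    public static bool DisposedDespiteException()
    {
        var resource = new TrackedResource();

        try
        {
            using (resource)
            {
                throw new InvalidOperationException("Failure inside the using block.");
            }
        }
        catch (InvalidOperationException)
        {
            // The exception reaches us only after Dispose() has run.
        }

        return resource.IsDisposed;
    }

    public static void Main()
    {
        Console.WriteLine(DisposedDespiteException()); // True
    }
}
```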



If we use a decompiler to examine the MSIL code, we'll see the following:

.method public hidebysig static
    void CreateFileExample () cil managed
{
    // Method begins at RVA 0x20d4
    // Code size 62 (0x3e)
    .maxstack 4
    .locals init (
        [0] string fileName,
        [1] string data,
        [2] uint8[] dataBytes,
        [3] class [mscorlib]System.IO.FileStream fileStream
    )

    IL_0000: nop
    IL_0001: ldstr "tempFile.txt"
    IL_0006: stloc.0
    IL_0007: ldstr "This is my file...!"
    IL_000c: stloc.1
    IL_000d: ldc.i4.1
    IL_000e: newobj instance void [mscorlib]System.Text.UTF8Encoding::.ctor(bool)
    IL_0013: ldloc.1
    IL_0014: callvirt instance uint8[] [mscorlib]System.Text.Encoding::GetBytes(string)
    IL_0019: stloc.2
    IL_001a: ldloc.0
    IL_001b: ldc.i4.2
    IL_001c: newobj instance void [mscorlib]System.IO.FileStream::.ctor(string, valuetype [mscorlib]System.IO.FileMode)
    IL_0021: stloc.3
    .try
    {
        IL_0022: nop
        IL_0023: ldloc.3
        IL_0024: ldloc.2
        IL_0025: ldc.i4.0
        IL_0026: ldloc.2
        IL_0027: ldlen
        IL_0028: conv.i4
        IL_0029: callvirt instance void [mscorlib]System.IO.Stream::Write(uint8[], int32, int32)
        IL_002e: nop
        IL_002f: nop
        IL_0030: leave.s IL_003d
    } // end .try
    finally
    {
        IL_0032: ldloc.3
        IL_0033: brfalse.s IL_003c

        IL_0035: ldloc.3
        IL_0036: callvirt instance void [mscorlib]System.IDisposable::Dispose()
        IL_003b: nop

        IL_003c: endfinally
    } // end handler

    IL_003d: ret
} // end of method Program::CreateFileExample





Summary

In this article we explored the .Net Garbage Collection finalization mechanism and the dispose pattern.

As developers, I'm sure we all appreciate the GC's role in our applications; after reading this article and acquiring a deeper understanding of how the GC works, I believe we can also use it wisely, with the 'proper respect'.







The End

Hope you enjoyed!
Appreciate your comments…

Yonatan Fedaeli
