How does the garbage collector work?

2011-02-16 15:23
The Garbage Collector (GC) can be considered the heart of the .NET Framework. It manages the allocation and release of memory for any .NET application. In order to create good .NET applications, we must know how the Garbage Collector (GC) works.

Basic rules

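The Garbage Collector (GC) can’t be directly controlled by the application. The runtime decides when collections happen; at most, the application can request one, for example by calling GC.Collect().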

All garbage-collectable objects are allocated from one contiguous range of address space and are grouped by age.

There are never any gaps between objects in the managed heap.

The order of objects in memory remains the order in which they were created.

The oldest objects are at the lowest addresses (managed heap bottom), while new objects are created at increasing addresses (managed heap top).

Periodically the managed heap is compacted by removing dead objects and sliding the live objects up toward the low-address end of the heap.

Determining which objects are dead

Don’t confuse the created objects with the references that point to them (pointers)! Let’s consider the following code sample:

string s1 = "STRING";
string s2 = "STRING";

Here, "s1" and "s2" are not the created string objects! The code above creates only one string object ("STRING") due to the string interning mechanism. You can read more about interned strings in the article "How to: Optimize the memory usage with strings". "s1" and "s2" are just references to the same string object (just pointers).

The object remains on the heap until it's no longer referenced by any active code, at which point the memory it's using is reclaimed by the Garbage Collector (GC).

Even if one of the two references is set to null, the Garbage Collector (GC) still considers the "STRING" object to be alive because the other reference is pointing to it.
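
The difference between an object and the references to it can be checked directly. The sketch below is a minimal illustration; the class name InternDemo and the helper LiteralsShareOneObject are made up for this example. It relies on object.ReferenceEquals, which compares references rather than string contents.

```csharp
using System;

public static class InternDemo
{
    // Hypothetical helper: true when both variables refer to the very same object.
    public static bool LiteralsShareOneObject()
    {
        string s1 = "STRING";
        string s2 = "STRING";
        return object.ReferenceEquals(s1, s2);
    }

    public static void Main()
    {
        string s1 = "STRING";
        string s2 = "STRING";

        // Both literals are interned, so only one string object exists.
        Console.WriteLine(object.ReferenceEquals(s1, s2)); // True

        // Clearing one reference does not touch the object itself:
        // s2 still points to it, so it is still reachable.
        s1 = null;
        Console.WriteLine(s2); // STRING
    }
}
```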

The GC generations

All the living objects on the managed heap are divided into three groups by their age. Those groups are generically called "Generations". Generations are very useful for limiting memory fragmentation on the managed heap. The Garbage Collector (GC) can search for dead objects one generation at a time (partial collections) to improve collection performance.

Now let’s see what the Garbage Collector (GC) uses each generation for:

Generation 0 (Gen0) contains all the newly created objects and is located at the top of the heap zone (higher memory addresses). All the objects in Gen0 are considered short-lived, and the Garbage Collector (GC) expects them to die quickly so that the memory they use can be released. Because of this assumption, the Garbage Collector (GC) collects dead objects from Gen0 most often, since that is the cheapest collection.

Generation 1 (Gen1) contains the living objects from Gen0 that have survived a Gen0 collection. Those objects are promoted from Generation 0 to Generation 1. Gen1 sits in the middle of the heap zone and is collected less often than Gen0. Gen1 collections are more expensive than Gen0 collections, so the Garbage Collector (GC) avoids them unless they are really necessary.

Generation 2 (Gen2) contains the living objects from Gen1 that have survived a Gen1 collection. Those objects are considered long-lived, and collecting them is very expensive. Because of this, the Garbage Collector (GC) rarely tries to collect them. The Gen2 zone is located at the bottom of the managed heap zone (lowest memory addresses).

Now let’s go deeper to understand how the Garbage Collector (GC) actually collects the dead objects and how this may affect performance.
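
Before that, a quick way to observe the generations described above is to ask the runtime which generation an object currently belongs to. The sketch below uses GC.GetGeneration and GC.MaxGeneration; the class name GenerationDemo and the helper GenerationOfNewObject are made up for this example. The exact generations printed after each forced collection depend on the runtime and build settings, so treat the output as indicative only.

```csharp
using System;

public static class GenerationDemo
{
    // Hypothetical helper: reports the generation of a brand-new object.
    public static int GenerationOfNewObject()
    {
        object fresh = new object();
        return GC.GetGeneration(fresh);
    }

    public static void Main()
    {
        Console.WriteLine("Highest generation: " + GC.MaxGeneration);

        object survivor = new object();
        Console.WriteLine("After allocation: Gen" + GC.GetGeneration(survivor));

        // Forced collections are for demonstration only;
        // production code should rarely call GC.Collect().
        for (int i = 1; i <= 3; i++)
        {
            GC.Collect();
            Console.WriteLine("After collection " + i + ": Gen" + GC.GetGeneration(survivor));
        }

        // Keep the object reachable up to this point so it can survive the collections.
        GC.KeepAlive(survivor);
    }
}
```

A surviving object is expected to move toward the highest generation and then stay there.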

Collecting the Garbage

The GC is able to collect the garbage in two ways: full collections (searching the entire managed heap for dead objects) and partial collections (searching only one generation together with the younger ones). When the GC starts collecting the garbage, whether it performs a full or a partial collection, the first thing it does is stop the application's execution. So, at least from this point of view, collecting dead objects is an expensive task! For a full collection, the application can be stopped for a very long time!

The second step is to identify the roots. A root is a reference that does not come from another heap object: static (global) fields, local variables and parameters on the thread stacks, CPU registers, and GC handles are typical roots. Starting with these roots, the GC follows each reference they contain, recursively inspecting all the child objects. In this way the GC finds every reachable (live) object. The other objects, the unreachable ones, are now condemned to be collected.
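
The traversal described above can be sketched with a toy object graph. This is only a model of the idea, not the CLR's implementation: the Node class, the MarkPhaseSketch class and its FindLive method are all hypothetical, and an explicit stack is used in place of recursion.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a heap object: a name plus its outgoing references.
public sealed class Node
{
    public readonly string Name;
    public readonly List<Node> References = new List<Node>();
    public Node(string name) { Name = name; }
}

public static class MarkPhaseSketch
{
    // Returns every node reachable from the given roots.
    public static HashSet<Node> FindLive(IEnumerable<Node> roots)
    {
        var live = new HashSet<Node>();
        var pending = new Stack<Node>(roots);
        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            if (!live.Add(current)) continue; // already visited, which also handles cycles
            foreach (Node child in current.References) pending.Push(child);
        }
        return live;
    }

    public static void Main()
    {
        var a = new Node("A");
        var b = new Node("B");
        var c = new Node("C");
        var d = new Node("D");
        a.References.Add(b);
        b.References.Add(a); // cycle between A and B
        c.References.Add(d); // C and D are referenced by no root
        d.References.Add(a); // pointing at a live object does not make D live

        HashSet<Node> live = FindLive(new[] { a });
        foreach (Node n in new[] { a, b, c, d })
            Console.WriteLine(n.Name + ": " + (live.Contains(n) ? "live" : "condemned"));
    }
}
```

With A as the only root, A and B are reported as live while C and D are condemned, even though D holds a reference to A.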

If a partial collection is performed, the GC iterates only through objects of the same age or younger. For example, a Gen1 root may have child objects from Gen1 or Gen0. Considering this, inspecting Gen2 roots is equivalent to performing a full collection, which is very expensive, because Gen2 objects may have references to children from Gen1 and Gen0.
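
The difference between partial and full collections is visible through the GC class. The sketch below is a minimal illustration with made-up names (CollectionCountDemo, Gen0CollectionsCausedBy): GC.Collect(0) requests a Gen0-only collection, GC.Collect(GC.MaxGeneration) requests a full one, and GC.CollectionCount reports how often a generation has been collected so far. The numbers vary from run to run.

```csharp
using System;

public static class CollectionCountDemo
{
    // Hypothetical helper: forces a collection of generations 0..generation and
    // returns how many Gen0 collections were counted while that call ran.
    public static int Gen0CollectionsCausedBy(int generation)
    {
        int before = GC.CollectionCount(0);
        GC.Collect(generation);
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        Console.WriteLine("Gen0 collections during GC.Collect(0): " + Gen0CollectionsCausedBy(0));

        GC.Collect(GC.MaxGeneration); // full collection, the most expensive kind

        for (int gen = 0; gen <= GC.MaxGeneration; gen++)
            Console.WriteLine("Gen" + gen + " collected " + GC.CollectionCount(gen) + " time(s) so far");
    }
}
```

In a long-running application the Gen0 count is normally much higher than the Gen2 count, which matches the cost ordering described above.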

All the live objects that have been found are promoted to the next generation, unless they are already in Gen2. Promoting an object can involve moving its data to a different memory location in the managed heap when the heap is compacted. In the CLR, an object that survives a collection of its current generation is normally promoted right away, so an object needs to survive two collections to reach Gen2.

All the condemned objects are checked for a finalizer. A finalizer is an optional special class method that can be called only by the framework, in order to release any unmanaged resources that the object may use. In C# you use the ~Class syntax to specify the finalizer (the destructor).

The objects without a finalizer are immediately killed and their memory is released. For the others, things are a little bit more complicated.

How finalization affects performance

When the garbage collector first encounters an object that is otherwise dead but still needs to be finalized it must abandon its attempt to reclaim the space for that object at that time. The object is instead added to a list of objects needing finalization and, furthermore, the collector must then ensure that all of the pointers within the object remain valid until finalization is complete.

This means that no child object referenced by an object with a finalizer can be killed until the finalizer has been executed. This is especially bad if the finalizable object holds references to a lot of temporary objects (Gen0 objects). Normally, killing Gen0 objects is cheap and the memory is released immediately, but in this case all the temporary objects must live until the parent object is finalized. A lot of memory is locked and can’t be released!

Once the collection is complete, the finalization thread goes through the list of objects needing finalization and invokes the finalizers. When this is done, the objects become dead once again and are collected in the normal way on the next collection.
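
The extra lifetime of a finalizable object can be observed with a weak reference that tracks resurrection. The sketch below is only an illustration under assumptions: the Tracked and FinalizationDemo classes are made up, and the result depends on the runtime and on build settings (a debug build may keep objects alive longer), so no particular output is guaranteed.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public sealed class Tracked
{
    public static int FinalizedCount; // incremented on the finalizer thread

    ~Tracked()
    {
        // Keep the finalizer tiny: it runs on the dedicated finalization thread.
        Interlocked.Increment(ref FinalizedCount);
    }
}

public static class FinalizationDemo
{
    // Allocate in a separate, non-inlined method so that no local variable
    // of the caller keeps the object alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference Allocate()
    {
        // true = keep tracking the object after it has been queued for finalization
        return new WeakReference(new Tracked(), true);
    }

    public static void Main()
    {
        WeakReference weak = Allocate();

        GC.Collect();                  // 1st collection: the object is dead but must be finalized first
        GC.WaitForPendingFinalizers(); // let the finalization thread run ~Tracked()
        Console.WriteLine("Finalizers run: " + Tracked.FinalizedCount);
        Console.WriteLine("Still in memory after 1st collection: " + weak.IsAlive);

        GC.Collect();                  // 2nd collection: now the memory can be reclaimed
        Console.WriteLine("Still in memory after 2nd collection: " + weak.IsAlive);
    }
}
```

On a typical run the object is still in memory after the first collection and gone only after the second, which is the cost described above.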

To conclude, about destructors:

Finalizable objects live a lot longer than regular ones.

Things get worse if the finalizable object is a Gen1 or even a Gen2 object.

The finalizer method should do as little work as possible; otherwise the finalization thread will take longer to execute and this will affect the application’s performance.

So, think twice before adding a destructor to your classes!
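
When a class really does own an unmanaged resource, one common way to limit the finalization cost is to release the resource deterministically through IDisposable and then call GC.SuppressFinalize, so that the finalizer stays only as a safety net. This pattern is not covered in the text above; the sketch below is a minimal illustration, and the NativeBuffer and DisposeDemo names are made up for this example.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class NativeBuffer : IDisposable
{
    private IntPtr _memory; // unmanaged memory owned by this object

    public NativeBuffer(int bytes)
    {
        _memory = Marshal.AllocHGlobal(bytes);
    }

    public bool IsDisposed
    {
        get { return _memory == IntPtr.Zero; }
    }

    public void Dispose()
    {
        Release();
        // The resource is already released, so the GC can skip finalization
        // and reclaim this object like any regular one.
        GC.SuppressFinalize(this);
    }

    ~NativeBuffer()
    {
        // Safety net only: does the minimum and touches no other managed objects.
        Release();
    }

    private void Release()
    {
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
    }
}

public static class DisposeDemo
{
    public static void Main()
    {
        using (var buffer = new NativeBuffer(1024))
        {
            Console.WriteLine("Disposed inside using: " + buffer.IsDisposed); // False
        }
        // Leaving the using block calls Dispose(), so no finalization is needed later.
    }
}
```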