Mark-compact algorithm
Encyclopedia
In computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, a mark-compact algorithm is a type of garbage collection
Garbage collection (computer science)
In computer science, garbage collection is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program...

 algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...

 used to reclaim unreachable memory
Unreachable memory
In computer science, unreachable memory is a block of memory allocated dynamically where the program that allocated the memory no longer has any reachable pointer that refers to it. Similarly, an unreachable object is a dynamically allocated object that has no reachable reference to it...

. Mark-compact algorithms can be regarded as a combination of the mark-sweep algorithm and Cheney's copying algorithm
Cheney's algorithm
Cheney's algorithm, first described in a 1970 ACM paper by C.J. Cheney, is a method of garbage collection in computer software systems. In this scheme, the heap is divided into two equal halves, only one of which is in use at any one time...

. First, reachable objects are marked, then a compacting step relocates the reachable (marked) objects towards the beginning of the heap area. Compacting garbage collection is used by Microsoft
Microsoft
Microsoft Corporation is an American public multinational corporation headquartered in Redmond, Washington, USA that develops, manufactures, licenses, and supports a wide range of products and services predominantly related to computing through its various product divisions...

's Common Language Runtime
Common Language Runtime
The Common Language Runtime is the virtual machine component of Microsoft's .NET framework and is responsible for managing the execution of .NET programs. In a process known as just-in-time compilation, the CLR compiles the intermediate language code known as CIL into the machine instructions...

 and by the Glasgow Haskell Compiler
Glasgow Haskell Compiler
The Glorious Glasgow Haskell Compilation System, more commonly known as the Glasgow Haskell Compiler or GHC, is an open source native code compiler for the functional programming language Haskell. The lead developers are Simon Peyton Jones and Simon Marlow...

.

Algorithms

After marking the live objects in the heap in the same fashion as the mark-sweep algorithm, the heap will often be fragmented
Fragmentation (computer)
In computer storage, fragmentation is a phenomenon in which storage space is used inefficiently, reducing storage capacity and in most cases reducing the performance. The term is also used to denote the wasted space itself....

. The goal of mark-compact algorithms is to shift the live objects in memory together so the fragmentation is eliminated. The challenge is to correctly update all pointers to the moved objects, most of which will have new memory addresses after the compaction. The issue of handling pointer updates is handled in different ways.

Table-based compaction

A table-based algorithm was first described by Haddon and Waite in 1967. It preserves the relative placement of the live objects in the heap, and requires only a constant amount of overhead.

Compaction proceeds from the bottom of the heap (low addresses) to the top (high addresses). As live (that is, marked) objects are encountered, they are moved to the first available low address, and a record is appended to a break table of relocation information. For each live object, a record in the break table consists of the object's original address before the compaction and the difference between the original address and the new address after compaction. The break table is stored in the heap that is being compacted, but in an area that are marked as unused. To ensure that compaction will always succeed, the minimum object size in the heap must larger than or the same size as a break table record.

As compaction progresses, relocated objects are copied towards the bottom of the heap. Eventually an object will need to be copied to the space occupied by the break table, which now must be relocated elsewhere. These movements of the break table, (called rolling the table by the authors) cause the relocation records to become disordered, requiring the break table to be sorted
Comparison sort
A comparison sort is a type of sorting algorithm that only reads the list elements through a single abstract comparison operation that determines which of two elements should occur first in the final sorted list...

 after the compaction is complete. The cost of sorting the break table is O
Big O notation
In mathematics, big O notation is used to describe the limiting behavior of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions. It is a member of a larger family of notations that is called Landau notation, Bachmann-Landau notation, or...

(n log n), where n is the number of live objects that were found in the mark stage of the algorithm.

Finally, the break table relocation records are used to adjust pointer fields inside the relocated objects. The live objects are examined for pointers, which can be looked up in the sorted break table of size n in O(log n) time if the break table is sorted, for a total running time of O(n log n). Pointers are then adjusted by the amount specified in the relocation table.

LISP2 Algorithm

In order to avoid O(n log n) complexity, the LISP2 algorithm uses 3 different passes over the heap. In addition, heap objects must have a separate forwarding pointer slot that is not used outside of garbage collection.

After standard marking, the algorithm proceeds in the following 3 passes:
  1. Compute the forwarding location for live objects.
    • Keep track of a free and live pointer and initialize both to the start of heap.
    • If the live pointer points to a live object, update that object's forwarding pointer to the current free pointer and increment the free pointer according to the object's size.
    • Move the live pointer to the next object
    • End when the live pointer reaches the end of heap.
  2. Update all pointers
    • For each live object, updates its pointers according to the forwarding pointers of the objects they point to.
  3. Move objects
    • For each live object, move its data to its forwarding location.


This algorithm is O(n) on the size of the heap; it has a better complexity than the table-based approach, but the table-based approach's n is the size of the used space only, not the entire heap space as in the LISP2 algorithm. However, the LISP2 algorithm is simpler to implement.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK