Editing Tracing garbage collection (section)

== Implementation strategies ==

=== Moving vs. non-moving ===
Once the unreachable set has been determined, the garbage collector may simply release the [[unreachable object]]s and leave everything else as it is, or it may copy some or all of the reachable objects into a new area of memory, updating all references to those objects as needed. These are called "non-moving" and "moving" (or, alternatively, "non-compacting" and "compacting") garbage collectors, respectively.

At first, a moving algorithm may seem inefficient compared to a non-moving one, since much more work would appear to be required on each cycle. But the moving algorithm leads to several performance advantages, both during the garbage collection cycle itself and during program execution:
* No additional work is required to reclaim the space freed by dead objects; the entire region of memory from which reachable objects were moved can be considered free space. In contrast, a non-moving GC must visit each unreachable object and record that the memory it occupied is available.
* Similarly, new objects can be allocated very quickly. Since large contiguous regions of memory are usually made available by a moving GC, new objects can be allocated by simply incrementing a 'free memory' pointer. A non-moving strategy may, after some time, lead to a heavily [[fragmentation (computer)|fragmented]] heap, requiring expensive consultation of "free lists" of small available blocks of memory in order to allocate new objects.
* If an appropriate traversal order is used (such as cdr-first for list [[cons]]es), objects can be moved very close to the objects they refer to in memory, increasing the chance that they will be located in the same [[cache line]] or [[virtual memory]] page. This can significantly speed up access to these objects through these references.

One disadvantage of a moving garbage collector is that it only allows access through references that are managed by the garbage collected environment, and does not allow [[pointer arithmetic]]. This is because any pointers to objects will be invalidated if the garbage collector moves those objects (they become [[dangling pointer]]s). For [[interoperability]] with native code, the garbage collector must copy the object contents to a location outside of the garbage collected region of memory. An alternative approach is to ''pin'' the object in memory, preventing the garbage collector from moving it and allowing the memory to be directly shared with native pointers (and possibly allowing pointer arithmetic).<ref>{{cite web |url=https://docs.microsoft.com/en-us/dotnet/framework/interop/copying-and-pinning |title=Copying and Pinning |work=[[Microsoft Docs]] |access-date=2022-04-25}}</ref>

=== Copying vs. mark-and-sweep vs. mark-and-don't-sweep ===
Not only do collectors differ in whether they are moving or non-moving, they can also be categorized by how they treat the white, grey and black object sets during a collection cycle.

The most straightforward approach is the ''semi-space collector'', which dates to 1969. In this moving collector, memory is partitioned into an equally sized "from space" and "to space". Initially, objects are allocated in "to space" until it becomes full and a collection cycle is triggered. At the start of the cycle, the "to space" becomes the "from space", and vice versa. The objects reachable from the root set are copied from the "from space" to the "to space". These objects are scanned in turn, and all objects that they point to are copied into "to space", until all reachable objects have been copied into "to space". Once the program continues execution, new objects are once again allocated in the "to space" until it is once again full and the process is repeated. 

This approach is very simple, but since only one semi space is used for allocating objects, the memory usage is twice as high compared to other algorithms. The technique is also known as ''stop-and-copy''. [[Cheney's algorithm]] is an improvement on the semi-space collector.

A mark and sweep garbage collector keeps a bit or two with each object to record if it is white or black. The grey set is kept as a separate list or using another bit. As the reference tree is traversed during a collection cycle (the "mark" phase), these bits are manipulated by the collector. A final "sweep" of the memory areas then frees white objects. The mark and sweep strategy has the advantage that, once the condemned set is determined, either a moving or non-moving collection strategy can be pursued. This choice of strategy can be made at runtime, as available memory permits. It has the disadvantage of "bloating" objects by a small amount, as in, every object has a small hidden memory cost because of the list/extra bit. This can be somewhat mitigated if the collector also handles allocation, since then it could potentially use unused bits in the allocation data structures. Or, this "hidden memory" can be eliminated by using a [[Tagged pointer]], trading the memory cost for CPU time. However, the mark and sweep is the only strategy that readily cooperates with external allocators in the first place.

A ''mark and don't sweep'' garbage collector, like the mark-and-sweep, keeps a bit with each object to record if it is white or black; the grey set is kept as a separate list or using another bit. There are two key differences here. First, black and white mean different things than they do in the mark and sweep collector. In a "mark and don't sweep" collector, all reachable objects are always black. An object is marked black at the time it is allocated, and it will stay black even if it becomes unreachable. A white object is unused memory and may be allocated. Second, the interpretation of the black/white bit can change. Initially, the black/white bit may have the sense of (0=white, 1=black). If an allocation operation ever fails to find any available (white) memory, that means all objects are marked used (black). The sense of the black/white bit is then inverted (for example, 0=black, 1=white). Everything becomes white. This momentarily breaks the invariant that reachable objects are black, but a full marking phase follows immediately, to mark them black again. Once this is done, all unreachable memory is white. No "sweep" phase is necessary. 

The mark and don't sweep strategy requires cooperation between the allocator and collector, but is incredibly space efficient since it only requires one bit per allocated pointer (which most allocation algorithms require anyway). However, this upside is somewhat mitigated, since most of the time large portions of memory are wrongfully marked black (used), making it hard to give resources back to the system (for use by other allocators, threads, or processes) in times of low memory usage. 

The mark and don't sweep strategy can therefore be seen as a compromise between the upsides and downsides of the mark and sweep and the stop and copy strategies.

=== Generational GC (ephemeral GC) ===
It has been empirically observed that in many programs, the most recently created objects are also those most likely to become unreachable quickly (known as ''infant mortality'' or the ''generational hypothesis''). <!--
the following sentence is embarrassing without a reference, someone please find one

Experimental evidence shows that around 90% of objects die before their first garbage collection. --> A generational GC (also known as ephemeral GC) divides objects into generations and, on most cycles, will place only the objects of a subset of generations into the initial white (condemned) set. Furthermore, the runtime system maintains knowledge of when references cross generations by observing the creation and overwriting of references. When the garbage collector runs, it may be able to use this knowledge to prove that some objects in the initial white set are unreachable without having to traverse the entire reference tree. If the generational hypothesis holds, this results in much faster collection cycles while still reclaiming most unreachable objects.

In order to implement this concept, many generational garbage collectors use separate memory regions for different ages of objects. When a region becomes full, the objects in it are traced, using the references from the older generation(s) as roots.  This usually results in most objects in the generation being collected (by the hypothesis), leaving it to be used to allocate new objects.  When a collection doesn't collect many objects (the hypothesis doesn't hold, for example because the program has computed a large collection of new objects it does want to retain), some or all of the surviving objects that are referenced from older memory regions are promoted to the next highest region, and the entire region can then be overwritten with fresh objects. This technique permits very fast incremental garbage collection, since the garbage collection of only one region at a time is all that is typically required.

[[David Ungar|Ungar]]'s classic generation scavenger has two generations. It divides the youngest generation, called "new space", into a large "eden" in which new objects are created and two smaller "survivor spaces", past survivor space and future survivor space. The objects in the older generation that may reference objects in new space are kept in a "remembered set". On each scavenge, the objects in new space are traced from the roots in the remembered set and copied to future survivor space. If future survivor space fills up, the objects that do not fit are promoted to old space, a process called "tenuring". At the end of the scavenge, some objects reside in future survivor space, and eden and past survivor space are empty. Then future survivor space and past survivor space are exchanged and the program continues, allocating objects in eden. In Ungar's original system, eden is 5 times larger than each survivor space.

Generational garbage collection is a [[heuristic (computer science)|heuristic]] approach, and some unreachable objects may not be reclaimed on each cycle. It may therefore occasionally be necessary to perform a full mark and sweep or copying garbage collection to reclaim all available space. In fact, runtime systems for modern programming languages (such as [[Java (programming language)|Java]] and the [[.NET Framework]]) usually use some hybrid of the various strategies that have been described thus far; for example, most collection cycles might look only at a few generations, while occasionally a mark-and-sweep is performed, and even more rarely a full copying is performed to combat fragmentation. The terms "minor cycle" and "major cycle" are sometimes used to describe these different levels of collector aggression.

=== Stop-the-world vs. incremental vs. concurrent ===

Simple ''stop-the-world'' garbage collectors completely halt execution of the program to run a collection cycle, thus guaranteeing that new objects are not allocated and objects do not suddenly become unreachable while the collector is running.

This has the disadvantage that the program can perform no useful work while a collection cycle is running (sometimes called the "embarrassing pause"<ref>{{cite journal |last1=Steele |first1=Guy L. |author-link=Guy L. Steele Jr. |title=Multiprocessing Compactifying Garbage Collection |journal=Communications of the ACM |date=September 1975 |volume=18 |issue=9 |pages=495–508 |doi=10.1145/361002.361005 |doi-access=free |s2cid=29412244}}</ref>). Stop-the-world garbage collection is therefore mainly suitable for non-interactive programs. Its advantage is that it is both simpler to implement and faster than incremental garbage collection.

''Incremental'' and ''concurrent'' garbage collectors are designed to reduce this disruption by interleaving their work with activity from the main program. ''Incremental'' garbage collectors perform the garbage collection cycle in discrete phases, with program execution permitted between each phase (and sometimes during some phases). ''Concurrent'' garbage collectors do not stop program execution at all, except perhaps briefly when the program's execution stack is scanned. However, the sum of the incremental phases takes longer to complete than one batch garbage collection pass, so these garbage collectors may yield lower total throughput.

Careful design is necessary with these techniques to ensure that the main program does not interfere with the garbage collector and vice versa; for example, when the program needs to allocate a new object, the runtime system may either need to suspend it until the collection cycle is complete, or somehow notify the garbage collector that there exists a new, reachable object.

=== Precise vs. conservative and internal pointers ===
Some collectors can correctly identify all pointers (references) in an object; these are called ''precise'' (also ''exact'' or ''accurate'') collectors, the opposite being a ''conservative'' or ''partly conservative'' collector. Conservative collectors assume that any bit pattern in memory could be a pointer if, interpreted as a pointer, it would point into an allocated object. Conservative collectors may produce false positives, where unused memory is not released because of improper pointer identification. This is not always a problem in practice unless the program handles a lot of data that could easily be misidentified as a pointer. False positives are generally less problematic on [[64-bit computing|64-bit]] systems than on [[32-bit]] systems because the range of valid memory addresses tends to be a tiny fraction of the range of 64-bit values. Thus, an arbitrary 64-bit pattern is unlikely to mimic a valid pointer. A false negative can also happen if pointers are "hidden", for example using an [[XOR linked list]]. Whether a precise collector is practical usually depends on the type safety properties of the programming language in question. An example for which a conservative garbage collector would be needed is the [[C (programming language)|C language]], which allows typed (non-void) pointers to be type cast into untyped (void) pointers, and vice versa.

A related issue concerns ''internal pointers'', or pointers to fields within an object. If the semantics of a language allow internal pointers, then there may be many different addresses that can refer to parts of the same object, which complicates determining whether an object is garbage or not. An example for this is the [[C++]] language, in which multiple inheritance can cause pointers to base objects to have different addresses. In a tightly optimized program, the corresponding pointer to the object itself may have been overwritten in its register, so such internal pointers need to be scanned.