This page describes our research on automatic management of computer storage via garbage collection. For an introduction and more background on this technology, see
We have collected about a dozen garbage collection benchmarks written in Scheme.
Generational garbage collection usually outperforms non-generational garbage collection. Why?
One popular explanation is that "most objects die young", but this explanation is incorrect. In a radioactive decay model of object lifetimes, each live object has a 50% probability of dying during the next interval whose duration is the half-life that parameterizes the model. If this half-life is short, then most objects die young, but a conventional generational collector will actually perform worse than a non-generational collector.
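The radioactive decay model can be sketched with a few lines of simulation. This is an illustrative Python sketch of the model only, not code from any of the collectors described here; the constant names are ours.

```python
import random

# Radioactive decay model of object lifetimes: every live object,
# regardless of its age, dies with probability 1/2 during each interval
# of one half-life.  Lifetimes are therefore geometrically distributed.
random.seed(0)
N = 100_000          # number of simulated objects

lifetimes = []
for _ in range(N):
    age = 0
    # Each half-life interval, the object survives with probability 1/2.
    while random.random() >= 0.5:
        age += 1
    lifetimes.append(age)

# With a short half-life, "most objects die young": about half die
# within one half-life, three quarters within two, and so on.
died_in_first_interval = sum(1 for t in lifetimes if t == 0) / N
print(f"fraction dying within one half-life: {died_in_first_interval:.3f}")
```

The point of the model is that although most objects do die young, an object's age tells a collector nothing about its remaining lifetime, which is what undermines the "most objects die young" explanation.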
A more sophisticated explanation for the effectiveness of generational garbage collectors is that they try to predict which objects are likely to die soon, and how well they work depends upon the accuracy of these heuristic predictions. This explanation is also incorrect. The radioactive decay model makes it impossible to predict which objects will die soon. No heuristic predictor can do better or worse than chance, so if this explanation were correct we would expect generational collectors to perform the same as non-generational collectors for this model. In fact, conventional generational collectors perform worse.
How do they manage to do that? By collecting the wrong generations. Conventional generational collectors are younger-first in the sense that they collect young objects more often than old. These young objects haven't had much time to die, so collecting them doesn't recover much storage.
An older-first generational collector, which collects old objects more often than young, would recover more storage for a similar amount of effort.
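The intuition behind older-first collection under the decay model can be checked with a small simulation (again our own sketch, not Larceny code): a region left undisturbed for many half-lives contains far more garbage than a region filled only one half-life ago, even though individual objects die at the same rate regardless of age.

```python
import random

# Under the radioactive decay model, each object dies with probability
# 1/2 per half-life, independent of age.  Garbage therefore accumulates
# monotonically in any region the collector leaves undisturbed.
random.seed(1)
REGION = 1000            # objects per region
P_SURVIVE = 0.5          # per-half-life survival probability

def live_fraction(half_lives):
    """Fraction of a region's objects still live after the given time."""
    live = [True] * REGION
    for _ in range(half_lives):
        live = [a and random.random() < P_SURVIVE for a in live]
    return sum(live) / REGION

old_garbage = 1 - live_fraction(10)   # region left undisturbed for 10 half-lives
young_garbage = 1 - live_fraction(1)  # region filled one half-life ago

print(f"old region garbage:   {old_garbage:.3f}")
print(f"young region garbage: {young_garbage:.3f}")
```

Collecting the old region recovers nearly all of its space, while collecting the young region recovers only about half, which is why an older-first policy can win under this model.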
Will Clinger and Lars Hansen invented an older-first algorithm for generational garbage collection, which they now refer to as the renewal-older-first algorithm. (Its original name was "non-predictive".)
Although they invented older-first collection for the theoretical purpose of showing that a generational collector can outperform non-generational collection even for the radioactive decay model, they reasoned that older-first collection might work well for the long-lived objects of real programs. They designed a 3-generational hybrid collector that combines the nursery of a conventional younger-first collector with a renewal-older-first collector for objects that are promoted out of the nursery.
Hansen implemented this modified renewal-older-first collector in Larceny and used gc-intensive benchmarks to compare its performance against Larceny's conventional younger-first generational collector. On most benchmarks, the renewal-older-first collector performed better than Larceny's 3-generational younger-first collector, and was competitive with Larceny's 2-generational younger-first collector.
Why did two younger-first generations work better than three on most of our benchmarks? We have identified several reasons, but the most important is the benchmarks themselves. Most Scheme programs spend very little time in the garbage collector, so gc-intensive programs are atypical. In particular, they often have an unusually high fraction of fairly long-lived but not permanent objects. Larceny's 2-generational collector promotes these objects by copying them just once, whereas the 3-generational younger-first collector must copy them at least twice before they enter the oldest dynamic generation. The renewal-older-first collector promotes these objects into the intermediate generation by copying them just once, and then promotes them into the older generation without copying them; there they often die without being copied a second time.
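The copy counts described above can be tallied in a back-of-the-envelope sketch. The policy names and the table itself are our own illustration of the prose, not Larceny code, and the counts are minimums for an object that survives long enough to reach the oldest dynamic generation.

```python
# Minimum number of times a fairly long-lived object is copied before it
# reaches the oldest dynamic generation under each policy discussed above.
COPIES_TO_OLDEST = {
    "2-gen younger-first": 1,  # one copy out of the nursery
    "3-gen younger-first": 2,  # nursery -> intermediate, then -> oldest
    "renewal-older-first": 1,  # one copy into the intermediate generation;
                               # promotion to the older generation moves no data
}

for policy, copies in sorted(COPIES_TO_OLDEST.items()):
    print(f"{policy}: at least {copies} copy/copies")
```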
Compared to Larceny's 3-generational younger-first collector, the renewal-older-first collector generally copied fewer words of storage and required less gc time, mutator time, and overall time. The renewal-older-first collector performed better on all but one benchmark.
Compared to Larceny's 2-generational younger-first collector, the renewal-older-first collector generally copied fewer words of storage, but often required slightly more gc time because the renewal-older-first collector is more complex, with a higher cost per word copied. The extra gc time was generally offset by lower mutator time, however, so the overall time was usually comparable. In terms of the average overall time over all runs, the renewal-older-first collector performed about 15% worse than the 2-generational younger-first collector on the gcbench benchmark, but performed at least 10% better on three benchmarks.
The renewal-older-first collector tended to use slightly more space than the 2-generational younger-first collector, with a very high correlation between relative space and relative gc time. In other words, the renewal-older-first collector performed better in terms of space on the same benchmarks for which it performed better in terms of time, and performed worse in terms of space on the benchmarks for which it performed worse in terms of time. The natural space/time tradeoff would create an anti-correlation, so the effects of that tradeoff were small compared to the algorithmic effects.
Larceny's renewal-older-first collector does not perform any full collections unless it degrades into a 2-generational younger-first collector. This improves its storage locality, which probably accounts for the improvements in mutator time.
Hansen also implemented a modified form of the deferred-older-first algorithm invented by Darko Stefanovic, and reported on its performance in his thesis.
Building upon experience with the renewal-older-first collector described above and with the garbage-first collector developed by Sun Microsystems, Will Clinger and Felix S Klock II designed and implemented a regional garbage collector that usually performs like a conventional youngest-first generational collector but has much better worst-case performance.
The regional collector never performs a full collection. Every garbage collection is guaranteed to take less time than a fixed maximum pause time, which is independent of the mutator and independent of the volume of reachable data (up to the limit imposed by physical RAM). Garbage collections are guaranteed to be spaced far enough apart so the minimum mutator utilization remains above a fixed positive constant, which is also independent of the mutator and volume of live data. The total storage used by the regional collector is guaranteed to remain below a fixed constant times the volume of reachable data.
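Minimum mutator utilization (MMU) can be computed from a trace of collection pauses: for a given window width, it is the smallest fraction of any window during which the mutator, rather than the collector, runs. The helper below is our own toy computation over a hypothetical pause trace, not Larceny instrumentation.

```python
# Compute minimum mutator utilization over sliding windows of a fixed
# width, given a list of gc pauses as (start, duration) pairs.
def mmu(pauses, window, horizon):
    """Scan window positions at a fine granularity; return the minimum
    fraction of any window left to the mutator."""
    worst = 1.0
    step = window / 100
    t = 0.0
    while t + window <= horizon:
        gc_time = 0.0
        for start, dur in pauses:
            overlap = min(start + dur, t + window) - max(start, t)
            if overlap > 0:
                gc_time += overlap
        worst = min(worst, 1 - gc_time / window)
        t += step
    return worst

# Hypothetical trace: a 5 ms pause every 50 ms.  The worst 25 ms window
# contains one full pause, so the MMU for that window width is 0.8.
pauses = [(50 * i, 5) for i in range(1, 10)]
print(mmu(pauses, window=25, horizon=500))
```

Guaranteeing that this quantity stays above a fixed positive constant, for a mutator-independent window width, is the utilization bound the regional collector provides.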
Those three properties of regional collection are the best possible asymptotic bounds a garbage collector can have.
A prototype of the regional collector has been implemented for Larceny. In the current version of Larceny, the worst-case performance of regional collection is compromised by large objects. To achieve the regional collector's performance guarantees, object sizes must be bounded. Larceny's representations have not yet been redesigned to split large objects (such as long arrays) into a collection of smaller pieces.
Generational garbage collection works well when the collector can identify regions of heap storage that contain a higher proportion of garbage than for the heap as a whole. Most generational collectors do this heuristically.
Younger-first collectors assume some form of the so-called generational hypothesis, which asserts that young objects are more likely to die soon than older objects. This hypothesis is true for the objects of most but not all programs.
Older-first collectors rely on the fact that the amount of garbage within a region of allocated storage cannot decrease until the collector disturbs it. They assume that the longer a region is left undisturbed, the more profitable it becomes to collect it.
The regional collector is younger-first for very young objects, but older-first for objects that live long enough to be promoted out of the nursery.
Many variations and combinations of younger-first and older-first collectors are possible. We continue to explore them.
William D Clinger and Lars T Hansen. Generational garbage collection and the radioactive decay model. In the Proceedings of the 1997 ACM Conference on Programming Language Design and Implementation, June 1997, pages 97-108.
Lars T Hansen. Older-first garbage collection in practice. PhD thesis, Northeastern University, November 2000.
David Detlefs, Ross Knippel, William D Clinger, and Matthias Jacob. Concurrent Remembered Set Refinement in Generational Garbage Collection. (Also available as HTML.) In the Proceedings of the 2002 USENIX Java VM Research and Technology Symposium, August 2002.
Lars T Hansen and William D Clinger. An experimental study of renewal-older-first garbage collection. In the Proceedings of the 2002 ACM International Conference on Functional Programming, October 2002. Some larger benchmark data are now available:
William D Clinger and Fabio V Rojas. Linear combinations of radioactive decay models for generational garbage collection. In Science of Computer Programming, Volume 62, Issue 2, 1 October 2006, pages 184-203.
William D Clinger and Felix S Klock II. Scalable garbage collection with guaranteed MMU. In the Proceedings of the 2009 Workshop on Scheme and Functional Programming, Northeastern University, 22 August 2009, pages 14-25.
Felix S Klock II. Scalable Garbage Collection via Remembered Set Summarization and Refinement. PhD dissertation, Northeastern University, January 2011.
Felix S Klock II and William D Clinger. Bounded-latency regional garbage collection. Proceedings of the 2011 Dynamic Languages Symposium (DLS 2011), 24 October 2011, pages 73-83. (Preprint and slides from the presentation.)
During tests of a new garbage collector, we look to see whether any garbage bits remain in the hardware:
From left to right: Arthur Nunes, Mitch Wand, Will Clinger, and Igor Siveroni. (This was Lars Hansen's machine, but he isn't in the photograph.)
Content last updated 13 March 2015.