Cover
Half Title
Title Page
Copyright Page
Dedication
Table of Contents
List of Algorithms
List of Figures
List of Tables
Preface
Acknowledgements
Authors
1 Introduction
1.1 Explicit deallocation
1.2 Automatic dynamic memory management
1.3 Comparing garbage collection algorithms
Safety
Throughput
Completeness and promptness
Pause time
Space overhead
Optimisations for specific languages
Scalability and portability
1.4 A performance disadvantage?
1.5 Experimental methodology
1.6 Terminology and notation
The heap
The mutator and the collector
The mutator roots
References, fields and addresses
Liveness, correctness and reachability
Pseudo-code
The allocator
Mutator read and write operations
Atomic operations
Sets, multisets, sequences and tuples
2 Mark-sweep garbage collection
2.1 The mark-sweep algorithm
2.2 The tricolour abstraction
2.3 Improving mark-sweep
2.4 Bitmap marking
2.5 Lazy sweeping
2.6 Cache misses in the marking loop
2.7 Issues to consider
Mutator overhead
Throughput
Space usage
To move or not to move?
3 Mark-compact garbage collection
3.1 Two-finger compaction
3.2 The Lisp 2 algorithm
3.3 Threaded compaction
3.4 One-pass algorithms
3.5 Issues to consider
Is compaction necessary?
Throughput costs of compaction
Long-lived data
Locality
Limitations of mark-compact algorithms
4 Copying garbage collection
4.1 Semispace copying collection
Work list implementations
An example
4.2 Traversal order and locality
4.3 Issues to consider
Allocation
Space and locality
Moving objects
5 Reference counting
5.1 Advantages and disadvantages of reference counting
5.2 Improving efficiency
5.3 Deferred reference counting
5.4 Coalesced reference counting
5.5 Cyclic reference counting
5.6 Limited-field reference counting
5.7 Issues to consider
The environment
Advanced solutions
6 Comparing garbage collectors
6.1 Throughput
6.2 Pause time
6.3 Space
6.4 Implementation
6.5 Adaptive systems
6.6 A unified theory of garbage collection
Abstract garbage collection
Tracing garbage collection
Reference counting garbage collection
7 Allocation
7.1 Sequential allocation
7.2 Free-list allocation
First-fit allocation
Next-fit allocation
Best-fit allocation
Speeding free-list allocation
7.3 Fragmentation
7.4 Segregated-fits allocation
Fragmentation
Populating size classes
7.5 Combining segregated-fits with first-, best- and next-fit
7.6 Additional considerations
Alignment
Size constraints
Boundary tags
Heap parsability
Locality
Wilderness preservation
Crossing maps
7.7 Allocation in concurrent systems
7.8 Issues to consider
8 Partitioning the heap
8.1 Terminology
8.2 Why to partition
Partitioning by mobility
Partitioning by size
Partitioning for space
Partitioning by kind
Partitioning for yield
Partitioning for responsiveness
Partitioning for locality
Partitioning by thread
Partitioning by availability
Partitioning by mutability
8.3 How to partition
8.4 When to partition
9 Generational garbage collection
9.1 Example
9.2 Measuring time
9.3 Generational hypotheses
9.4 Generations and heap layout
9.5 Multiple generations
9.6 Age recording
En masse promotion
Aging semispaces
Survivor spaces and flexibility
9.7 Adapting to program behaviour
Appel-style garbage collection
Feedback controlled promotion
9.8 Inter-generational pointers
Remembered sets
Pointer direction
9.9 Space management
9.10 Older-first garbage collection
9.11 Beltway
9.12 Analytic support for generational collection
9.13 Issues to consider
9.14 Abstract generational garbage collection
10 Other partitioned schemes
10.1 Large object spaces
The Treadmill garbage collector
Moving objects with operating system support
Pointer-free objects
10.2 Topological collectors
Mature object space garbage collection
Connectivity-based garbage collection
Thread-local garbage collection
Stack allocation
Region inferencing
10.3 Hybrid mark-sweep, copying collectors
Garbage-First
Immix and others
Copying collection in a constrained memory space
10.4 Bookmarking garbage collection
10.5 Ulterior reference counting
10.6 Issues to consider
11 Run-time interface
11.1 Interface to allocation
Speeding allocation
Zeroing
11.2 Finding pointers
Conservative pointer finding
Accurate pointer finding using tagged values
Accurate pointer finding in objects
Accurate pointer finding in global roots
Accurate pointer finding in stacks and registers
Accurate pointer finding in code
Handling interior pointers
Handling derived pointers
11.3 Object tables
11.4 References from external code
11.5 Stack barriers
11.6 GC-safe points and mutator suspension
11.7 Garbage collecting code
11.8 Read and write barriers
Engineering
Precision of write barriers
Hash tables
Sequential store buffers
Overflow action
Card tables
Crossing maps
Summarising cards
Hardware and virtual memory techniques
Write barrier mechanisms: in summary
Chunked lists
11.9 Managing address space
11.10 Applications of virtual memory page protection
Double mapping
Applications of no-access pages
11.11 Choosing heap size
11.12 Issues to consider
12 Language-specific concerns
12.1 Finalisation
When do finalisers run?
Which thread runs a finaliser?
Can finalisers run concurrently with each other?
Can finalisers access the object that became unreachable?
When are finalised objects reclaimed?
What happens if there is an error in a finaliser?
Is there any guaranteed order to finalisation?
The finalisation race problem
Finalisers and locks
Finalisation in particular languages
For further study
12.2 Weak references
Additional motivations
Supporting multiple pointer strengths
Using Phantom objects to control finalisation order
Race in weak pointer clearing
Notification of weak pointer clearing
Weak pointers in other languages
12.3 Issues to consider
13 Concurrency preliminaries
13.1 Hardware
Processors and threads
Interconnect
Memory
Caches
Coherence
Cache coherence performance example: spin locks
13.2 Hardware memory consistency
Fences and happens-before
Consistency models
13.3 Hardware primitives
Compare-and-swap
Load-linked/store-conditionally
Atomic arithmetic primitives
Test then test-and-set
More powerful primitives
Overheads of atomic primitives
13.4 Progress guarantees
Progress guarantees and concurrent collection
13.5 Notation used for concurrent algorithms
13.6 Mutual exclusion
13.7 Work sharing and termination detection
Rendezvous barriers
13.8 Concurrent data structures
Concurrent stacks
Concurrent queue implemented with singly linked list
Concurrent queue implemented with array
A concurrent deque for work stealing
13.9 Transactional memory
What is transactional memory?
Using transactional memory to help implement collection
Supporting transactional memory in the presence of garbage collection
13.10 Issues to consider
14 Parallel garbage collection
14.1 Is there sufficient work to parallelise?
14.2 Load balancing
14.3 Synchronisation
14.4 Taxonomy
14.5 Parallel marking
Processor-centric techniques
14.6 Parallel copying
Processor-centric techniques
Memory-centric techniques
14.7 Parallel sweeping
14.8 Parallel compaction
14.9 Issues to consider
Terminology
Is parallel collection worthwhile?
Strategies for balancing loads
Managing tracing
Low-level synchronisation
Sweeping and compaction
Termination
15 Concurrent garbage collection
15.1 Correctness of concurrent collection
The tricolour abstraction, revisited
The lost object problem
The strong and weak tricolour invariants
Precision
Mutator colour
Allocation colour
Incremental update solutions
Snapshot-at-the-beginning solutions
15.2 Barrier techniques for concurrent collection
Grey mutator techniques
Black mutator techniques
Completeness of barrier techniques
Concurrent write barrier mechanisms
One-level card tables
Two-level card tables
Reducing work
15.3 Issues to consider
16 Concurrent mark-sweep
16.1 Initialisation
16.2 Termination
16.3 Allocation
16.4 Concurrent marking and sweeping
16.5 On-the-fly marking
Write barriers for on-the-fly collection
Doligez-Leroy-Gonthier
Doligez-Leroy-Gonthier for Java
Sliding views
16.6 Abstract concurrent collection
The collector wavefront
Adding origins
Mutator barriers
Precision
Instantiating collectors
16.7 Issues to consider
17 Concurrent copying & compaction
17.1 Mostly-concurrent copying: Baker’s algorithm
Mostly-concurrent, mostly-copying collection
17.2 Brooks’s indirection barrier
17.3 Self-erasing read barriers
17.4 Replication copying
17.5 Multi-version copying
Extensions to avoid copy-on-write
17.6 Sapphire
Collector phases
Merging phases
Volatile fields
17.7 Concurrent compaction
Compressor
Pauseless
17.8 Issues to consider
18 Concurrent reference counting
18.1 Simple reference counting revisited
18.2 Buffered reference counting
18.3 Concurrent, cyclic reference counting
18.4 Taking a snapshot of the heap
18.5 Sliding views reference counting
Age-oriented collection
The algorithm
Sliding views cycle reclamation
Memory consistency
18.6 Issues to consider
19 Real-time garbage collection
19.1 Real-time systems
19.2 Scheduling real-time collection
19.3 Work-based real-time collection
Parallel, concurrent replication
Uneven work and its impact on work-based scheduling
19.4 Slack-based real-time collection
Scheduling the collector work
Execution overheads
Programmer input
19.5 Time-based real-time collection: Metronome
Mutator utilisation
Supporting predictability
Analysis
Robustness
19.6 Combining scheduling approaches: Tax-and-Spend
Tax-and-Spend scheduling
Tax-and-Spend prerequisites
19.7 Controlling fragmentation
Incremental compaction in Metronome
Incremental replication on uniprocessors
Stopless: lock-free garbage collection
Staccato: best-effort compaction with mutator wait-freedom
Chicken: best-effort compaction with mutator wait-freedom for x86
Clover: guaranteed compaction with probabilistic mutator lock-freedom
Stopless versus Chicken versus Clover
Fragmented allocation
19.8 Issues to consider
Glossary
Bibliography
Index