GC proposal #50

100 changes: 100 additions & 0 deletions rfcs/gc-proposal.md
@@ -0,0 +1,100 @@
Proposal for the GC
===================

I have a proposal which is twofold (the two parts are relatively independent, but I thought of the second one while reasoning about the correctness of the implementation of the first one):

- split the minor heap into slices
- replace the greyval list by a reference counter

These ideas are quite natural and I am probably not the first one to think about this.

Split the minor heap into slices
--------------------------------

The idea is that the minor heap is split into several blocks, like 8x64k
blocks. One of them is the current block. Let us call $n_m$ the number of minor
heap blocks and $c_m$ the index of the current block. Indices are taken modulo
$n_m$, but blocks are compared relative to the current block: we say that a
block index $k$ is older than a block index $l$ if $k < l$ when $k$ and $l$ are
each represented by the smallest integer congruent to them modulo $n_m$ that is
strictly greater than $c_m$. The block $c_m + 1$ is therefore the oldest and
the current block $c_m$ is the youngest.
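The ordering above can be sketched in C as follows. This is only an illustration of the definition, not code from the runtime: the function names `age_rank` and `is_older` are hypothetical, and block indices are assumed to be stored already reduced to the range $0 \ldots n_m - 1$.

```c
#include <assert.h>

/* Rank of block index k relative to the current block c_m, for a minor
   heap of n_m blocks. It equals (smallest representative of k that is
   strictly greater than c_m) - c_m, so it is 1 for the oldest block
   (c_m + 1) and n_m for the current block itself. */
unsigned age_rank(unsigned k, unsigned c_m, unsigned n_m)
{
    assert(n_m > 0 && k < n_m && c_m < n_m);
    unsigned d = (k + n_m - c_m) % n_m;   /* in 0 .. n_m - 1 */
    return d == 0 ? n_m : d;
}

/* Nonzero if block k is strictly older than block l. */
int is_older(unsigned k, unsigned l, unsigned c_m, unsigned n_m)
{
    return age_rank(k, c_m, n_m) < age_rank(l, c_m, n_m);
}
```

For instance, with $n_m = 8$ and $c_m = 5$, the blocks ordered from oldest to youngest are 6, 7, 0, 1, 2, 3, 4, 5.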

- Allocation is performed in the current minor heap block

- At the end of a minor GC, only the block $c_m + 1$ is promoted before
incrementing $c_m$: older blocks are promoted first.

- There are several ways to handle scanning/greyval:

1. Treat a value in block $k$ that is pointed to from an older block as a
greyval and add it to the list.

2. Scan the whole minor heap, with the argument that old blocks should be
sparse and the complexity of marking is likely smaller than the size of the
minor heap. But the worst case is still proportional to the full size of
the minor heap.

3. Intermediate: record a greyval only when the age difference between the
blocks is greater than some constant $C$. This is actually not more complex than
1.: checking the address of the block to know whether we are in the current
block does not take many more instructions than checking the age difference
using modular arithmetic.

I think 3 is the best idea, with $C = 2$ or $3$.
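A minimal sketch of the test for option 3 and of the rotation at the end of a minor GC, under one reading of the text: a greyval is recorded when a value in an older block is made to point into a block that is younger by more than $C$. The struct and function names are hypothetical, and block indices stand in for the address checks a real implementation would use.

```c
#include <assert.h>

/* Hypothetical parameters; the names are not taken from the runtime. */
struct minor_heap {
    unsigned n_m;   /* number of minor heap blocks */
    unsigned c_m;   /* index of the current (youngest) block */
    unsigned C;     /* age-difference threshold of option 3 */
};

/* 1 for the oldest block (c_m + 1), n_m for the current block. */
unsigned age_rank(const struct minor_heap *h, unsigned k)
{
    assert(h->n_m > 0 && k < h->n_m && h->c_m < h->n_m);
    unsigned d = (k + h->n_m - h->c_m) % h->n_m;
    return d == 0 ? h->n_m : d;
}

/* Option 3 (assumed reading): when a value in block src is made to point
   into block dst, record a greyval only if src is older than dst and the
   age difference exceeds C. */
int needs_greyval(const struct minor_heap *h, unsigned src, unsigned dst)
{
    unsigned rs = age_rank(h, src), rd = age_rank(h, dst);
    return rs < rd && rd - rs > h->C;
}

/* End of a minor GC: the oldest block, c_m + 1, is the one promoted;
   c_m is then incremented, so the emptied block becomes the current one.
   Returns the index of the promoted block. */
unsigned rotate_after_minor_gc(struct minor_heap *h)
{
    unsigned promoted = (h->c_m + 1) % h->n_m;
    h->c_m = promoted;
    return promoted;
}
```

With $C = 0$ the test degenerates to option 1, where every pointer from an older block to a younger one is recorded.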

Benefits:

- Lower allocation latency if the blocks are small and if we use
solution 1. or 3. above.

- No more early promotion. In the current approach, all recently allocated
blocks in the minor heap are likely to be wrongly promoted, while here only
blocks that have survived $n_m - 1$ complete minor GCs are promoted.

- The cost of allocation in the minor heap is more or less the same as now.

- All values ($n_m$, the size of the blocks and $C$) can be parameterised by
the user, allowing the GC to be tuned to the needs of the application.


Cost:

- Wasted memory (but we can afford a much larger minor heap?). The user can
choose how much memory is wasted; at worst they can fall back to the current
solution with $n_m = 1$.

- A more costly test on mutation for options 1. and 3.

- The maximum size that we can allocate in the minor heap is bounded by the
size of one block and not the size of the minor heap.

- more ?

Reference counter for remembered set
------------------------------------

The title is almost self-sufficient: why not add to each block in the minor
heap a reference counter that counts the number of references from the major
heap or from a block that is too old in the minor heap?
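As a rough illustration of the idea, the sketch below shows what a store from an "old" location (major heap or too-old minor block) could look like with such a counter. Everything here is an assumption made for the example: the `minor_value` layout, the `old_refs` field and the function names do not exist in the runtime, `NULL` stands in for any target that is not in the minor heap, and the code is single-threaded, so it says nothing about synchronisation between domains.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minor heap value; the real OCaml header has no such field. */
struct minor_value {
    unsigned old_refs;               /* references from the major heap or
                                        from too-old minor blocks */
    struct minor_value *fields[2];
};

/* Overwrite a field that lives in an old location. Instead of adding to or
   removing from a remembered set, adjust the counters of the new and
   previous targets. */
void old_store(struct minor_value **slot, struct minor_value *new_target)
{
    struct minor_value *prev = *slot;
    if (new_target != NULL)
        new_target->old_refs++;
    if (prev != NULL) {
        assert(prev->old_refs > 0);
        prev->old_refs--;
    }
    *slot = new_target;
}

/* A minor value must be treated as a root by the minor GC if and only if
   some old location still points to it. */
int is_minor_root(const struct minor_value *v)
{
    return v->old_refs > 0;
}
```

The counter is incremented before the previous target is decremented, so storing the same value twice never makes its count drop to zero in between.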

Benefit:

- This replaces adding to and removing from the remembered set by incrementing
and decrementing a counter in the minor heap. This seems much less costly with
domains?

- The usual problem of reference counting with cyclic data structures does not
occur here, as pointers are oriented by the age of the blocks (if we consider
the major heap as infinitely old).

- We can check the counters in the compacting GC for debugging, as
reference counters are harder to debug (it is easier to check that a value is
correctly marked as a greyval than to verify the value of a counter).

Cost:

- Nothing ?

Questions
---------

I don't know how the two proposals affect ephemerons.