Systematic approach to clear/set side metadata #1256
I think the appropriate API should be: fn initialize_metadata_for_region(start: Address, size: usize); (or This function would be called whenever we go into the slow-path to allocate a block, TLAB, or line. And that should be it, I think. Then we should set all the relevant metadata either when the object is allocated or during tracing time. VO bits might have to be handled separately, however.
This is only suitable for unlog bits and pinning bits. The common part between them is that they only need to be 0 for newly allocated objects, and stale bits are benign. It doesn't work, for example, for the mark bits because (1) free regions are certainly not marked, and (2) occupied regions still need to be cleared before tracing (before full-heap tracing for StickyImmix).
Stale unlog bits are definitely not benign as per the StickyImmix bug.
I don't get the first point, but yes I agree with the second one.
If a region contains mark bits, it must contain live objects, and we can't allocate new objects into it.
I'm still a bit confused. I'm not sure why that is relevant. Do you mean if you don't reset the mark bits for StickyImmix then you will not know which regions are empty? That is subsumed by the second point, no?
My point is, the time "whenever we go into the slow-path to allocate a block, TLAB, or line" is not the right time to reset the mark bits. Mark bits must be set at the beginning of a major GC in StickyImmix, and may be at the beginning or the end of a major GC for GenImmix. The API
Right. But we already have the
    // TODO: Currently only ImmortalSpace uses this struct. Any policy that needs mark bit can use this (immix, mark compact, mark sweep).
    // We should do some refactoring for other policies as well.
    pub struct MarkState { ... }

    impl MarkState {
        /// This has to be called when a space resets its memory regions. This can be either called before the GC tracing, or
        /// after a GC tracing (eagerly). This method will reset the mark bit. The policy should not use the mark bit before
        /// doing another tracing.
        pub fn on_block_reset<VM: VMBinding>(&self, start: Address, size: usize) {
            if let crate::util::metadata::MetadataSpec::OnSide(side) =
                *VM::VMObjectModel::LOCAL_MARK_BIT_SPEC
            {
                side.bzero_metadata(start, size);
            }
        }
    }

The comment on None the less,
Idk. I feel like having a generic API for all possible metadata will be too complex given the extremely different semantics between all of them.
TL;DR: No scattering side metadata clearing/setting operations everywhere! Don't blindly insert those operations before GC / before nursery GC / after GC / when allocating / etc., and per-space / per-chunk / per-block / per-object / etc. Do it systematically. Make a framework for it.
This issue is only about side metadata. Because free spaces are always zeroed, in-header metadata are always 0 upon allocation/copying of an object, and we don't need to worry about bulk-clearing. But we may start worrying about cyclic mark bits if we start to support in-header mark bits for ImmixSpace. (No. We still don't support in-header mark bits now!)
Problem
Take ImmixSpace as an example.
PrepareBlockState
(only in major GCs), itSweepChunk
, itBLOCK_ONLY
) and
Some metadata have undergone several refactorings. The forwarding bits, for example, were cleared in prepare for every block at the beginning of every major GC when the forwarding bits were first allowed to be on the side (Allow forwarding bits to be on the side for Immix #753).
This example just shows how difficult it is to get one kind of metadata right. The ImmixSpace has (1) local mark bits, (2) local forwarding bits, (3) local pinning bits (optional), (4) global unlog bits (conditionally needed), and (5) global VO bits (optional). Properly taking care of all of those metadata bits takes a lot of effort if we handle them one by one.
Characteristics of metadata
Each kind of metadata has several properties that dictate its implementation.
And those properties may change for different plans and different spaces.
StickyImmix
StickyImmix only has an ImmixSpace, and both young and mature objects are allocated into it. Each line, however, contains either only young objects or only old objects, never a mixture of them. The constraints can be summarized in the following table.
Let's look at those metadata one by one.
Given those constraints,
GenImmix
GenImmix is also generational, but we don't use mark bits as sticky bits. Young objects are never allocated into the ImmixSpace. And because GenImmix doesn't support pinning, the pin bits are useless for GenImmix.
For the CopySpace:
| Metadata | forwarding bits | VO bits |
|---|---|---|
| When are they set? | during copying GC | obj alloc |
| Observer | GC | both |
| Persists across GC? | no | yes |
| When must it be observed as clean? | before every GC | during mutator time |
| Where must it be observed as clean? | whole space | where there's no object |
| Are stale bits OK to mutators? | yes | no |
The CopySpace only has the forwarding bits and the VO bits (optional). The forwarding bits can be cleared at the end of a GC or at the beginning of a GC. It doesn't matter. In the current code base, we do bulk clearing at the end of a GC (if on the side).
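To make the table concrete, here is a minimal sketch that writes the two CopySpace columns down as data. Everything here (`MetadataProps`, `Observer`, the constant names, and the helper `must_be_clean_for_mutators`) is hypothetical and only mirrors the rows of the table; none of it exists in mmtk-core.

```rust
/// Who observes the metadata (the "Observer" row of the table).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Observer {
    Gc,
    Mutator,
    Both,
}

/// Hypothetical record with one field per row of the table.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
struct MetadataProps {
    name: &'static str,
    set_when: &'static str,
    observer: Observer,
    persists_across_gc: bool,
    clean_when: &'static str,
    clean_where: &'static str,
    stale_ok_for_mutators: bool,
}

/// The "forwarding bits" column for the CopySpace.
const COPYSPACE_FORWARDING_BITS: MetadataProps = MetadataProps {
    name: "forwarding bits",
    set_when: "during copying GC",
    observer: Observer::Gc,
    persists_across_gc: false,
    clean_when: "before every GC",
    clean_where: "whole space",
    stale_ok_for_mutators: true,
};

/// The "VO bits" column for the CopySpace.
const COPYSPACE_VO_BITS: MetadataProps = MetadataProps {
    name: "VO bits",
    set_when: "obj alloc",
    observer: Observer::Both,
    persists_across_gc: true,
    clean_when: "during mutator time",
    clean_where: "where there's no object",
    stale_ok_for_mutators: false,
};

/// Illustrative query: metadata that mutators can observe and for which
/// stale bits are not OK must be clean by the time mutators resume.
fn must_be_clean_for_mutators(p: &MetadataProps) -> bool {
    p.observer != Observer::Gc && !p.stale_ok_for_mutators
}
```

With this encoding, the query returns `false` for the forwarding bits (only the GC looks at them, so clearing at either end of a GC works) and `true` for the VO bits.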
For the ImmixSpace:
pinning bits
It's mostly the same as StickyImmix, except that:
Solution 1: Declarative approach
Programmers provide properties
We may declare the properties of each metadata. For example, in StickyImmix,
prepare
for the entire space.
But this may require some kind of constraint solver, which may be too general, given that we only have 5 kinds of metadata to deal with for StickyImmix.
Programmers decide a time and range
A simpler but still declarative approach is simply letting the programmer specify a time each metadata is cleared. Possible times can be:
and possible ranges can be
The framework will insert hooks and do something like:
In practice, due to the cost of iterating through all side metadata specs, we may reorder the above loops to skip unneeded metadata.
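A rough sketch of what such a hook could look like, with the loops already reordered so that only the metadata due at the current hook is visited. All names here (`ClearTime`, `ClearRange`, `ClearRule`, `Block`, `run_hook`) are made up for illustration, and the two times and two ranges are placeholders rather than the actual candidate lists. A real implementation would bulk-zero side metadata where this sketch only records what it would clear.

```rust
/// Placeholder hook times; the real set of candidate times is not modelled.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ClearTime {
    BeforeMajorGc,
    AfterGc,
}

/// Placeholder ranges; the real set of candidate ranges is not modelled.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ClearRange {
    WholeSpace,
    FreeBlocks,
}

/// One declaration written by the programmer: clear `metadata` at `time` over `range`.
struct ClearRule {
    metadata: &'static str,
    time: ClearTime,
    range: ClearRange,
}

/// Toy stand-in for a block of a space.
struct Block {
    id: usize,
    is_free: bool,
}

fn in_range(range: ClearRange, block: &Block) -> bool {
    match range {
        ClearRange::WholeSpace => true,
        ClearRange::FreeBlocks => block.is_free,
    }
}

/// Hook inserted by the framework at time `now`.
/// Returns the (metadata, block id) pairs that would be cleared.
fn run_hook(now: ClearTime, rules: &[ClearRule], blocks: &[Block]) -> Vec<(&'static str, usize)> {
    // Filter the rules once, so the per-block loop skips unneeded metadata.
    let due: Vec<&ClearRule> = rules.iter().filter(|r| r.time == now).collect();
    let mut cleared = Vec::new();
    for block in blocks {
        for rule in &due {
            if in_range(rule.range, block) {
                // A real framework would zero the side metadata of `block` here.
                cleared.push((rule.metadata, block.id));
            }
        }
    }
    cleared
}
```

Filtering the rules before walking the blocks is one way to read "reorder the above loops": the cost of looking at every spec is paid once per hook instead of once per block.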
Solution 2: Aspect-oriented programming (AOP)
This approach simply provides a trait that includes callbacks, for example,
fn on_prepare_block(block: Block, is_defrag_source: bool, is_nursery_gc: bool, is_copying_gc: bool)
fn on_release_block(block: Block, is_defrag_source: bool, is_nursery_gc: bool, is_copying_gc: bool)
fn on_release_line(line: Line, is_free: bool, is_nursery_gc: bool, is_copying_gc: bool)
And the programmer implements this trait for each Space. Inside each function, it clears whatever metadata needs to be cleared at that point. For example,
This simply moves the code into one place, but doesn't change the fact that the code is still hand-written. Given that we only have 5 different kinds of metadata, this not-so-intelligent approach may still be the most practical one for now.
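A minimal sketch of the callback approach, assuming the three signatures above. `Block`, `Line`, `MetadataHooks` and `ToySpace` are hypothetical stand-ins, the `&mut self` receiver is an addition so the toy can record what it does, and the policy inside the callbacks is only an illustration (mark bits are left alone in nursery GCs because StickyImmix relies on them staying set), not the actual ImmixSpace logic.

```rust
/// Toy stand-ins for the real block and line types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Block(usize);
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Line(usize);

/// Hypothetical trait bundling the callbacks listed above.
trait MetadataHooks {
    fn on_prepare_block(&mut self, block: Block, is_defrag_source: bool, is_nursery_gc: bool, is_copying_gc: bool);
    fn on_release_block(&mut self, block: Block, is_defrag_source: bool, is_nursery_gc: bool, is_copying_gc: bool);
    fn on_release_line(&mut self, line: Line, is_free: bool, is_nursery_gc: bool, is_copying_gc: bool);
}

/// A toy space that records the clearing it would perform instead of touching memory.
#[derive(Default)]
struct ToySpace {
    log: Vec<String>,
}

impl MetadataHooks for ToySpace {
    fn on_prepare_block(&mut self, block: Block, _is_defrag_source: bool, is_nursery_gc: bool, _is_copying_gc: bool) {
        // Illustrative policy: clear mark bits only before a full-heap trace.
        if !is_nursery_gc {
            self.log.push(format!("clear mark bits of block {}", block.0));
        }
    }

    fn on_release_block(&mut self, _block: Block, _is_defrag_source: bool, _is_nursery_gc: bool, _is_copying_gc: bool) {
        // Nothing to do in this toy; a real space would decide per metadata kind.
    }

    fn on_release_line(&mut self, line: Line, is_free: bool, _is_nursery_gc: bool, _is_copying_gc: bool) {
        // Illustrative policy: only lines found free have their metadata cleared.
        if is_free {
            self.log.push(format!("clear metadata of free line {}", line.0));
        }
    }
}
```

All clearing decisions for the space now live in one `impl` block, which is the whole benefit of this approach; the decisions themselves are still written by hand.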