TL;DR: MMTk has both is_live and is_reachable, and the only difference is that is_live always returns true for objects in ImmortalSpace. This is wrong, and has consequences. We should remove is_live.
Definition
According to the GC Handbook, the precise definition of live is that "an object is live if it will be used by a mutator". Because liveness in this sense is undecidable, GC systems use reachability instead, i.e. "an object is reachable if there is a path from roots to that object following references". In GC, "live" and "reachable" are used interchangeably, and "dead" is a synonym of "unreachable".
Actual behavior
However, in MMTk, objects in the ImmortalSpace are erroneously considered "immortal", i.e. always live. The object.is_live() function always returns true if object is in the ImmortalSpace. Historically, this behavior can be traced back to the very first commit in JikesRVM when the ImmortalSpace was introduced. A subsequent commit in JikesRVM introduced isReachable, which checks the mark bit for ImmortalSpace and therefore really checks reachability. Since then, JikesRVM has had both isLive and isReachable. Both of them were ported to the Rust MMTk we have today, and they behave just like in JikesRVM.
This behavior contradicts the definitions of "live" and "reachable". Objects in the ImmortalSpace can become both unused by the mutator and unreachable from roots. Such objects are not "live" in either sense, yet is_live() still returns true.
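To make the discrepancy concrete, here is a minimal toy model. The `Space` and `Object` types are hypothetical and are not MMTk's actual API; a single `marked` flag stands in for whatever mark or forwarding state a real space would check.

```rust
// Toy model only: these types are hypothetical and are not MMTk's API.
#[derive(Clone, Copy, Debug)]
enum Space {
    Immortal,
    SemiSpace,
}

#[derive(Clone, Copy, Debug)]
struct Object {
    space: Space,
    // Set when the object is reached during the current trace.
    marked: bool,
}

impl Object {
    // Mirrors the behavior described above: immortal objects are always "live".
    fn is_live(&self) -> bool {
        match self.space {
            Space::Immortal => true,
            Space::SemiSpace => self.marked,
        }
    }

    // Checks the mark state regardless of which space the object is in.
    fn is_reachable(&self) -> bool {
        self.marked
    }
}

fn main() {
    let unreached_immortal = Object { space: Space::Immortal, marked: false };
    let unreached_copying = Object { space: Space::SemiSpace, marked: false };

    // The two predicates agree for the copying space...
    println!("semispace: is_live = {}, is_reachable = {}",
        unreached_copying.is_live(), unreached_copying.is_reachable());
    // ...but disagree for an immortal object that was not reached.
    println!("immortal:  is_live = {}, is_reachable = {}",
        unreached_immortal.is_live(), unreached_immortal.is_reachable());
}
```

In this sketch the two predicates differ only for an unreached object in the immortal space, which is exactly the case this issue is about.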
Consequences
If an object is not traced during tracing, it will not be scanned, and its fields will not be forwarded. If the GC is a copying GC (such as SemiSpace), the object will contain dangling references. This is OK if the object dies (in which case the VM will never touch the object again). But since ImmortalSpace erroneously considers such objects "live", there will be consequences.
Weak reference processing
In the built-in ReferenceProcessor and FinalizableProcessor, is_live() is used to test if a weak reference or a finalizable reference is reached by stronger references.
Suppose in a JVM there is a WeakReference named a which refers to an object b in the ImmortalSpace. In a GC, a is traced, but b is not traced, and b contains references to objects that have been moved. In the WeakRefClosure stage, ReferenceProcessor will inspect a. is_live() will show that b is "live", and it will retain the weak reference from a to b. After GC, the mutator can call a.get() and upgrade the weak reference to a strong reference. Now the strong reference points to an object b that contains dangling references. When the mutator attempts to follow those references, it will crash.
The same problem arises if the VM binding uses Scanning::process_weak_ref and uses is_live to test if the referent is live.
The is_reachable method is unaffected because it actually checks the mark bit. Unreachable objects will be unmarked, and is_reachable will return false.
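The scenario above can be sketched with a small simulation. Everything here is a hypothetical stand-in (`Referent`, `WeakRef`, `clear_dead_weak_refs`), not MMTk's ReferenceProcessor; it only shows how the choice of predicate decides whether the weak reference to the untraced immortal object survives.

```rust
// Toy model only: hypothetical types, not MMTk's ReferenceProcessor.
struct Referent {
    in_immortal_space: bool,
    // Set when the referent was reached by the trace.
    marked: bool,
}

struct WeakRef {
    // Index of the referent in the toy heap; None once the reference is cleared.
    referent: Option<usize>,
}

// Stand-in for the current is_live behavior: immortal objects always pass.
fn is_live(r: &Referent) -> bool {
    r.in_immortal_space || r.marked
}

// Stand-in for is_reachable: only the mark state counts.
fn is_reachable(r: &Referent) -> bool {
    r.marked
}

// Clears every weak reference whose referent fails `keep`.
// Returns the number of references cleared.
fn clear_dead_weak_refs(
    heap: &[Referent],
    refs: &mut [WeakRef],
    keep: fn(&Referent) -> bool,
) -> usize {
    let mut cleared = 0;
    for w in refs.iter_mut() {
        if let Some(idx) = w.referent {
            if !keep(&heap[idx]) {
                w.referent = None;
                cleared += 1;
            }
        }
    }
    cleared
}

fn main() {
    // `b` lives in the immortal space and was not traced in this GC.
    let heap = vec![Referent { in_immortal_space: true, marked: false }];
    let mut refs = vec![WeakRef { referent: Some(0) }];

    // With is_live, the weak reference to `b` is retained.
    let cleared = clear_dead_weak_refs(&heap, &mut refs, is_live);
    println!("using is_live: cleared {}", cleared);

    // With is_reachable, it is cleared, so a.get() can no longer return `b`.
    let cleared = clear_dead_weak_refs(&heap, &mut refs, is_reachable);
    println!("using is_reachable: cleared {}", cleared);
}
```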
VO bits
This is not related to the is_live method, but the VO bits in ImmortalSpace are never cleared, as if the objects never die. This mainly impacts conservative stack scanning. There are several possible solutions, which I have elaborated on in #1274.
What should we do?
We should just remove is_live. It's simply wrong.
We should use is_reachable in places where is_live is used. Particularly, is_reachable should be used when processing weak references.
And we should clarify that is_reachable does not return true root-reachability because that's too expensive to compute. It returns true if the current plan/space considers the object reachable at the time it is called. It will return true if the object is marked and/or forwarded, and it is the same for ImmortalSpace. It will also return true for objects in the mature space and objects in the nursery that are reachable from the remembered set.
Related issues
VO bits and Immortal spaces: #1274
I previously considered renaming is_reachable and is_live in #1271. But after discussion, I think we don't need to rename is_reachable. We just need to clarify its semantics. More discussions can be found in the comments of #1271.
In today's meeting, @steveblackburn suggested we should not just use the definition of "live" in the GC Handbook, but instead make our own definition. One possible definition is:
An object is "live" if one of the following is true:
It is directly referenced by a root.
It is conservatively considered "live" by the plan or policy. Particularly, generational plans consider all mature objects as "live", and the ImmortalSpace considers all objects in it as "live".
It is referenced by another live object.
A live object is not necessarily root-reachable, and may not necessarily be used by the mutator. This gives the plans and policies a certain degree of freedom to define the liveness.
An invariant must be maintained that "live objects must only contain valid references (i.e. no dangling references)". Otherwise, the problem described here (a WeakReference pointing to an immortal object that contains dangling references) and in #1274 (an immortal object that contains dangling references is picked up by the conservative stack scanner due to the presence of the VO bit) will manifest and result in a crash. Generational GCs maintain this invariant using the remembered set. ImmortalSpace needs to maintain this invariant, too. We will discuss this in #1274.
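One way to read the three rules in the proposed definition is as a transitive closure seeded by the roots and by whatever the plan or policy conservatively treats as live. The sketch below is only an illustration under that reading; `Obj` and `compute_live` are hypothetical names, and the heap is just a vector of objects that refer to each other by index.

```rust
// Toy model only: hypothetical types illustrating the proposed definition.
struct Obj {
    // Decided by the plan or policy, e.g. mature or immortal objects.
    conservatively_live: bool,
    // Outgoing references, as indices into the toy heap.
    refs: Vec<usize>,
}

// Returns, for each object, whether it is "live" under the proposed definition.
fn compute_live(heap: &[Obj], roots: &[usize]) -> Vec<bool> {
    let mut live = vec![false; heap.len()];
    // Rule 1: objects directly referenced by a root.
    let mut worklist: Vec<usize> = roots.to_vec();
    // Rule 2: objects the plan or policy conservatively considers live.
    worklist.extend((0..heap.len()).filter(|&i| heap[i].conservatively_live));
    // Rule 3: anything referenced by a live object is live.
    while let Some(i) = worklist.pop() {
        if !live[i] {
            live[i] = true;
            worklist.extend(heap[i].refs.iter().copied());
        }
    }
    live
}

fn main() {
    // 0: referenced by a root.
    // 1: immortal, not referenced by anything.
    // 2: referenced only by object 1.
    // 3: referenced by nothing.
    let heap = vec![
        Obj { conservatively_live: false, refs: vec![] },
        Obj { conservatively_live: true, refs: vec![2] },
        Obj { conservatively_live: false, refs: vec![] },
        Obj { conservatively_live: false, refs: vec![] },
    ];
    let live = compute_live(&heap, &[0]);
    println!("{:?}", live); // prints [true, true, true, false]
}
```

Under this reading, object 2 is live only because the immortal object 1 refers to it, so the GC would have to keep that reference valid for the invariant above to hold.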