Remove `is_live` in favor of `is_reachable`. #1275

wks · 2025-02-17T09:47:14Z

TL;DR: MMTk has both is_live and is_reachable, and the only difference is that is_live always returns true for objects in ImmortalSpace. This is wrong, and has consequences. We should remove is_live.

Definition

According to the GC Handbook, the precise definition of live is that "an object is live if it will be used by a mutator". Because liveness in this sense is undecidable, GC systems use reachability instead, i.e. "an object is reachable if there is a path from roots to that object following references". In GC, "live" and "reachable" are used interchangeably, and "dead" is a synonym of "unreachable".

Actual behavior

However, in MMTk, objects in the ImmortalSpace are erroneously considered "immortal", i.e. always live. The object.is_live() function always returns true if object is in the ImmortalSpace. Historically, this behavior can be traced back to the very first commit in JikesRVM when the ImmortalSpace was introduced. A subsequent commit in JikesRVM introduced isReachable, which checks the mark bit for ImmortalSpace and therefore really checks reachability. Since then, JikesRVM has had both isLive and isReachable. Both of them were ported to the Rust MMTk we have today, and they behave just like in JikesRVM.

This behavior contradicts the definition of both "live" and "reachable". Objects in the ImmortalSpace can become both unused by the mutator and unreachable from roots. Such objects are not "live" in either sense, yet is_live() still returns true.
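To make the difference concrete, here is a minimal toy model of the two queries. It is only a sketch: `ToyImmortalSpace`, its `marked` set, and the `mark` method are hypothetical names made up for illustration, and addresses are plain `usize` values rather than MMTk's `ObjectReference`.

```rust
use std::collections::HashSet;

/// Toy model of an immortal space. Names are hypothetical, not MMTk's API.
struct ToyImmortalSpace {
    /// Addresses whose mark bit was set during the current GC.
    marked: HashSet<usize>,
}

impl ToyImmortalSpace {
    fn new() -> Self {
        ToyImmortalSpace { marked: HashSet::new() }
    }

    /// Called by the (imaginary) tracer when it reaches `object`.
    fn mark(&mut self, object: usize) {
        self.marked.insert(object);
    }

    /// Mirrors the behavior described above: every object counts as live.
    fn is_live(&self, _object: usize) -> bool {
        true
    }

    /// Mirrors the mark-bit check: only traced objects are reachable.
    fn is_reachable(&self, object: usize) -> bool {
        self.marked.contains(&object)
    }
}
```

For an object that the tracer never reached, `is_live` answers true while `is_reachable` answers false, which is exactly the disagreement this issue is about.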

Consequences

If an object is not traced during tracing, it will not be scanned, and its fields will not be forwarded. If the GC is a copying GC (such as SemiSpace), the object will contain dangling references. This is OK if the object dies (in which case the VM will never touch the object again). But since ImmortalSpace erroneously considers such objects "live", there will be consequences.
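The dangling-reference effect can be illustrated with a toy copying step. This is a sketch under simplifying assumptions: `ToyObject`, `forward_fields`, and `dangling_fields` are hypothetical helpers, the forwarding table is a plain `HashMap`, and "from-space" is just an address range.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Toy object: its fields hold the addresses of other objects.
struct ToyObject {
    fields: Vec<usize>,
}

/// What scanning a traced object does in a copying GC: rewrite every field
/// whose target has been moved, using the old -> new forwarding table.
fn forward_fields(object: &mut ToyObject, forwarding: &HashMap<usize, usize>) {
    for field in object.fields.iter_mut() {
        if let Some(&new_address) = forwarding.get(field) {
            *field = new_address;
        }
    }
}

/// Fields that still point into the evacuated from-space are dangling.
fn dangling_fields(object: &ToyObject, from_space: Range<usize>) -> Vec<usize> {
    object
        .fields
        .iter()
        .copied()
        .filter(|field| from_space.contains(field))
        .collect()
}
```

An object that goes through `forward_fields` ends up with no dangling fields. An object that was never traced skips that step and keeps its stale from-space addresses.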

Weak reference processing

In the built-in ReferenceProcessor and FinalizableProcessor, is_live() is used to test if a weak reference or a finalizable reference is reached by stronger references.

Suppose in a JVM there is a WeakReference named a which refers to an object b in the ImmortalSpace. In a GC, a is traced, but b is not traced, and b contains references to objects that have been moved. In the WeakRefClosure stage, ReferenceProcessor will inspect a. is_live() will show that b is "live", and it will retain the weak reference from a to b. After GC, the mutator can call a.get() and upgrade the weak reference to a strong reference. Now the strong reference points to an object b that contains dangling references. When the mutator attempts to follow those references, it will crash.

The same problem arises if the VM binding implements Scanning::process_weak_ref and uses is_live to test whether the referent is live.

The is_reachable method is unaffected because it actually checks the mark bit. Unreachable objects will be unmarked, and is_reachable will return false.
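The scenario with a and b can be replayed in a toy weak-reference pass. This is a sketch, not the code of ReferenceProcessor: `ToyWeakRef`, `process_weak_refs`, and `scenario` are hypothetical, and the two closures stand in for what is_live and is_reachable would answer for an untraced object in the ImmortalSpace.

```rust
use std::collections::HashSet;

/// Toy weak reference: `referent` becomes `None` once it is cleared.
struct ToyWeakRef {
    referent: Option<usize>,
}

/// Clears every weak reference whose referent fails the `retain` test.
/// Returns how many references were cleared.
fn process_weak_refs(refs: &mut [ToyWeakRef], retain: impl Fn(usize) -> bool) -> usize {
    let mut cleared = 0;
    for weak in refs.iter_mut() {
        if let Some(object) = weak.referent {
            if !retain(object) {
                weak.referent = None;
                cleared += 1;
            }
        }
    }
    cleared
}

/// Replays the a -> b scenario and returns what a.get() would see after GC.
fn scenario(use_mark_bit: bool) -> Option<usize> {
    const B: usize = 0x2000; // immortal object b, not traced in this GC
    let marked: HashSet<usize> = HashSet::new(); // b's mark bit stays unset
    let mut refs = [ToyWeakRef { referent: Some(B) }];
    if use_mark_bit {
        // Stand-in for is_reachable: consult the mark bit.
        process_weak_refs(&mut refs, |object| marked.contains(&object));
    } else {
        // Stand-in for is_live on the ImmortalSpace: always true.
        process_weak_refs(&mut refs, |_| true);
    }
    refs[0].referent
}
```

With the always-true test the weak reference survives and the mutator can still reach b, stale fields included. With the mark-bit test the reference is cleared and a.get() yields nothing.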

VO bits

This is not related to the is_live method, but the VO bits in ImmortalSpace are never cleared, as if the objects never die. This mainly impacts conservative stack scanning. There are several possible solutions, which I have elaborated on in #1274.

What should we do?

We should just remove is_live. It's simply wrong.

We should use is_reachable in places where is_live is used. Particularly, is_reachable should be used when processing weak references.

And we should clarify that is_reachable does not return true root-reachability because that's too expensive to compute. It returns true if the current plan/space considers the object reachable at the time it is called. It will return true if the object is marked and/or forwarded, and it is the same for ImmortalSpace. It will also return true for objects in the mature space and objects in the nursery that are reachable from the remembered set.
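As a rough illustration of the clarified semantics, the decision could be pictured like this. It is only a sketch of the wording above: `ToyRegion`, `ToyTraceState`, and `toy_is_reachable` are hypothetical, and the model does not distinguish nursery collections from full-heap collections.

```rust
/// What the tracer did to an object in the current GC (hypothetical).
#[derive(Clone, Copy, PartialEq, Eq)]
enum ToyTraceState {
    Untouched,
    Marked,
    Forwarded,
}

/// Where the object resides (hypothetical, heavily simplified).
#[derive(Clone, Copy)]
enum ToyRegion {
    Immortal,
    Nursery,
    Mature,
}

/// The plan's/space's opinion at the time of the call, not true
/// root-reachability.
fn toy_is_reachable(region: ToyRegion, state: ToyTraceState) -> bool {
    match region {
        // Mature objects are conservatively treated as reachable.
        ToyRegion::Mature => true,
        // Elsewhere the answer comes from the mark or forwarding state.
        // Nursery objects reached via the remembered set end up forwarded.
        ToyRegion::Immortal | ToyRegion::Nursery => state != ToyTraceState::Untouched,
    }
}
```

The point of the sketch is that ImmortalSpace answers from the same trace state as any other space, instead of unconditionally saying yes.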

Related issues

VO bits and Immortal spaces: #1274

I previously considered renaming is_reachable and is_live in #1271. But after discussion, I think we don't need to rename is_reachable. We just need to clarify its semantics. More discussion can be found in the comments of #1271.

wks · 2025-02-19T05:38:19Z

In today's meeting, @steveblackburn suggested we should not just use the definition of "live" in the GC Handbook, but instead make our own definition. One possible definition is:

An object is "live" if one of the following is true:

It is directly referenced by a root.
It is conservatively considered "live" by the plan or policy. Particularly, generational plans consider all mature objects as "live", and the ImmortalSpace considers all objects in it as "live".
It is referenced by another live object.

A live object is not necessarily root-reachable, and may not necessarily be used by the mutator. This gives the plans and policies a certain degree of freedom to define the liveness.

An invariant must be maintained that "live objects must only contain valid references (i.e. no dangling references)". Otherwise, the problem described here (a WeakReference pointing to an immortal object that contains dangling references) and in #1274 (an immortal object that contains dangling references is picked up by the conservative stack scanner due to the presence of the VO bit) will manifest and result in a crash. Generational GCs maintain this invariant using the remembered set. ImmortalSpace needs to maintain this invariant, too. We will discuss this in #1274.
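The proposed definition is a closure over roots and conservatively live objects, and the invariant is a check on that closure. A minimal sketch follows, assuming a toy heap keyed by object id. `ToyHeap`, `live_set`, and `invariant_holds` are hypothetical names, and a field whose id is missing from the heap stands for a dangling reference.

```rust
use std::collections::{HashMap, HashSet};

/// Toy heap: maps an object id to the ids stored in its fields.
type ToyHeap = HashMap<u32, Vec<u32>>;

/// Computes the "live" set under the proposed definition: objects referenced
/// by roots, objects the plan/policy conservatively considers live, and
/// everything referenced by a live object.
fn live_set(heap: &ToyHeap, roots: &[u32], conservatively_live: &[u32]) -> HashSet<u32> {
    let mut live = HashSet::new();
    let mut worklist: Vec<u32> = roots.iter().chain(conservatively_live).copied().collect();
    while let Some(object) = worklist.pop() {
        // A dangling id does not name an object, so it cannot be live.
        if !heap.contains_key(&object) {
            continue;
        }
        // Visit each object once, then enqueue whatever it references.
        if live.insert(object) {
            worklist.extend(heap[&object].iter().copied());
        }
    }
    live
}

/// The invariant: every field of every live object names an existing object.
fn invariant_holds(heap: &ToyHeap, live: &HashSet<u32>) -> bool {
    live.iter()
        .all(|object| heap[object].iter().all(|field| heap.contains_key(field)))
}
```

If a policy declares an object live without making sure its fields are valid, `invariant_holds` returns false. That corresponds to the immortal-object case described in this issue.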

wks mentioned this issue Feb 17, 2025

Rename is_reachable and is_live #1271

Closed

wks mentioned this issue Feb 19, 2025

VO bit and ImmortalSpace #1274

Open
