Remove is_live in favor of is_reachable. #1275

Open
wks opened this issue Feb 17, 2025 · 1 comment

Comments

@wks
Collaborator

wks commented Feb 17, 2025

TL;DR: MMTk has both is_live and is_reachable, and the only difference is that is_live always returns true for objects in ImmortalSpace. This is wrong, and has consequences. We should remove is_live.

Definition

According to the GC Handbook, the precise definition of live is that "an object is live if it will be used by a mutator". Because liveness in this sense is undecidable, GC systems use reachability instead, i.e. "an object is reachable if there is a path from roots to that object following references". In GC, "live" and "reachable" are used interchangeably, and "dead" is a synonym of "unreachable".

Actual behavior

However, in MMTk, objects in the ImmortalSpace are erroneously considered "immortal", i.e. always live. The object.is_live() function always returns true if object is in the ImmortalSpace. Historically, this behavior can be traced back to the very first commit in JikesRVM when the ImmortalSpace was introduced. A subsequent commit in JikesRVM introduced isReachable, which checks the mark bit for ImmortalSpace and therefore really checks reachability. Since then, JikesRVM has had both isLive and isReachable. Both of them were ported to the Rust MMTk we have today, and they behave just like in JikesRVM.

This behavior contradicts the definitions of "live" and "reachable". Objects in the ImmortalSpace can become both unused by the mutator and unreachable from roots. Such objects are not "live" in either sense, yet is_live() still returns true.
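To make the difference concrete, here is a minimal toy model of the two predicates. It is only a sketch: `SpaceKind`, `ToyObject` and the `marked` field are hypothetical names for illustration and are not MMTk's actual data structures.

```rust
/// Hypothetical space tag for a toy object; not an MMTk type.
#[derive(Clone, Copy, PartialEq, Debug)]
enum SpaceKind {
    Immortal,
    Other,
}

/// Hypothetical per-object GC metadata.
#[derive(Clone, Copy, Debug)]
struct ToyObject {
    space: SpaceKind,
    /// Set by the tracer when the object is reached in the current GC.
    marked: bool,
}

impl ToyObject {
    /// Mirrors the behavior described above: objects in the immortal
    /// space are always reported as live, whether or not they were traced.
    fn is_live(&self) -> bool {
        match self.space {
            SpaceKind::Immortal => true,
            SpaceKind::Other => self.marked,
        }
    }

    /// Checks the mark bit regardless of which space the object is in.
    fn is_reachable(&self) -> bool {
        self.marked
    }
}
```

For an immortal object that was not reached in the current GC (`marked == false`), `is_live()` in this model returns true while `is_reachable()` returns false. That gap is the subject of this issue.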

Consequences

If an object is not traced during tracing, it will not be scanned, and its fields will not be forwarded. If the GC is a copying GC (such as SemiSpace), the object will contain dangling references. This is OK if the object dies (in which case the VM will never touch the object again). But since ImmortalSpace erroneously considers such objects "live", there will be consequences.
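The following toy sketch illustrates how an untraced object ends up with dangling references in a copying GC. Everything here (`Addr`, `forward_fields`, the map-based heap) is a hypothetical simplification for illustration, not how MMTk represents objects or forwarding.

```rust
use std::collections::HashMap;

/// Hypothetical toy address type.
type Addr = usize;

/// Update the fields of every *traced* object using the forwarding table
/// produced by a toy copying GC. Objects that were not traced are never
/// visited, so their fields keep pointing at the old locations.
fn forward_fields(
    objects: &mut HashMap<Addr, Vec<Addr>>,
    traced: &[Addr],
    forwarding: &HashMap<Addr, Addr>,
) {
    for addr in traced {
        if let Some(fields) = objects.get_mut(addr) {
            for field in fields.iter_mut() {
                // Only fields that refer to moved objects need updating.
                if let Some(new_addr) = forwarding.get(field) {
                    *field = *new_addr;
                }
            }
        }
    }
}
```

If two objects both refer to an object that moved, and only one of them is traced, the traced one sees the new address while the untraced one still holds the stale address.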

Weak reference processing

In the built-in ReferenceProcessor and FinalizableProcessor, is_live() is used to test if a weak reference or a finalizable reference is reached by stronger references.

Suppose in a JVM there is a WeakReference named a which refers to an object b in the ImmortalSpace. In a GC, a is traced, but b is not traced, and b contains references to objects that have been moved. In the WeakRefClosure stage, ReferenceProcessor will inspect a. is_live() will show that b is "live", and it will retain the weak reference from a to b. After GC, the mutator can call a.get() and upgrade the weak reference to a strong reference. Now the strong reference points to an object b that contains dangling references. When the mutator attempts to follow those references, it will crash.

The same problem arises if the VM binding uses Scanning::process_weak_ref and calls is_live to test whether the referent is live.

The is_reachable method is unaffected because it actually checks the mark bit. Unreachable objects will be unmarked, and is_reachable will return false.
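The scenario can be sketched with a toy weak reference table. This is not the built-in ReferenceProcessor; `Referent`, `process_toy_weak_refs` and the two free-standing predicates are hypothetical stand-ins that only model the decision "retain or clear".

```rust
/// Hypothetical toy referent: just the two bits that matter here.
struct Referent {
    immortal: bool,
    marked: bool,
}

/// Models the is_live behavior described in this issue.
fn is_live(o: &Referent) -> bool {
    o.immortal || o.marked
}

/// Models is_reachable: only the mark bit counts.
fn is_reachable(o: &Referent) -> bool {
    o.marked
}

/// Toy weak reference processing: each slot holds an index into `heap`.
/// A slot is cleared when `keep` rejects its referent.
fn process_toy_weak_refs(
    heap: &[Referent],
    weak_refs: &mut [Option<usize>],
    keep: impl Fn(&Referent) -> bool,
) {
    for slot in weak_refs.iter_mut() {
        if let Some(referent) = *slot {
            if !keep(&heap[referent]) {
                *slot = None; // clear the weak reference
            }
        }
    }
}
```

With an unmarked immortal referent (like b above), passing `is_live` retains the weak reference, while passing `is_reachable` clears it, so the mutator can no longer obtain a strong reference to the untraced object.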

VO bits

This is not related to the is_live method, but the VO bits in ImmortalSpace are never cleared, as if the objects never die. This mainly impacts conservative stack scanning. There are several possible solutions, which I have elaborated on in #1274.

What should we do?

We should just remove is_live. It's simply wrong.

We should use is_reachable in places where is_live is used. Particularly, is_reachable should be used when processing weak references.

And we should clarify that is_reachable does not compute true root-reachability because that's too expensive. It returns true if the current plan/space considers the object reachable at the time it is called. It will return true if the object is marked and/or forwarded, and it is the same for ImmortalSpace. It will also return true for objects in the mature space and objects in the nursery that are reachable from the remembered set.
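One way to read these semantics is sketched below. It is an illustration under assumptions, not MMTk's implementation: `ObjectState` and `GcKind` are hypothetical, and the sketch assumes that mature objects are treated as reachable specifically during a nursery GC, when they are not traced.

```rust
/// Hypothetical per-object state as seen by a toy plan.
#[derive(Clone, Copy)]
struct ObjectState {
    marked: bool,
    forwarded: bool,
    in_mature_space: bool,
}

/// Hypothetical: the kind of collection currently in progress.
#[derive(Clone, Copy, PartialEq)]
enum GcKind {
    Nursery,
    Full,
}

/// Plan-level reachability: the answer reflects what the plan knows at the
/// time of the call, not true root-reachability.
fn is_reachable(state: ObjectState, gc: GcKind) -> bool {
    // Assumption: a nursery GC does not trace the mature space, so mature
    // objects are conservatively considered reachable.
    if gc == GcKind::Nursery && state.in_mature_space {
        return true;
    }
    // Otherwise the object must have been reached by the trace, which
    // leaves it marked and/or forwarded. Nursery objects reachable from
    // the remembered set fall into this case as well.
    state.marked || state.forwarded
}
```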

Related issues

VO bits and Immortal spaces: #1274

I previously considered renaming is_reachable and is_live in #1271. But after discussion, I think we don't need to rename is_reachable. We just need to clarify its semantics. More discussion can be found in the comments of #1271.

@wks
Collaborator Author

wks commented Feb 19, 2025

In today's meeting, @steveblackburn suggested we should not just use the definition of "live" in the GC Handbook, but instead make our own definition. One possible definition is:

An object is "live" if one of the following is true:

  1. It is directly referenced by a root.
  2. It is conservatively considered "live" by the plan or policy. Particularly, generational plans consider all mature objects as "live", and the ImmortalSpace considers all objects in it as "live".
  3. It is referenced by another live object.

A live object is not necessarily root-reachable, and may not necessarily be used by the mutator. This gives the plans and policies a certain degree of freedom to define the liveness.
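Read as a fixpoint, this definition can be sketched as a worklist closure seeded by both the roots and the conservatively live objects. The index-based heap and the `live_set` function are hypothetical and only illustrate the three rules.

```rust
use std::collections::HashSet;

/// Compute the live set of a toy heap. `edges[i]` lists the objects that
/// object `i` references.
fn live_set(
    edges: &[Vec<usize>],
    roots: &[usize],
    conservatively_live: &[usize],
) -> HashSet<usize> {
    let mut live = HashSet::new();
    // Rules 1 and 2: roots and conservatively live objects seed the set.
    let mut worklist: Vec<usize> = roots
        .iter()
        .chain(conservatively_live)
        .copied()
        .collect();
    // Rule 3: anything referenced by a live object is live.
    while let Some(obj) = worklist.pop() {
        if live.insert(obj) {
            worklist.extend(edges[obj].iter().copied());
        }
    }
    live
}
```

Note that under this definition an object referenced only by an immortal object is live even if it is not root-reachable, which is why the invariant about valid references matters.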

An invariant must be maintained that "live objects must only contain valid references (i.e. no dangling references)". Otherwise, the problem described here (a WeakReference pointing to an immortal object that contains dangling references) and in #1274 (an immortal object that contains dangling references is picked up by the conservative stack scanner due to the presence of the VO bit) will manifest and result in a crash. Generational GCs maintain this invariant using the remembered set. ImmortalSpace needs to maintain this invariant, too. We will discuss this in #1274.
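As a rough illustration, the invariant can be phrased as a check over a toy heap: no live object may hold an edge to an address that does not contain a valid object. `dangling_edges` and its inputs are hypothetical; a real checker would need to walk actual object fields.

```rust
use std::collections::HashSet;

/// Return every (holder, target) edge in which a live holder refers to a
/// target that is not a valid object. The invariant holds if and only if
/// the result is empty.
fn dangling_edges(
    edges: &[(usize, usize)],
    live: &HashSet<usize>,
    valid: &HashSet<usize>,
) -> Vec<(usize, usize)> {
    edges
        .iter()
        .copied()
        .filter(|(holder, target)| live.contains(holder) && !valid.contains(target))
        .collect()
}
```

Edges held by dead objects are ignored, matching the observation that dangling references are harmless as long as the holder is never touched again.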
