Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shutdown should cancel Tablet internal major compactions #5393

Open
dlmarion opened this issue Mar 10, 2025 · 6 comments
Open

Shutdown should cancel Tablet internal major compactions #5393

dlmarion opened this issue Mar 10, 2025 · 6 comments
Labels
bug This issue has been verified to be a bug.
Milestone

Comments

@dlmarion
Copy link
Contributor

Is your feature request related to a problem? Please describe.
When a user shuts down the cluster it sets the Manager goal state
to CLEAN_STOP, which will set the Manager to SAFE_MODE, which
will set user table tablets goal state to UNASSIGNED. The Manager
will then start to unload tablets for the user tables by calling the
UnloadTabletHandler on the TabletServer, which in turn calls
Tablet.close, which calls CompactableImpl.close. The following
code block will wait for all internal major compactions to
complete.

// Wait while internal jobs are running or external compactions are committing. When
// chopStatus is MARKING or selectStatus is SELECTING, there may be metadata table writes so
// wait on those. Do not wait on external compactions that are running.
Predicate<CompactionJob> jobsToWaitFor =
job -> !((CompactionExecutorIdImpl) job.getExecutor()).isExternalId();
while (runningJobs.stream().anyMatch(jobsToWaitFor)
|| !externalCompactionsCommitting.isEmpty()
|| fileMgr.chopStatus == ChopSelectionStatus.MARKING
|| fileMgr.selectStatus == FileSelectionStatus.SELECTING) {
log.debug(
"Closing {} is waiting on {} running compactions, {} committing external compactions, chop marking {}, file selection {}",
getExtent(), runningJobs.stream().filter(jobsToWaitFor).count(),
externalCompactionsCommitting.size(), fileMgr.chopStatus == ChopSelectionStatus.MARKING,
fileMgr.selectStatus == FileSelectionStatus.SELECTING);
try {
wait(50);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
throw new RuntimeException(e);
}
}

In the case of a tablet unload due to migration, waiting for the
major compactions to complete might make sense as you may
not want to lose the time spent on the compaction. When shutting
down the system, you might not care about that, and just need
to shut the system down now.

Describe the solution you'd like
When shutting down, don't wait for in-progress major compactions
to complete. Cancel them instead.

Describe alternatives you've considered
The only option when in this state and needing to shutdown the
cluster immediately is to kill the processes and incur the cost
of recovery.

@dlmarion dlmarion added the enhancement This issue describes a new feature, improvement, or optimization. label Mar 10, 2025
@dlmarion dlmarion added this to the 2.1.4 milestone Mar 10, 2025
@dlmarion
Copy link
Contributor Author

More information: CompactableImp.close sets closed to true before the code block referenced in the description. closed is used in the CompactionCheck object and is used to determine if any in-progress major compactions should abort processing. This should have occurred, but it was noticed that compactions were progressing and the queue was increasing for a tablet that was supposed to be unloading.

@dlmarion dlmarion added bug This issue has been verified to be a bug. and removed enhancement This issue describes a new feature, improvement, or optimization. labels Mar 10, 2025
@keith-turner
Copy link
Contributor

Looked to see if we had any test for this and did not find any, so opened #5395. The basic functionality does seem to work.

Looking through the code, I only saw the check to cancel a compaction done after the iterator used to read input files returned a key/value. If that iterator is filtering and never returns anything, then that is one possible thing that could hold it up. Maybe the compaction thread also needs to be interrupted, we made some changes like that for the scan threads where they set an atomic boolean to cancel and try to interrupt the thread.

@keith-turner
Copy link
Contributor

I modified the test in #5395 to sleep for 300 seconds instead of 3 seconds in the slow iterator to simulate an iterator filtering and not returning data. With that change the test times out and fails.

@dlmarion
Copy link
Contributor Author

Looking through the code, I only saw the check to cancel a compaction done after the iterator used to read input files returned a key/value. If that iterator is filtering and never returns anything, then that is one possible thing that could hold it up. Maybe the compaction thread also needs to be interrupted, we made some changes like that for the scan threads where they set an atomic boolean to cancel and try to interrupt the thread.

Looking at the code below, the CompactionEnv is passed to the FileCompactor and checked as you said above. But there is also a background thread that will interrupt the FileCompactor if the compaction is no longer enabled.

/**
* Create the FileCompactor and finally call compact. Returns the Major CompactionStats.
*/
static CompactionStats compact(Tablet tablet, CompactionJob job,
CompactableImpl.CompactionInfo cInfo, CompactionEnv cenv,
Map<StoredTabletFile,DataFileValue> compactFiles, TabletFile tmpFileName)
throws IOException, CompactionCanceledException {
TableConfiguration tableConf = tablet.getTableConfiguration();
AccumuloConfiguration compactionConfig = getCompactionConfig(tableConf,
getOverrides(job.getKind(), tablet, cInfo.localHelper, job.getFiles()));
final FileCompactor compactor = new FileCompactor(tablet.getContext(), tablet.getExtent(),
compactFiles, tmpFileName, cInfo.propagateDeletes, cenv, cInfo.iters, compactionConfig,
tableConf.getCryptoService());
final Runnable compactionCancellerTask = () -> {
if (!cenv.isCompactionEnabled()) {
compactor.interrupt();
}
};
final ScheduledFuture<?> future = tablet.getContext().getScheduledExecutor()
.scheduleWithFixedDelay(compactionCancellerTask, 10, 10, TimeUnit.SECONDS);
try {
return compactor.call();
} finally {
future.cancel(true);
}
}

I modified the test in #5395 to sleep for 300 seconds instead of 3 seconds in the slow iterator to simulate an iterator filtering and not returning data. With that change the test times out and fails.

I'm wondering if you should set the Manager goal state to SAFE_MODE in the IT to simulate a shutdown and unloading of user tablets. Also, I'm seeing in the SlowIterator that it's swallowing the InterruptedException, but setting the Thread's interrupted state, then calling next or seek. We are assuming that some other code is checking the Thread state, but it doesn't look like it's happening in this case. FileCompactor.call handles an IterationInterruptedException, maybe we should be throwing it in more places. From what I can tell looking at TabletIteratorEnv.getTopLevelIterator, the MultiIterator is the top-level iterator and it's not checking the Thread interrupt state.

@keith-turner
Copy link
Contributor

I tried modifying the test in #5395 to make the SlowIterator not eat the interrupted exception and to sleep for 300s. The test would still hang. Looking at the code not seeing anything that interrupts the compaction thread. The following code does the compaction on 1344 and then updates the metadata table and tablet in memory data on 1346.

stats = CompactableUtils.compact(tablet, job, cInfo, compactEnv, compactFiles, tmpFileName);
newFile = CompactableUtils.bringOnline(tablet.getDatafileManager(), cInfo, stats,

Seems like the ideal thing would be to only interrupt the thread when its in line 1344. Have been looking into ways to do that. Found that interrupting the thread when the code is in line 1346 may have some undesirable effects based on seeing #3750 while looking into this.

@keith-turner
Copy link
Contributor

I added code in #5395 to interrupt compactions threads and updated the new IT there to test this case.

If we wanted to get aggressive w/ troublesome compaction threads we could introduce the concept of zombie compaction threads where close will eventually abandon a compaction thread. However it will only do so if it can atomically confirm the thread is not making a metadata update and ensure the thread can never make a metadata update. Could probably make all of these changes in CompactableImpl. In general would probably want metrics for zombie compaction threads like the zombie scan threads metrics, so those changes would end up being outside of CompactableImpl.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue has been verified to be a bug.
Projects
None yet
Development

No branches or pull requests

2 participants