-
Notifications
You must be signed in to change notification settings - Fork 455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implements interuption of compactions and adds an IT to verify compaction cancels on tablet migration #5395
base: 2.1
Are you sure you want to change the base?
Conversation
After running this test, see the following in the logs.
|
is.addOption(SLEEP_UNINTERRUPTIBLY, Boolean.toString(b)); | ||
} | ||
|
||
private void sleep(long time) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked these changes by jstacking a tablet server running the new test and verifying that compaction threads were sleeping both ways.
.map(ac -> ac.getTablet().getTable()) | ||
.filter(tid -> tid.equals(tableId1) || tid.equals(tableId2)).count(); | ||
log.debug("Running compactions {}", runningCompactions); | ||
return runningCompactions == 40; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test usually passes for me, every once in a while it will get stuck here w/ less than the expected 40 compactions. Probably some issue w/ the propagation of compaction config changes.
// Limit interrupting compactions to the part that reads and writes files and avoid | ||
// interrupting the metadata table updates. | ||
try (var ignored = new CompactionInterrupter()) { | ||
stats = CompactableUtils.compact(tablet, job, cInfo, compactEnv, compactFiles, tmpFileName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to understand how this PR is different than what is happening inside CompactableUtils.compact at:
accumulo/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/CompactableUtils.java
Lines 573 to 579 in 2950699
final Runnable compactionCancellerTask = () -> { | |
if (!cenv.isCompactionEnabled()) { | |
compactor.interrupt(); | |
} | |
}; | |
final ScheduledFuture<?> future = tablet.getContext().getScheduledExecutor() | |
.scheduleWithFixedDelay(compactionCancellerTask, 10, 10, TimeUnit.SECONDS); |
When CompactableImpl.close
is called and sets closed=true
, then cenv.isCompactionEnabled()
should return false
and interrupt the FileCompactor. I'm thinking the difference here is checking the Thread.interrupted
state and throwing an InterruptedException
in CompactionInterrupter.close
. I'm wondering if the code below is functionally equivalent:
try {
stats = CompactableUtils.compact(tablet, job, cInfo, compactEnv, compactFiles, tmpFileName);
} finally {
if (Thread.interrupted()) {
throw new InterruptedException();
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main difference is that compactor.interrupt() was not calling Thread.interrupt(). However it could call that, it was already tracking the thread running the compaction. Modified it do that in 7d39c8d moving where Thread.interrupt() is called down a level (this will probably benefit external compactions also). The test still pass w/ this change, however they were a bit slower because the interrupting the thread could take up 10 seconds after calling close(). So changed the scheduled timer to run every 3 seconds. The previous code would immediately interrupt the thread when close() was called.
I'm wondering if we should change: accumulo/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/CompactableImpl.java Lines 1199 to 1201 in 2950699
to:
|
For the change to isCompactionEnabled() the way the code is working w/ the latest that changes it would probably not help to check for interrupted there. The interrupt stats will be set on the compaction thread when is compaction isCompactionEnabled() returns false. Also another thead besides the compaction thread could call isCompactionEnabled(), so could not use Thread.currentThread() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good.
Changed this from adding a test to adding a test and adding functionality. Added the ability to interrupt compaction threads on tablet close. Added an IT that covers the two ways compactions can be interrupted on compaction close.