-
Notifications
You must be signed in to change notification settings - Fork 455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable auto merging of tablets #5353
base: main
Are you sure you want to change the base?
Conversation
This change adds support for periodically scanning tables to find ranges of tables that can be automatically merged. The thread that runs uses the recently added TabletMergeabilty column as well as total file counts/sizes to compute if a range can be merged. By default, tablets can be merged if the total size is less than 25% of the split threshold so we don't immediately merge only to split again. There is a new thread in the Manager class that performs the computation and will submit fate jobs for merging on tablet ranges. There is a new fate operation during merge that will validate the range is still ok to merge if the merge was submitted as a system merge. This closes apache#5014
0e8e73b
to
c089d87
Compare
@keith-turner and @dlmarion - I gave details in the description but even though this is marked as a draft it's still ready for review and feedback. |
Looks like there's a couple tests to fix which I can do when I get a chance, they are boken cause the PR changes the default tablet mergeability setting to always instead of never as mentioned in the description.
Edit: Fixed tests in fc0dbec |
core/src/main/java/org/apache/accumulo/core/client/admin/InstanceOperations.java
Outdated
Show resolved
Hide resolved
@@ -1292,6 +1293,13 @@ boolean canSuspendTablets() { | |||
ThreadPools.watchCriticalScheduledTask(context.getScheduledExecutor() | |||
.scheduleWithFixedDelay(() -> ScanServerMetadataEntries.clean(context), 10, 10, MINUTES)); | |||
|
|||
// TODO - create new threadpool? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that this cannot go into the TabletGroupWatcher because it needs to look at more than one Tablet? I think using the general scheduled executor threadpool should be fine. However, we may want to consider bumping the default value of GENERAL_THREADPOOL_SIZE
from 1
to something larger. Probably good for a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, TGW doesn't work in this case because because the task needs to look at multiple tablets together and not just one tablet at a time. Also, the task likely doesn't need to run very often to look for mergeable ranges where as TGW runs frequently
server/manager/src/main/java/org/apache/accumulo/manager/merge/FindMergeableRangeTask.java
Outdated
Show resolved
Hide resolved
server/manager/src/main/java/org/apache/accumulo/manager/merge/FindMergeableRangeTask.java
Show resolved
Hide resolved
Is there work to do in the upgrade code, or has that already been done? |
There is still upgrade work to do, we should go through and mark all existing tablets as "never" merge on upgrade. If I recall the default will be never if the column is absent but I think it's better to be explicit and mark the tablets. |
server/manager/src/main/java/org/apache/accumulo/manager/merge/FindMergeableRangeTask.java
Outdated
Show resolved
Hide resolved
long totalFiles = 0; | ||
int totalUnMergeable = 0; | ||
|
||
for (var tabletMetadata : tablets) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I first saw this code I though maybe it could use the MergeableRange code and try to add all tablets in the range to that. However that would not give the reason that a tablet could not be added, so it almost works but not quite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the same thing, I have/had been trying to figure out a good way to share the same logic between both but it's just different enough there wasn't really a good way to share the code but i'll keep thinking about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if MergeableRange.add() returned an enum instead of a boolean if that could work. The enum would indicate if added or if not added then why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored things to share MergeableRange between the task and verify in 3c84e85
core/src/main/java/org/apache/accumulo/core/client/admin/InstanceOperations.java
Outdated
Show resolved
Hide resolved
server/manager/src/main/java/org/apache/accumulo/manager/merge/FindMergeableRangeTask.java
Show resolved
Hide resolved
test/src/main/java/org/apache/accumulo/test/functional/TabletMergeabilityIT.java
Outdated
Show resolved
Hide resolved
|
||
manager.fate(type).seedTransaction(FateOperation.SYSTEM_MERGE, fateId, | ||
manager.fate(type).seedTransaction(FateOperation.SYSTEM_MERGE, fateKey, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will wait for the data to be written to the walog of the fate table for each operation seeded. W/ the changes in #5160 this is another place where waiting on the walog per operation can be avoided.
@@ -683,8 +689,12 @@ protected void testListFateKeys(FateStore<TestEnv> store, ServerContext sctx) th | |||
var fateKey3 = FateKey.forCompactionCommit(cid1); | |||
var fateKey4 = FateKey.forCompactionCommit(cid2); | |||
|
|||
// use one overlapping extent and one different | |||
var fateKey5 = FateKey.forMerge(extent1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a change for this PR. Was looking around to see if FateKey had any unit test that should be updated and did not see any. Could be useful to add some unit test to check equals, hashcode, serder, etc in another PR.
test/src/main/java/org/apache/accumulo/test/functional/TabletMergeabilityIT.java
Show resolved
Hide resolved
I ran the full IT build and it passed so this is ready for another review. |
This change adds support for periodically scanning tables to find ranges of tables that can be automatically merged. The thread that runs uses the recently added TabletMergeabilty column as well as total file counts/sizes to compute if a range can be merged. By default, tablets can be merged if the total size is less than 25% of the split threshold so we don't immediately merge only to split again.
There is a new thread in the Manager class that performs the computation and will submit fate jobs for merging on tablet ranges. There is a new fate operation during merge that will validate the range is still ok to merge if the merge was submitted as a system merge.
This closes #5014
I am making this a draft because while it's feature complete, I still think there's some more work to do on the tests and tweaks to make, but wanted to get some feedback/suggestions.
Outstanding issues/note/questions:
never
merge, but I think we should change this to always so i switched the default tablet to be a mergeability ofalways
in this PR. The reason is we can split the default tablet, so we should be able to merge other tablets back to it. It will never go away because the end row is null. One reason why I did this is there is no way in the API right now to change the mergeability for the default tablet (end row is null which you can't put into a map as a key) so I figure we should just make it always mergeable and any other splits would not be mergeable by default for user created tablets. This would be easy to set back to never by default if desired, but if keep the default tablet as never then we can never merge back to a single tablet again.never
. Should we change this? I am thinking I probably want to switch the default tablet to be mergeable just like I did for user tables for consistency. However, pre-splitting is ever done the tablets would need to be made sure to be marked as never merge so they don't get merged away due to a size of 0.FindMergeableRangeTask
? What interval by default? For now I put in a todo marker and just used the default context scheduler and set a run time of every 24 hours.TabletMergeabilityIT
especially only has 4 tests for now to demonstrate but there's a lot more tests we could write to verify the checks work like max files, sizes, etc work and we get the exepcted merges done.