Skip to content

Commit cb1158e

Browse files
authored
admin fate improvements, LockIDs use for fate stores improved/fixed (apache#5028)
* `admin fate` improvements, `LockID`s use for fate stores improved/fixed This commit makes several improvements/fixes: improvements to the `admin fate` util which were made possible from apache#4524, replaced incorrect use of `createDummyLockID()` in real code (now only used in tests), `<User/Meta>FateStore` now support a null lock id if they will be used as read-only stores: write ops will fail on a store with a null lock, and some other misc. changes. Full list of changes: - Removed the check for a dead Manager in the Admin fate util (AdminUtil) which was checked before `admin fate delete <tx>` or `admin fate fail <tx>` was able to run. This check is no longer needed with the changes from apache#4524. apache#4524 moved reservations out of Manager memory into the FATE table (for UserFateStore) and into ZK (for MetaFateStore). Prior to this, the Admin process would have no way of knowing if the Manager process had a transaction reserved, so the Manager had to be shutdown to ensure it was not. But now that reservations are visible to any process, we can try to reserve the transaction in Admin, and if it cannot be reserved and deleted/failed in a reasonable time, we let the user know that the Manager would need to be shutdown if deleting/failing the transaction is still desired. - This has several benefits: - It is one less thing to worry about when implementing multiple managers in the future since Admin assumes only one Manager for these commands. However, there is still the case where the Manager may keep a transaction reserved for a long period of time and the Admin can never reserve it. In this case, we inform the user that the transaction could not be deleted/failed and that if deleting/failing is still desired, the Manager may need to be shutdown. - It covers a potential issue in the previously existing code where there was nothing stopping or ensuring a Manager is not started after the check is already performed in Admin but before the delete/ fail was executed. - It also should make the commands easier to use now since the Manager is not required to be shutdown before use. - Changes and adds some tests for `admin fate fail` and `admin fate delete`: ensures the Manager is not required to be down to fail/delete a transaction, and ensures that if the Manager does have a transaction reserved, admin will fail to reserve and fail/delete the transaction. - Another change which was needed as a prerequisite for the above changes was creating a ZK lock for Admin so transactions can be properly reserved by the command. Added new constant `Constants.ZADMIN_LOCK = "/admin/lock"`, changed `ServiceLockPaths`, and added `Admin.createAdminLock()` to support this - New class `TestLock` (in test package) which is used by tests to create a real ZK lock, or a dummy one. Removed `createDummyLockID()` from `AbstractFateStore` (moved to TestLock), and `createDummyLock()` is now only used in test code. Added new constant `ZTEST_LOCK = "/test/lock"`, changed `ServiceLockPaths`, and added `createTestLock()` which is used to create a real lock id (one held in ZK) which is needed for some tests. - This fixes an unexpected failure that could have occurred for `ExternalCompaction_1_IT`. Was using a dummy lock for the store before and the fate data was being stored in the same locations that the Manager uses for it's fate data. The DeadReservationCleaner running in Manager would have cleaned up reservations created using this store if it ran when reservations were present. Now the test creates a real ZK lock so the DeadReservationCleaner won't clean these up unexpectedly. - Stores now support a null lock id for the situation where they will be used as read-only stores. A store with a null lock id will fail on write ops. Changed all existing uses of stores to only have a lock id if writes will occur (previously, all instances of the stores had a lock id). - Removed unused or unneccesary constructors for AbstractFateStore, MetaFateStore, UserFateStore - Ensured all tests changed, all FATE tests, and sunny day tests still pass closes apache#4904
1 parent 0add23e commit cb1158e

31 files changed

+725
-319
lines changed

core/src/main/java/org/apache/accumulo/core/Constants.java

+2
Original file line numberDiff line numberDiff line change
@@ -88,6 +88,8 @@ public class Constants {
8888

8989
public static final String ZTABLE_LOCKS = "/table_locks";
9090
public static final String ZMINI_LOCK = "/mini";
91+
public static final String ZADMIN_LOCK = "/admin/lock";
92+
public static final String ZTEST_LOCK = "/test/lock";
9193

9294
public static final String BULK_PREFIX = "b-";
9395
public static final String BULK_RENAME_FILE = "renames.json";

core/src/main/java/org/apache/accumulo/core/fate/AbstractFateStore.java

+5-17
Original file line numberDiff line numberDiff line change
@@ -101,10 +101,6 @@ public FateId newRandomId(FateInstanceType instanceType) {
101101
// Keeps track of the number of concurrent callers to waitForStatusChange()
102102
private final AtomicInteger concurrentStatusChangeCallers = new AtomicInteger(0);
103103

104-
public AbstractFateStore() {
105-
this(createDummyLockID(), null, DEFAULT_MAX_DEFERRED, DEFAULT_FATE_ID_GENERATOR);
106-
}
107-
108104
public AbstractFateStore(ZooUtil.LockID lockID, Predicate<ZooUtil.LockID> isLockHeld) {
109105
this(lockID, isLockHeld, DEFAULT_MAX_DEFERRED, DEFAULT_FATE_ID_GENERATOR);
110106
}
@@ -114,9 +110,7 @@ public AbstractFateStore(ZooUtil.LockID lockID, Predicate<ZooUtil.LockID> isLock
114110
this.maxDeferred = maxDeferred;
115111
this.fateIdGenerator = Objects.requireNonNull(fateIdGenerator);
116112
this.deferred = Collections.synchronizedMap(new HashMap<>());
117-
this.lockID = Objects.requireNonNull(lockID);
118-
// If the store is used for a Fate which runs a dead reservation cleaner,
119-
// this should be non-null, otherwise null is fine
113+
this.lockID = lockID;
120114
this.isLockHeld = isLockHeld;
121115
}
122116

@@ -291,6 +285,10 @@ protected void verifyFateKey(FateId fateId, Optional<FateKey> fateKeySeen,
291285
"Collision detected for fate id " + fateId);
292286
}
293287

288+
protected void verifyLock(ZooUtil.LockID lockID, FateId fateId) {
289+
Preconditions.checkState(lockID != null, "Tried to reserve " + fateId + " with null lockID");
290+
}
291+
294292
protected abstract Stream<FateIdStatus> getTransactions(EnumSet<TStatus> statuses);
295293

296294
protected abstract TStatus _getStatus(FateId fateId);
@@ -450,14 +448,4 @@ protected static Serializable deserializeTxInfo(TxInfo txInfo, byte[] data) {
450448
throw new IllegalStateException("Bad node data " + txInfo);
451449
}
452450
}
453-
454-
/**
455-
* this is a temporary method used to create a dummy lock when using a FateStore outside the
456-
* context of a Manager (one example is testing) so reservations can still be made.
457-
*
458-
* @return a dummy {@link ZooUtil.LockID}
459-
*/
460-
public static ZooUtil.LockID createDummyLockID() {
461-
return new ZooUtil.LockID("/path", "node", 123);
462-
}
463451
}

0 commit comments

Comments
 (0)