admin check run command functionality #4892

Open
6 of 10 tasks
kevinrr888 opened this issue Sep 16, 2024 · 0 comments · May be fixed by #5348

kevinrr888 commented Sep 16, 2024

Is your feature request related to a problem? Please describe.
#4807 added a new admin command, `admin check run`, which runs various checks for problems in Accumulo. The checks don't do anything yet; the functionality for each of them still needs to be added.

Describe the solution you'd like

  • Functionality for SYSTEM_CONFIG check
    • Check ZooKeeper locks for Accumulo server processes
    • Check ZooKeeper nodes for tables
    • Check the WAL metadata in ZooKeeper
  • Create and add functionality for a SERVER_CONFIG check
    • Check that all configured properties are valid (valid key and value)
    • Check that some expected required properties are present in the config (default value or user set)
  • Create and add functionality for a TABLE_LOCKS check (completed in [1])
    • Ensures that table and namespace locks are valid and are associated with a FATE op
  • Functionality for ROOT_METADATA check (completed in [1])
    • offline tablets
    • missing "columns"
    • invalid "columns"
  • Functionality for ROOT_TABLE check (completed in [1])
    • offline tablets
    • tablets for metadata table have no holes, valid (null) prev end row for first tablet, and valid (null) end row for last tablet
    • missing columns
    • invalid columns
  • Functionality for METADATA_TABLE check (completed in [1])
    • offline tablets
    • tablets for user tables (and scanref) have no holes, valid (null) prev end row for first tablet, and valid (null) end row for last tablet
    • missing columns
    • invalid columns
  • Functionality for SYSTEM_FILES check (completed in [1])
    • missing system files
  • Functionality for USER_FILES check (completed in [1])
    • missing user files
  • Existing checks currently done through other commands should be moved under the appropriate new check command.
    • Check for dangling fate locks (printed as info from accumulo admin fate print) (completed in [1])
    • accumulo admin checkTablets (completed in [1])
    • accumulo check-server-config
    • Note that accumulo check-compaction-config and accumulo check-accumulo-properties should not be moved since they just check the validity of a provided file, and do not operate on a running instance
  • Expand on AdminCheckIT as functionality is added. To complete this sub task, AdminCheckIT needs to check passing and failing cases for all the checks.

These should probably be completed over several PRs.

The above list is subject to change.
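
To make the TABLE_LOCKS item more concrete, here is a rough sketch of the two conditions it describes: every locked table/namespace id exists, and every lock is held by a known FATE op. This is only an illustration. The class name, the method signature, and the use of plain strings and in-memory collections are all assumptions made for the example; a real check would read this information from ZooKeeper and the FATE store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch only; not the implementation from any PR. */
final class TableLocksCheckSketch {

  /**
   * @param existingIds   ids of all tables and namespaces that currently exist
   * @param lockHolders   locked table/namespace id -> ids of the FATE ops holding a lock on it
   * @param activeFateIds ids of all FATE ops currently known to the system
   * @return one message per problem found; an empty list means the check passed
   */
  static List<String> findProblems(Set<String> existingIds,
      Map<String, Set<String>> lockHolders, Set<String> activeFateIds) {
    List<String> problems = new ArrayList<>();
    for (Map.Entry<String, Set<String>> entry : lockHolders.entrySet()) {
      String lockedId = entry.getKey();
      // Condition 1: the locked table/namespace must exist.
      if (!existingIds.contains(lockedId)) {
        problems.add("lock held on unknown table/namespace id " + lockedId);
      }
      // Condition 2: each lock must be associated with a known FATE op.
      for (String fateId : entry.getValue()) {
        if (!activeFateIds.contains(fateId)) {
          problems.add("lock on " + lockedId + " is held by " + fateId
              + ", which is not a known FATE op");
        }
      }
    }
    return problems;
  }
}
```

Returning a list of messages rather than a boolean lets the check report every dangling lock in one run, which is probably what an admin wants to see.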

[1] #4957
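
For the "no holes" items under ROOT_TABLE and METADATA_TABLE, the idea can be sketched as a single pass over one table's tablets in sorted order: the first tablet must have a null prev end row, the last must have a null end row, and each tablet's prev end row must equal the previous tablet's end row. The sketch below is an assumption-laden illustration, not the code from #4957: `TabletExtentSketch` is a made-up stand-in type, and rows are modeled as `String` for brevity even though Accumulo rows are byte sequences.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical sketch only; not the implementation from #4957. */
final class TabletRangeCheckSketch {

  /** Simplified stand-in for a tablet covering the row range (prevEndRow, endRow]. */
  static final class TabletExtentSketch {
    final String prevEndRow; // null means the tablet starts at negative infinity
    final String endRow;     // null means the tablet ends at positive infinity

    TabletExtentSketch(String prevEndRow, String endRow) {
      this.prevEndRow = prevEndRow;
      this.endRow = endRow;
    }
  }

  /**
   * @param tablets the tablets of ONE table, already sorted by row range
   * @return one message per problem found; an empty list means the check passed
   */
  static List<String> findProblems(List<TabletExtentSketch> tablets) {
    List<String> problems = new ArrayList<>();
    if (tablets.isEmpty()) {
      problems.add("table has no tablets");
      return problems;
    }
    TabletExtentSketch first = tablets.get(0);
    TabletExtentSketch last = tablets.get(tablets.size() - 1);
    if (first.prevEndRow != null) {
      problems.add("first tablet has non-null prev end row " + first.prevEndRow);
    }
    if (last.endRow != null) {
      problems.add("last tablet has non-null end row " + last.endRow);
    }
    for (int i = 1; i < tablets.size(); i++) {
      String expected = tablets.get(i - 1).endRow;
      String actual = tablets.get(i).prevEndRow;
      // A null end row before the last tablet, or a mismatch between adjacent
      // tablets, means the row space is not covered exactly once.
      if (expected == null || !Objects.equals(expected, actual)) {
        problems.add("hole or overlap between tablet " + (i - 1) + " and tablet " + i
            + ": end row " + expected + " vs prev end row " + actual);
      }
    }
    return problems;
  }
}
```

A table with a single tablet passes only if both its prev end row and end row are null, which matches the "valid (null)" wording in the list above.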

Additional context
#4807 - added the check command
#4687 - detailed info about what should be checked

@kevinrr888 kevinrr888 added blocker This issue blocks any release version labeled on it. enhancement This issue describes a new feature, improvement, or optimization. labels Sep 16, 2024
@kevinrr888 kevinrr888 added this to the 3.1.0 milestone Sep 16, 2024
@kevinrr888 kevinrr888 self-assigned this Sep 16, 2024
kevinrr888 added a commit to kevinrr888/accumulo that referenced this issue Oct 8, 2024
This commit:
- Moves existing checks (`checkTablets` and the fate check for dangling locks) into the appropriate new `admin check` command
- Adds new checks
- New tests in AdminCheckIT
- SYSTEM_CONFIG now checks for
	- valid locked table/namespace ids (the locked table/namespaces exist)
	- locked table/namespaces are associated with a fate op
- ROOT_METADATA now checks for
	- offline tablets
	- missing "columns"
	- invalid "columns"
- ROOT_TABLE now checks for
	- offline tablets
	- tablets for metadata table have no holes, valid (null) prev end row for first tablet, and valid (null) end row for last tablet
	- missing columns
	- invalid columns
- METADATA_TABLE now checks for
	- offline tablets
	- tablets for user tables (and scanref) have no holes, valid (null) prev end row for first tablet, and valid (null) end row for last tablet
	- missing columns
	- invalid columns
- SYSTEM_FILES now checks for
	- missing system files
- USER_FILES now checks for
	- missing user files

Part of apache#4892
kevinrr888 added a commit that referenced this issue Dec 10, 2024
* Checks for problems in Accumulo

This partially completes #4892:

- Moves existing checks (`checkTablets` and the fate check for dangling locks) into the appropriate new `admin check` command
- Adds new checks
- New tests in AdminCheckIT
- Created new check TABLE_LOCKS which checks for
	- valid locked table/namespace ids (the locked table/namespaces exist)
	- locked table/namespaces are associated with a fate op
- ROOT_METADATA now checks for
	- offline tablets
	- missing "columns"
	- invalid "columns"
- ROOT_TABLE now checks for
	- offline tablets
	- tablets for metadata table have no holes, valid (null) prev end row for first tablet, and valid (null) end row for last tablet
	- missing columns
	- invalid columns
- METADATA_TABLE now checks for
	- offline tablets
	- tablets for user tables (and scanref) have no holes, valid (null) prev end row for first tablet, and valid (null) end row for last tablet
	- missing columns
	- invalid columns
- SYSTEM_FILES now checks for
	- missing system files
- USER_FILES now checks for
	- missing user files
@kevinrr888 kevinrr888 linked a pull request Feb 21, 2025 that will close this issue