Completed functionality for admin check command #5348

Open
wants to merge 2 commits into
base: 3.1
Original file line number Diff line number Diff line change
@@ -24,6 +24,7 @@
import org.apache.accumulo.core.conf.SiteConfiguration;
import org.apache.accumulo.server.ServerDirs;
import org.apache.accumulo.server.fs.VolumeManagerImpl;
+import org.apache.accumulo.server.util.Admin;
import org.apache.accumulo.start.spi.KeywordExecutable;
import org.apache.hadoop.conf.Configuration;

@@ -45,7 +46,8 @@ public String description() {
     return "Checks the provided Accumulo configuration file for errors. "
         + "This only checks the contents of the file and not any running Accumulo system, "
         + "so it can be used prior to init, but only performs a subset of the checks done by "
-        + (new CheckServerConfig().keyword());
+        + "the admin " + Admin.CheckCommand.class.getSimpleName() + " check "
+        + Admin.CheckCommand.Check.SERVER_CONFIG;
}

@SuppressFBWarnings(value = "PATH_TRAVERSAL_IN", justification = "intentional user-provided path")

This file was deleted.

Original file line number Diff line number Diff line change
@@ -148,7 +148,7 @@ public void walUnreferenced(TServerInstance tsi, Path path) throws WalMarkerExce
updateState(tsi, path, WalState.UNREFERENCED);
}

-private static Pair<WalState,Path> parse(byte[] data) {
+public static Pair<WalState,Path> parse(byte[] data) {
String[] parts = new String(data, UTF_8).split(",");
return new Pair<>(WalState.valueOf(parts[0]), new Path(parts[1]));
}
Original file line number Diff line number Diff line change
@@ -84,6 +84,7 @@
import org.apache.accumulo.server.util.checkCommand.MetadataTableCheckRunner;
import org.apache.accumulo.server.util.checkCommand.RootMetadataCheckRunner;
import org.apache.accumulo.server.util.checkCommand.RootTableCheckRunner;
+import org.apache.accumulo.server.util.checkCommand.ServerConfigCheckRunner;
import org.apache.accumulo.server.util.checkCommand.SystemConfigCheckRunner;
import org.apache.accumulo.server.util.checkCommand.SystemFilesCheckRunner;
import org.apache.accumulo.server.util.checkCommand.TableLocksCheckRunner;
@@ -174,6 +175,8 @@ public enum Check {
// Caution should be taken when changing or adding any new checks: order is important
SYSTEM_CONFIG(SystemConfigCheckRunner::new, "Validate the system config stored in ZooKeeper",
Collections.emptyList()),
+SERVER_CONFIG(ServerConfigCheckRunner::new, "Validate the server configuration",
+Collections.singletonList(SYSTEM_CONFIG)),
TABLE_LOCKS(TableLocksCheckRunner::new,
"Ensures that table and namespace locks are valid and are associated with a FATE op",
Collections.singletonList(SYSTEM_CONFIG)),
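The enum above gives each check a list of checks it depends on, and its comment warns that declaration order matters. One reason is visible in plain Java: a constant can only name constants declared before it. The sketch below illustrates an ordered, dependency-aware run under stated assumptions. `CheckOrderSketch`, its `Status` values, and the rule of skipping a check whose dependency did not pass are all hypothetical and are not taken from `Admin.CheckCommand`.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CheckOrderSketch {
  // Hypothetical status values for this sketch only.
  enum Status { OK, FAILED, SKIPPED }

  // Each constant lists the checks it depends on. A constant may only refer to
  // constants declared earlier, so declaration order doubles as run order.
  enum Check {
    SYSTEM_CONFIG(List.of()),
    SERVER_CONFIG(List.of(SYSTEM_CONFIG)),
    TABLE_LOCKS(List.of(SYSTEM_CONFIG));

    final List<Check> dependencies;

    Check(List<Check> dependencies) {
      this.dependencies = dependencies;
    }
  }

  // Runs every check in declaration order. 'failing' simulates which checks
  // would fail if they were actually run.
  static Map<Check,Status> runAll(Set<Check> failing) {
    Map<Check,Status> results = new EnumMap<>(Check.class);
    for (Check check : Check.values()) {
      // Dependencies were declared earlier, so their results already exist.
      boolean depsOk = check.dependencies.stream().allMatch(d -> results.get(d) == Status.OK);
      if (!depsOk) {
        results.put(check, Status.SKIPPED);
      } else {
        results.put(check, failing.contains(check) ? Status.FAILED : Status.OK);
      }
    }
    return results;
  }

  public static void main(String[] args) {
    // Simulate a failing SYSTEM_CONFIG check and print the resulting statuses.
    System.out.println(runAll(Set.of(Check.SYSTEM_CONFIG)));
  }
}
```

With this shape, a failure in an early check marks every dependent check as skipped rather than running it against a system that is already known to be misconfigured.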
Original file line number Diff line number Diff line change
@@ -27,7 +27,6 @@

import org.apache.accumulo.core.client.IteratorSetting;
import org.apache.accumulo.core.client.Scanner;
-import org.apache.accumulo.core.client.TableNotFoundException;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.TableId;
@@ -43,7 +42,6 @@
import org.apache.accumulo.server.constraints.SystemEnvironment;
import org.apache.accumulo.server.util.Admin;
import org.apache.hadoop.io.Text;
-import org.apache.zookeeper.KeeperException;

public interface MetadataCheckRunner extends CheckRunner {

@@ -64,8 +62,7 @@ default String scanning() {
* that are expected. For the root metadata, ensures that the expected "columns" exist in ZK.
*/
default Admin.CheckCommand.CheckStatus checkRequiredColumns(ServerContext context,
-Admin.CheckCommand.CheckStatus status)
-throws TableNotFoundException, InterruptedException, KeeperException {
+Admin.CheckCommand.CheckStatus status) throws Exception {
Set<ColumnFQ> requiredColFQs;
Set<Text> requiredColFams;
boolean missingReqCol = false;
Original file line number Diff line number Diff line change
@@ -23,7 +23,6 @@
import java.util.HashSet;
import java.util.Set;

-import org.apache.accumulo.core.client.TableNotFoundException;
import org.apache.accumulo.core.data.TableId;
import org.apache.accumulo.core.metadata.AccumuloTable;
import org.apache.accumulo.core.metadata.RootTable;
@@ -35,7 +34,6 @@
import org.apache.accumulo.server.util.Admin;
import org.apache.accumulo.server.util.FindOfflineTablets;
import org.apache.hadoop.io.Text;
-import org.apache.zookeeper.KeeperException;

public class RootMetadataCheckRunner implements MetadataCheckRunner {
private static final Admin.CheckCommand.Check check = Admin.CheckCommand.Check.ROOT_METADATA;
@@ -54,8 +52,7 @@ public TableId tableId() {
public Set<ColumnFQ> requiredColFQs() {
return Set.of(MetadataSchema.TabletsSection.TabletColumnFamily.PREV_ROW_COLUMN,
MetadataSchema.TabletsSection.ServerColumnFamily.DIRECTORY_COLUMN,
-MetadataSchema.TabletsSection.ServerColumnFamily.TIME_COLUMN,
-MetadataSchema.TabletsSection.ServerColumnFamily.LOCK_COLUMN);
+MetadataSchema.TabletsSection.ServerColumnFamily.TIME_COLUMN);
}

@Override
@@ -70,7 +67,7 @@ public String scanning() {

@Override
public Admin.CheckCommand.CheckStatus runCheck(ServerContext context, ServerUtilOpts opts,
-boolean fixFiles) throws TableNotFoundException, InterruptedException, KeeperException {
+boolean fixFiles) throws Exception {
Admin.CheckCommand.CheckStatus status = Admin.CheckCommand.CheckStatus.OK;
printRunning();

@@ -97,8 +94,7 @@ public Admin.CheckCommand.CheckStatus runCheck(ServerContext context, ServerUtil

@Override
public Admin.CheckCommand.CheckStatus checkRequiredColumns(ServerContext context,
-Admin.CheckCommand.CheckStatus status)
-throws TableNotFoundException, InterruptedException, KeeperException {
+Admin.CheckCommand.CheckStatus status) throws Exception {
final String path = context.getZooKeeperRoot() + RootTable.ZROOT_TABLET;
final String json = new String(context.getZooSession().asReader().getData(path), UTF_8);
final var rtm = new RootTabletMetadata(json);
Original file line number Diff line number Diff line change
@@ -51,8 +51,7 @@ public TableId tableId() {
public Set<ColumnFQ> requiredColFQs() {
return Set.of(MetadataSchema.TabletsSection.TabletColumnFamily.PREV_ROW_COLUMN,
MetadataSchema.TabletsSection.ServerColumnFamily.DIRECTORY_COLUMN,
-MetadataSchema.TabletsSection.ServerColumnFamily.TIME_COLUMN,
-MetadataSchema.TabletsSection.ServerColumnFamily.LOCK_COLUMN);
+MetadataSchema.TabletsSection.ServerColumnFamily.TIME_COLUMN);
}
Comment on lines 51 to 55
Member Author

Do the required col FQs and col fams in RootTableCheckRunner, RootMetadataCheckRunner, and MetadataTableCheckRunner look correct now? If they look good, then since they are all equivalent now, I can push them up so the runners no longer each override them just to return the same thing.

Contributor

Yeah those look good.
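As a rough illustration of the "push up" idea discussed here: when several implementations return the same set, the shared value can live in one default method on the common interface. This is a minimal sketch with stand-in types. `SharedRequiredColumnsSketch`, `MetadataRunner`, and the string column names are hypothetical placeholders, not the real `MetadataCheckRunner` or `ColumnFQ` API.

```java
import java.util.Set;

final class SharedRequiredColumnsSketch {
  // Hypothetical stand-in for the shared runner interface. Columns are plain
  // strings here instead of ColumnFQ objects.
  interface MetadataRunner {
    String name();

    // Single shared definition; implementations no longer need to override it.
    default Set<String> requiredColFQs() {
      return Set.of("PREV_ROW_COLUMN", "DIRECTORY_COLUMN", "TIME_COLUMN");
    }
  }

  static final class RootTableRunner implements MetadataRunner {
    @Override
    public String name() {
      return "root table";
    }
  }

  static final class MetadataTableRunner implements MetadataRunner {
    @Override
    public String name() {
      return "metadata table";
    }
  }

  public static void main(String[] args) {
    MetadataRunner[] runners = {new RootTableRunner(), new MetadataTableRunner()};
    for (MetadataRunner runner : runners) {
      System.out.println(runner.name() + " requires " + runner.requiredColFQs());
    }
  }
}
```

An implementation that later needs a different set can still override the default, so the change does not remove any flexibility.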


@Override
Original file line number Diff line number Diff line change
@@ -0,0 +1,86 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.accumulo.server.util.checkCommand;

import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import org.apache.accumulo.core.conf.Property;
import org.apache.accumulo.server.ServerContext;
import org.apache.accumulo.server.cli.ServerUtilOpts;
import org.apache.accumulo.server.util.Admin;

public class ServerConfigCheckRunner implements CheckRunner {
private static final Admin.CheckCommand.Check check = Admin.CheckCommand.Check.SERVER_CONFIG;

@Override
public Admin.CheckCommand.CheckStatus runCheck(ServerContext context, ServerUtilOpts opts,
boolean fixFiles) throws Exception {
Admin.CheckCommand.CheckStatus status = Admin.CheckCommand.CheckStatus.OK;
printRunning();

log.trace("********** Checking server configuration **********");

log.trace("Checking that all configured properties are valid (valid key and value)");
final Map<String,String> definedProps = new HashMap<>();
final var config = context.getConfiguration();
config.getProperties(definedProps, s -> true);
for (var entry : definedProps.entrySet()) {
var key = entry.getKey();
var val = entry.getValue();
if (!Property.isValidProperty(key, val)) {
log.warn("Invalid property (key={} val={}) found in the config", key, val);
status = Admin.CheckCommand.CheckStatus.FAILED;
}
}

log.trace("Checking that all required config properties are present");
// there are many properties that should be set (default value or user set), identifying them
// all and checking them here is unrealistic. Some property that is not set but is expected
// will likely result in some sort of failure eventually anyway. We will just check a few
// obvious required properties here.
Set<Property> requiredProps = Set.of(Property.INSTANCE_ZK_HOST, Property.INSTANCE_ZK_TIMEOUT,
Property.INSTANCE_SECRET, Property.INSTANCE_VOLUMES, Property.GENERAL_THREADPOOL_SIZE,
Property.GENERAL_DELEGATION_TOKEN_LIFETIME,
Property.GENERAL_DELEGATION_TOKEN_UPDATE_INTERVAL, Property.GENERAL_IDLE_PROCESS_INTERVAL,
Property.GENERAL_LOW_MEM_DETECTOR_INTERVAL, Property.GENERAL_LOW_MEM_DETECTOR_THRESHOLD,
Property.GENERAL_PROCESS_BIND_ADDRESS, Property.GENERAL_SERVER_LOCK_VERIFICATION_INTERVAL,
Property.MANAGER_CLIENTPORT, Property.TSERV_CLIENTPORT, Property.GC_CYCLE_START,
Property.GC_CYCLE_DELAY, Property.GC_PORT, Property.MONITOR_PORT, Property.TABLE_MAJC_RATIO,
Property.TABLE_SPLIT_THRESHOLD);
Comment on lines +55 to +67
Member Author

Are there any other important ones I left out? Any I shouldn't have included?

Contributor

Not a change for this PR, but I am wondering if this should be pushed into the validation code of each property. For example, could we attempt something like the following, in addition to the code above that goes through the defined props? The thinking is that if a prop's validation fails on null or an empty string, then it is "required" and should be set. Looking at some of the important props, like instance.volumes, their types would need to change from STRING to something more specific, which would be a good general change to make (it would be good to have a specific type to validate instance volumes, and that could include not accepting an empty string).

    for(var prop : Property.values()) {
      var value = config.get(prop);
      if (!Property.isValidProperty(prop.getKey(), value)) {
        log.warn("Invalid property (key={} val={}) found in the config", prop, value);
      }
    }

If the rest of the code worked like this, then would not need this list here.

Contributor

As currently written, the loop in the previous comment, which iterates over all the props, may not work well, because the get method substitutes the default value when a property is not present.

In general it seems like it would be best to move the concept of a required property into the Property class in some form. Then the entire system could react appropriately when a required property is not present and is requested. For now a list in this class seems fine.

I experimented w/ validating the volume prop in #5365 based on the exploration done as part of this.
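To make the default-substitution concern concrete, here is a minimal sketch of a configuration whose `get` falls back to a built-in default. Everything in it is hypothetical: `DefaultMaskingSketch`, its `Prop` enum, the `Config` class, and the default values are illustrative only and are not claimed to match Accumulo's `Property` or `AccumuloConfiguration`.

```java
import java.util.HashMap;
import java.util.Map;

final class DefaultMaskingSketch {
  // Hypothetical properties with illustrative defaults.
  enum Prop {
    ZK_HOST("instance.zookeeper.host", "localhost:2181"),
    VOLUMES("instance.volumes", "");

    final String key;
    final String defaultValue;

    Prop(String key, String defaultValue) {
      this.key = key;
      this.defaultValue = defaultValue;
    }
  }

  static final class Config {
    private final Map<String,String> explicit = new HashMap<>();

    void set(Prop prop, String value) {
      explicit.put(prop.key, value);
    }

    // Falls back to the default, so the caller cannot tell whether the value
    // was actually configured.
    String get(Prop prop) {
      return explicit.getOrDefault(prop.key, prop.defaultValue);
    }

    // Separate query that exposes whether a value was explicitly configured.
    boolean isExplicitlySet(Prop prop) {
      return explicit.containsKey(prop.key);
    }
  }

  public static void main(String[] args) {
    Config config = new Config();
    for (Prop prop : Prop.values()) {
      System.out.println(prop.key + " get=\"" + config.get(prop) + "\" explicitlySet="
          + config.isExplicitlySet(prop));
    }
  }
}
```

In this sketch a property with a non-empty default always looks present through `get`, so a "required" check based only on `get` can only catch properties whose default is empty or null.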

Member Author

This sounds like a good idea to me. There are a lot of properties typed as PropertyType.STRING when a String isn't really what the property is. PropertyType exists to check that the property is valid; setting the PropertyType to STRING is just a way to skip this validation. I wonder if it would be best to have a one-to-one mapping from Property to PropertyType. That would probably be overkill, though. Another idea would be to look at the PropertyTypes that are always valid. From a brief look, PropertyType.PATH, PropertyType.STRING, and PropertyType.URI are always valid. I don't think any property should always be valid. Those that are PropertyType.STRING could probably be given a more appropriate existing PropertyType or a new one. PATH and URI could have validation added.

In addition to this, we could analyze all properties, determine whether each is required, and change the validation accordingly:

  • Properties that are not required could accept empty string, null, or a valid value (where validity is well defined)
  • Properties that are required could only accept a valid value
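The two rules above can be sketched as a single predicate. This is only an illustration of the proposal. `RequiredValidationSketch`, `isAcceptable`, and the `PORT` validator are hypothetical names, and the port rule is a placeholder for whatever per-type validation a real PropertyType would perform.

```java
import java.util.function.Predicate;

final class RequiredValidationSketch {
  // Hypothetical per-type validator: a decimal port number no larger than 65535.
  static final Predicate<String> PORT =
      v -> v.matches("\\d{1,5}") && Integer.parseInt(v) <= 65535;

  // Optional properties accept null, empty string, or a valid value.
  // Required properties accept only a valid value.
  static boolean isAcceptable(boolean required, String value, Predicate<String> validator) {
    boolean unset = value == null || value.isEmpty();
    if (unset) {
      return !required;
    }
    return validator.test(value);
  }

  public static void main(String[] args) {
    System.out.println("optional, unset:    " + isAcceptable(false, "", PORT));
    System.out.println("required, unset:    " + isAcceptable(true, "", PORT));
    System.out.println("required, valid:    " + isAcceptable(true, "9997", PORT));
    System.out.println("optional, invalid:  " + isAcceptable(false, "abc", PORT));
  }
}
```

Keeping the required flag separate from the type validator means the same validator can serve both required and optional properties of that type.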

Like you said, for this PR, I can just push this list of required properties into Property. Maybe for now, in this PR, the list of required props is only checked in this admin check. Checking it elsewhere in the code might be scope creep; that could be done in a follow-on.

Contributor

Definitely want to avoid any scope creep in this PR. Identified some areas that need improvement based on this work, we can open follow on issues or PRs for those.

Contributor

Could leave the list as is in the PR. For follow on issues, do we need two issues? One for addressing the STRING types and another for somehow representing and documenting required props in the Property.java?

Member Author

kevinrr888, Feb 28, 2025

I think either would be fine, but it might be easier as just one issue. There would be overlap in these changes, so they might be hard to split up and work on as two separate issues/PRs. For example, instance.volumes would need to be a required property (which would be tied to validation in its PropertyType) and would need to be moved away from PropertyType.STRING.
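For illustration, a more specific type for a volumes-style property might validate along the following lines. This is a sketch under assumptions: it treats the value as a comma-separated list of URIs that each need a scheme and a non-empty path, and it rejects empty input. `VolumesTypeSketch` and `isValidVolumes` are hypothetical names, and these rules are not taken from Accumulo or from #5365.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class VolumesTypeSketch {
  // Assumed rules: non-empty, comma-separated, every entry parses as a URI
  // with a scheme and a non-empty path.
  static boolean isValidVolumes(String value) {
    if (value == null || value.isBlank()) {
      return false;
    }
    // Limit -1 keeps trailing empty entries so "a,," is rejected.
    for (String part : value.split(",", -1)) {
      String volume = part.trim();
      if (volume.isEmpty()) {
        return false;
      }
      try {
        URI uri = new URI(volume);
        if (uri.getScheme() == null || uri.getPath() == null || uri.getPath().isEmpty()) {
          return false;
        }
      } catch (URISyntaxException e) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    String[] samples = {"hdfs://nn1:8020/accumulo", "", "/accumulo", "hdfs://a/b,,"};
    for (String sample : samples) {
      System.out.println("\"" + sample + "\" valid=" + isValidVolumes(sample));
    }
  }
}
```

Because such a validator rejects null and the empty string, a property using it would count as "required" under the rule suggested earlier in this thread, without a separate list.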

for (var reqProp : requiredProps) {
var confPropVal = config.get(reqProp);
// already checked that all set properties are valid, just check that it is set then we know
// it's valid
if (confPropVal == null || confPropVal.isEmpty()) {
log.warn("Required property {} is not set!", reqProp);
status = Admin.CheckCommand.CheckStatus.FAILED;
}
}

printCompleted(status);
return status;
}

@Override
public Admin.CheckCommand.Check getCheck() {
return check;
}
}