[RW Separation] Restrict Search Replica Allocation to Search-Dedicated Nodes #17422
Comments
@vinaykpud Search replicas rely on the allocation filter for hardware designation, not the "Search node role". Are you seeing the bug when applying the filter?
@mch2, @vinaykpud: Adding the allocation filter explicitly adds too much overhead. Is there a way to simplify the experience? Also, how would allocation/balancing work out for search replicas?
@gbbafna Can you elaborate here? Are you referring to potential overhead on the match with DiscoveryNodeFilter? If that's significant, perhaps we limit this to a concrete node attr and look it up on the DiscoveryNode like
The idea was to use SearchReplicaAllocationDecider, a decider we introduced a while back that operates similarly to FilterAllocationDecider. This would work with any node attribute and not be reliant on a node role, but I'm ok with narrowing this a bit to a specific node attr... In terms of general allocation/balancing for SRs, we would apply existing deciders, zonal awareness, replica auto expansion, etc. only within the nodes marked with the attribute. Non-SRs would apply the same within their set of nodes.
This is to prevent search replica allocation in the absence of the include filter.
I was referring to the overhead on the cluster operator of applying these settings. They will need to either add an attribute to all such nodes by defining it in their yml file and then use this attribute in their cluster settings, or specify the nodes individually, which will be inconvenient.
I’ve responded to the concern here: #17457 (comment). However, it seems we need to revisit our approach of using the inclusion filter to define a fleet of search nodes based on custom node attributes. While this approach allows users to select search nodes dynamically using any attribute, the key question is: how much practical value does this flexibility provide?
Concerns:
Concern 1: Complexity in Setting Up Searcher Fleet
The current approach requires a two-step process for users to configure a searcher fleet:
This additional complexity may not provide significant benefits compared to a simpler, more structured approach.
Concern 2: Impact on Awareness Allocation & Auto-Expand
Using a dynamic inclusion filter makes it difficult for Awareness Allocation and Auto-Expand implementations to identify and list search nodes. Since search nodes are determined based on filters, the only way to list them is by querying searchReplicaIncludeFilters in the
Relying on this indirect mechanism to determine search-dedicated nodes doesn’t seem ideal.
Proposed Solution: Predefined Node Attribute
A better approach would be to introduce a predefined custom attribute, such as:
node.attr.searchonly: "true"
Users can set this attribute on all nodes they want to designate as search-dedicated. Then,
Additionally, we can introduce a method in
boolean isSearchDedicatedNode()
This would provide a direct and reliable way to determine if a node is search-dedicated. However, we should be cautious about potential confusion, since we already have boolean isSearchNode(), which is based on node roles.
Does this predefined node attribute approach look good? If you have any suggestions or alternative ideas, please let me know.
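For illustration, the attribute-based check proposed above could look roughly like the sketch below. It is a minimal sketch, not the OpenSearch implementation: the class name and the plain attribute map are hypothetical stand-ins for the real node type, and only the attribute key "searchonly" and the method name isSearchDedicatedNode come from the proposal.

```java
import java.util.Map;

// Hypothetical stand-in for a cluster node's attribute lookup; not the OpenSearch node class.
final class SearchOnlyAttributeSketch {
    // Attribute key from the proposal: node.attr.searchonly: "true"
    static final String SEARCH_ONLY_ATTR = "searchonly";

    // Returns true only when the attribute is present and equals "true" (ignoring case).
    static boolean isSearchDedicatedNode(Map<String, String> nodeAttributes) {
        // Boolean.parseBoolean(null) is false, so a missing attribute means "not search-dedicated".
        return Boolean.parseBoolean(nodeAttributes.get(SEARCH_ONLY_ATTR));
    }

    public static void main(String[] args) {
        System.out.println(isSearchDedicatedNode(Map.of("searchonly", "true")));
        System.out.println(isSearchDedicatedNode(Map.of("zone", "a")));
    }
}
```

Because the check reads a single well-known key, anything that needs the list of search nodes could filter on it directly instead of evaluating a user-defined include filter.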
@vinaykpud: Thanks for putting up this detailed proposal. The approach looks good and simplifies the experience for our users.
Any thoughts on introducing node roles?
We already have a Search Node role, which was added for Searchable Snapshots; adding another role for this may add some confusion. What advantage do we get from a node role versus an attribute?
Role or attr would both work; we also have precedent for using a role here with 'search' and the TargetPoolAllocationDecider. Though a concern I have with a role, as Vinay mentioned, is that introducing a new role alongside the existing "search" is conceptually confusing to me. I think we're better off repurposing "search" to hold either warm or hot or both shard types.
This is what led me to the filter approach initially: you could use any data node attr or a new one and set it once. But I understand the drawbacks of the additional config.
If a node includes only the existing search role, we pre-allocate the cache size to 80%. If the node also includes another role such as 'data', the cache size is set to 0, and the node then blows up on startup because the setting requires a nonzero value:

    private void initializeFileCache(Settings settings, CircuitBreaker circuitBreaker) throws IOException {
        if (DiscoveryNode.isSearchNode(settings) == false) {
            return;
        }
        String capacityRaw = NODE_SEARCH_CACHE_SIZE_SETTING.get(settings);
        logger.info("cache size [{}]", capacityRaw);
        if (capacityRaw.equals(ZERO)) {
            throw new SettingsException(
                "Unable to initialize the "
                    + DiscoveryNodeRole.SEARCH_ROLE.roleName()
                    + "-"
                    + DiscoveryNodeRole.DATA_ROLE.roleName()
                    + " node: Missing value for configuration "
                    + NODE_SEARCH_CACHE_SIZE_SETTING.getKey()
            );
        }

So to reuse the role we would need to allow an initial file cache size of 0, and for separation we would need to explicitly set the size to 0, change the default to 0 and require it to be set for searchable snapshots, or require both roles (ew). The cache size is not a dynamic setting either :/ I am thinking that changing the default to 0 and requiring a nonzero value in order to hold searchable snapshot shards makes the most sense here, but that is a breaking change. @bugmakerrrrrr @andrross curious what you two think here, as I've seen you in that code recently.
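To make the startup failure concrete, here is a small standalone model of the behavior described in this comment. It is only a sketch under stated assumptions: the class, the role strings, and the use of IllegalStateException are hypothetical simplifications, and the default rule (80% for a search-only node, 0 otherwise) is taken from the description above rather than from the OpenSearch source.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical model of the file cache sizing rule described in the comment above.
final class FileCacheDefaultSketch {
    static final String ZERO = "0";

    // Default as described: 80% when "search" is the node's only role, otherwise 0.
    static String defaultCacheSize(Set<String> roles) {
        boolean searchOnly = roles.size() == 1 && roles.contains("search");
        return searchOnly ? "80%" : ZERO;
    }

    // Returns the cache size to use, or empty when the node has no search role.
    // configuredOrNull models an explicit value for the cache size setting; null means "not set".
    static Optional<String> resolveCacheSize(Set<String> roles, String configuredOrNull) {
        if (!roles.contains("search")) {
            return Optional.empty(); // no file cache is initialized on non-search nodes
        }
        String capacity = configuredOrNull != null ? configuredOrNull : defaultCacheSize(roles);
        if (capacity.equals(ZERO)) {
            // Mirrors the startup failure: a search node must end up with a nonzero cache size.
            throw new IllegalStateException("Missing value for the search cache size setting");
        }
        return Optional.of(capacity);
    }

    public static void main(String[] args) {
        System.out.println(resolveCacheSize(Set.of("search"), null));
        System.out.println(resolveCacheSize(Set.of("search", "data"), "10gb"));
    }
}
```

In this model a node with both roles and no explicit size fails at startup, which is the case that would have to change before the existing role could be reused for search replicas.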
I think we need to go with a new attr here for a separate reason: we would need exclusivity among roles. We are not supporting search-only shards next to writeable shards on a single node. That would bring issues around zonal awareness, auto expansion, etc., so we'd like to support only the extreme separation case at this point.
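The exclusivity described here amounts to a two-way rule, sketched below. This is a hypothetical illustration rather than the actual decider: the enum, the class, and the boolean flag for "search-dedicated" are assumptions made for the example.

```java
// Hypothetical two-way allocation rule for the "extreme separation" case.
final class ExclusiveAllocationSketch {
    enum ShardKind { PRIMARY, WRITE_REPLICA, SEARCH_REPLICA }

    // Search replicas may only go to search-dedicated nodes,
    // and search-dedicated nodes may only hold search replicas.
    static boolean canAllocate(ShardKind kind, boolean nodeIsSearchDedicated) {
        boolean isSearchReplica = kind == ShardKind.SEARCH_REPLICA;
        return isSearchReplica == nodeIsSearchDedicated;
    }

    public static void main(String[] args) {
        System.out.println(canAllocate(ShardKind.SEARCH_REPLICA, true));
        System.out.println(canAllocate(ShardKind.PRIMARY, true));
    }
}
```

With the rule symmetric, the two node sets never overlap, so awareness and auto-expand logic can treat each set as its own pool.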
@mch2 I know 3.0 is coming up pretty quickly, but would it make any sense to make a breaking change and flip things around here? As in, change searchable snapshots/file cache behavior to work with a node attribute instead of the |
@mch2 @gbbafna @andrross @Bukhtawar Based on the discussions so far, I am going to rename the existing "Search role" to "Warm role", then introduce/repurpose "Search role" for nodes that host Search Replicas.
@mch2 Here is my commit for the rename: https://github.com/opensearch-project/OpenSearch/compare/main...vinaykpud:OpenSearch:rw/rename-role?expand=1. I will create a PR based on this. Also, in PR #17457 I will introduce the Search Role for a node.
@vinaykpud can you pls flag this for documentation as well? We need to make this very clear & discoverable so ppl upgrading with existing 'search' nodes know to update the role.
Added issue for documentation: opensearch-project/documentation-website#9392
@vinaykpud: Can we verify that searchable snapshots continue to work if we upgrade the cluster from 2.x to 3.0 with this change? There might be a few nuances here, like changing the cluster-manager first.
Describe the bug
When an index is created or when search replicas are added for an index, search replicas should be allocated only to nodes that have the search role assigned. If there are no such nodes available, the search replicas should remain in an unassigned state. If a node with the search role becomes available later, the search replicas should be assigned accordingly.
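The expected behavior can be summarized with a small sketch. It is a simplified model, not OpenSearch allocation code: the Node class, the role string "search", and the "first matching node" choice are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical model: a search replica is assigned only to a node with the search role.
final class SearchReplicaAssignmentSketch {
    static final class Node {
        final String name;
        final Set<String> roles;

        Node(String name, Set<String> roles) {
            this.name = name;
            this.roles = roles;
        }
    }

    // Returns the name of the first search-role node, or empty if the replica must stay unassigned.
    static Optional<String> assignSearchReplica(List<Node> nodes) {
        return nodes.stream()
            .filter(n -> n.roles.contains("search"))
            .map(n -> n.name)
            .findFirst();
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node("data-1", Set.of("data")));
        // No search node yet: the replica remains unassigned.
        System.out.println(assignSearchReplica(nodes));
        // A search node joins later: the replica can now be assigned.
        nodes.add(new Node("search-1", Set.of("search")));
        System.out.println(assignSearchReplica(nodes));
    }
}
```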
Related component
Search:Performance
To Reproduce
Expected behavior
Assigning a search node role for a node:
Ref: #15445
Additional Details
No response