Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search Replica Allocation and Recovery #17457

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

vinaykpud
Copy link
Contributor

@vinaykpud vinaykpud commented Feb 25, 2025

Description

Related Issues

Resolves #17422
Resolves #17421
Resolves #17334
Related to #15306

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
@github-actions github-actions bot added bug Something isn't working Search:Performance labels Feb 25, 2025
Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
Copy link
Contributor

❌ Gradle check result for f9c54c5: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
Copy link
Contributor

❕ Gradle check result for 2d5b977: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link

codecov bot commented Feb 26, 2025

Codecov Report

Attention: Patch coverage is 84.61538% with 6 lines in your changes missing coverage. Please review.

Project coverage is 72.50%. Comparing base (2ee8660) to head (7d5ab5a).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
...java/org/opensearch/index/shard/StoreRecovery.java 84.00% 1 Missing and 3 partials ⚠️
...ava/org/opensearch/cluster/node/DiscoveryNode.java 0.00% 0 Missing and 1 partial ⚠️
...cation/decider/SearchReplicaAllocationDecider.java 90.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #17457      +/-   ##
============================================
+ Coverage     72.43%   72.50%   +0.07%     
- Complexity    65694    65769      +75     
============================================
  Files          5311     5311              
  Lines        304937   304932       -5     
  Branches      44226    44230       +4     
============================================
+ Hits         220872   221104     +232     
+ Misses        65912    65733     -179     
+ Partials      18153    18095      -58     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@vinaykpud vinaykpud force-pushed the rw/search-replica-alloc-filter branch from b982b40 to 69f13dd Compare March 6, 2025 19:48
Copy link
Contributor

github-actions bot commented Mar 6, 2025

❌ Gradle check result for 69f13dd: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
@vinaykpud vinaykpud force-pushed the rw/search-replica-alloc-filter branch from 69f13dd to 6e3be41 Compare March 6, 2025 21:22
Copy link
Contributor

github-actions bot commented Mar 6, 2025

✅ Gradle check result for 6e3be41: SUCCESS

@vinaykpud
Copy link
Contributor Author

vinaykpud commented Mar 6, 2025

s have to be added explicitly. This espl gets tricky if there are concurrent modifications lets say there was a node removal and a new node addition concurrently. In such a case if the addition doesn't read the deleted state and overrides the node list with a new node, we might run into issues.

Thanks @Bukhtawar

From your response, it seems the concern is about keeping the filter updated while handling concurrent node additions and removals.

However, in this implementation, users can define a set of nodes—let’s say a "Searcher fleet"—by assigning a custom attribute in the YAML configuration. This means that as long as there are nodes matching this attribute, search replicas will be assigned accordingly. Essentially, we are designating nodes dynamically as part of the "Searcher fleet."

For example, if I have 5 nodes and configure 3 of them with a custom attribute:

node.attr.rackid: "rack2"

I can then make them part of the "Searcher fleet" using the following API:

curl -X PUT "http://localhost:9200/_cluster/settings" \
-H "Content-Type: application/json" \
-d '{
  "persistent": {
    "cluster.routing.allocation.search.replica.dedicated.include.rackid": "rack2"
  }
}'

With this setup, any node with rackid: rack2 will be considered a search-dedicated node. This means that regardless of how many nodes are added or removed from the cluster, as long as they have the matching attribute, they will be automatically recognized as part of the "Reader fleet."

This approach provides flexibility for users to use any attribute-based selection for search nodes.

That said, I’d like to understand if there are any additional concerns, such as potential performance implications, with this approach.

I think the filter logic should be used in cases where we actually need to add an exception(exclusion) or override the allocation logic on top of a generic node roles.

Do you mean we are incorrectly utilizing the inclusion filter here, or is this approach misaligned with its intended purpose? Do you see any limitations or unintended behavior in using it this way?

Also please find my comment on the approach here : #17422 (comment)

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
Copy link
Contributor

github-actions bot commented Mar 7, 2025

❌ Gradle check result for 858f984: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
Copy link
Contributor

github-actions bot commented Mar 8, 2025

❌ Gradle check result for 8f94e31: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
Copy link
Contributor

github-actions bot commented Mar 8, 2025

❌ Gradle check result for 66085c5: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@vinaykpud vinaykpud closed this Mar 10, 2025
@vinaykpud vinaykpud reopened this Mar 10, 2025
Copy link
Contributor

❌ Gradle check result for 66085c5: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
Copy link
Contributor

❌ Gradle check result for 5836b20: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
Copy link
Contributor

❕ Gradle check result for 7d5ab5a: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.cluster.MinimumClusterManagerNodesIT.testThreeNodesNoClusterManagerBlock

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

if (o1.isSearchOnly() ^ o2.isSearchOnly()) {
// If index names are equal, one shard is a regular replica,
// and the other is search replica, regular replica comes first
return o1.isSearchOnly() ? 1 : -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no real reason why search replica should come later. I am making an assumption here to allocate Replica first then allocate search replica. So that allocation is consistent.

Allocation works like this:
Assume these are share shards:

P1, R1, R1, S1, S1

First allocates primary to some node.

In the next round it starts allocating replicas:

R1, R1, S1, S1

Above logic will help in sorting Replicas first and Search Replicas next.
So allocator will allocate one copy of replica ie R1, then move to allocate copy of search replica ie S1.

So in next iteration it will be

R1, S1

// it indicates a node-left scenario where the search shard was unassigned
// and is now recovering on a new node that lacks SegmentsInfo.
// In this case, fallback to empty store recovery.
if (!indexShard.shardRouting.isSearchOnly()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we delegate to a separate function from internalRecoverFromStore in the SR case rather than updating this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dint get this, do you mean a separate internalRecoverFromStore method just for search replica?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also "updating this" means this method or just this line?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement Enhancement or improvement to existing feature or request Search:Performance
Projects
None yet
3 participants