Fix flaky IT test on range query in RestMLInferenceSearchRequestProcessorIT #3598

Merged
merged 1 commit into from
Mar 1, 2025

Collaborator

@mingshl mingshl commented Feb 28, 2025

Description

Previously, RestMLInferenceSearchRequestProcessorIT had an integration test whose query was rewritten into a range query on a keyword field. That test was flaky, so this change makes it query an integer field instead, which returns stable search results.
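
As background, here is a minimal Java sketch of one general difference between range filters on string values and on integer values: strings are compared lexicographically, integers numerically. It is not taken from this PR and does not claim to be the cause of the flakiness. The class and method names (`RangeOrderingSketch`, `keywordRange`, `integerRange`) are hypothetical, and the in-memory filtering only approximates how a search engine evaluates a range query (Java compares UTF-16 code units, which gives the same order as byte comparison for ASCII digits).

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; not code from ml-commons.
class RangeOrderingSketch {

    // Range filter over keyword-like values: bounds are compared lexicographically.
    static List<String> keywordRange(List<String> values, String gte, String lte) {
        return values.stream()
            .filter(v -> v.compareTo(gte) >= 0 && v.compareTo(lte) <= 0)
            .collect(Collectors.toList());
    }

    // Range filter over integer values: bounds are compared numerically.
    static List<Integer> integerRange(List<Integer> values, int gte, int lte) {
        return values.stream()
            .filter(v -> v >= gte && v <= lte)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> asKeywords = List.of("2", "9", "10", "25");
        List<Integer> asIntegers = List.of(2, 9, 10, 25);

        // Lexicographically, "10" and "25" both fall between "1" and "3".
        System.out.println(keywordRange(asKeywords, "1", "3")); // prints [2, 10, 25]

        // Numerically, only 2 falls between 1 and 3.
        System.out.println(integerRange(asIntegers, 1, 3));     // prints [2]
    }
}
```

With the same bounds, the string filter matches three values while the integer filter matches one, so a test asserting on range results is easier to reason about when the field is numeric.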

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Mingshi Liu <mingshl@amazon.com>
@mingshl
Collaborator Author

mingshl commented Feb 28, 2025

The failure below is a flaky test that is not related to this change; rerunning.

Suite: Test class org.opensearch.ml.rest.RestCohereInferenceIT
  2> REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:integTest' --tests "org.opensearch.ml.rest.RestCohereInferenceIT.test_cohereInference_withDifferent_postProcessFunction" -Dtests.seed=6A06CFD3516A4040 -Dtests.security.manager=false -Dtests.locale=sv-SE -Dtests.timezone=America/Cayman -Druntime.java=21
  2> java.lang.AssertionError: failed to run test with test name: connector.post_process.cohere_v2.embedding.ubinary_test
        at __randomizedtesting.SeedInfo.seed([6A06CFD3516A4040:998857A176683568]:0)
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.assertTrue(Assert.java:42)
        at org.opensearch.ml.rest.RestCohereInferenceIT.validateOutput(RestCohereInferenceIT.java:85)
        at org.opensearch.ml.rest.RestCohereInferenceIT.test_cohereInference_withDifferent_postProcessFunction(RestCohereInferenceIT.java:74)
  2> NOTE: leaving temporary files on disk at: /__w/ml-commons/ml-commons/plugin/build/testrun/integTest/temp/org.opensearch.ml.rest.RestCohereInferenceIT_6A06CFD3516A4040-001
  2> NOTE: test params are: codec=Asserting(Lucene101): {}, docValues:{}, maxPointsInLeafNode=386, maxMBSortInHeap=6.152622036236546, sim=Asserting(RandomSimilarity(queryNorm=false): {}), locale=sv-SE, timezone=America/Cayman
  2> NOTE: Linux 6.8.0-1021-azure amd64/Azul Systems, Inc. 21.0.6 (64-bit)/cpus=4,threads=1,free=383929800,total=536870912
  2> NOTE: All tests run in this JVM: [MLModelAutoReDeployerIT, RestBedRockInferenceIT, RestCohereInferenceIT]
    NOTE: test params are: codec=Asserting(Lucene101): {}, docValues:{}, maxPointsInLeafNode=386, maxMBSortInHeap=6.152622036236546, sim=Asserting(RandomSimilarity(queryNorm=false): {}), locale=sv-SE, timezone=America/Cayman
    NOTE: Linux 6.8.0-1021-azure amd64/Azul Systems, Inc. 21.0.6 (64-bit)/cpus=4,threads=1,free=383929800,total=536870912
    NOTE: All tests run in this JVM: [MLModelAutoReDeployerIT, RestBedRockInferenceIT, RestCohereInferenceIT]
  1> [2025-02-28T18:09:43,089][INFO ][o.o.m.r.RestCohereInferenceIT] [test_cohereInference_withDifferent_postProcessFunction] before test
  1> [2025-02-28T18:09:43,097][INFO ][o.o.m.r.RestCohereInferenceIT] [test_cohereInference_withDifferent_postProcessFunction] initializing REST clients against [http://[::1]:39439, http://127.0.0.1:45625]/
  1> [2025-02-28T18:09:44,756][INFO ][o.o.m.r.RestCohereInferenceIT] [test_cohereInference_withDifferent_postProcessFunction] after test

RestConnectorToolIT > testConnectorToolInFlowAgent STANDARD_OUT
    [2025-03-01T10:09:44,830][INFO ][o.o.m.r.RestConnectorToolIT] [testConnectorToolInFlowAgent] before test
    [2025-03-01T10:09:44,834][INFO ][o.o.m.r.RestConnectorToolIT] [testConnectorToolInFlowAgent] initializing REST clients against [http://[::1]:39439, http://127.0.0.1:45625]/

RestConnectorToolIT > testConnectorToolInFlowAgent STANDARD_ERROR
    માર્ચ 01, 2025 10:10:05 AM org.opensearch.client.RestClient logResponse
    WARNING: request [DELETE http://127.0.0.1:45625/.plugins-ml-agent] returned 1 warnings: [299 OpenSearch-3.0.0-SNAPSHOT-a961ec728859b5318a8c7f80206ff6566a954971 "this request accesses system indices: [.plugins-ml-agent], but in a future major version, direct access to system indices will be prevented by default"]

RestConnectorToolIT > testConnectorToolInFlowAgent STANDARD_OUT
    [2025-03-01T10:10:05,622][INFO ][o.o.m.r.RestConnectorToolIT] [testConnectorToolInFlowAgent] after test

@dhrubo-os
Collaborator

@mingshl I'm seeing these tests fail in other PRs as well:

- org.opensearch.ml.rest.RestCohereInferenceIT.test_cohereInference_withDifferent_postProcessFunction
- org.opensearch.ml.rest.RestMLInferenceSearchRequestProcessorIT.testMLInferenceProcessorRemoteModelOptionalInputs
- org.opensearch.ml.rest.RestMLInferenceSearchResponseProcessorIT.testMLInferenceProcessorRemoteModelOptionalInputs

Could you please check? I don't think these are flaky tests. They're failing here as well: https://github.com/opensearch-project/ml-commons/actions/runs/13598168388/job/38019496540?pr=3597

@mingshl
Collaborator Author

mingshl commented Mar 1, 2025

@dhrubo-os this test is failing everywhere, but it's not related to the search processors; it's related to the model interface for Cohere.

116 tests completed, 1 failed, 11 skipped
Tests with failures:

  • org.opensearch.ml.rest.RestCohereInferenceIT.test_cohereInference_withDifferent_postProcessFunction

@mingshl mingshl mentioned this pull request Mar 1, 2025
5 tasks
@mingshl mingshl had a problem deploying to ml-commons-cicd-env March 1, 2025 03:33 — with GitHub Actions Failure
@zane-neo
Collaborator

zane-neo commented Mar 1, 2025

@dhrubo-os this test is failing everywhere, but it's not related to search processors, it's the model interface related to cohere

116 tests completed, 1 failed, 11 skipped
Tests with failures:

  • org.opensearch.ml.rest.RestCohereInferenceIT.test_cohereInference_withDifferent_postProcessFunction

The Cohere failure is a real issue, and I fixed it in this PR: #3602

@zane-neo zane-neo merged commit 4512e0a into opensearch-project:main Mar 1, 2025
6 of 10 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Mar 1, 2025
Signed-off-by: Mingshi Liu <mingshl@amazon.com>
(cherry picked from commit 4512e0a)
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-3598-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 4512e0ae34fd25fd1a5236c23aba48259f7fc2aa
# Push it to GitHub
git push --set-upstream origin backport/backport-3598-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-3598-to-2.x.

mingshl added a commit that referenced this pull request Mar 1, 2025
Signed-off-by: Mingshi Liu <mingshl@amazon.com>
mingshl added a commit that referenced this pull request Mar 2, 2025
Signed-off-by: Mingshi Liu <mingshl@amazon.com>
(cherry picked from commit 4512e0a)

Co-authored-by: Mingshi Liu <mingshl@amazon.com>
mingshl added a commit that referenced this pull request Mar 2, 2025
#3595)

* fix optional mappings in ml inference search processors (#3587)

* fix optional mappings

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

* use collections and add more assertion tests

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

* validate query return false

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

---------

Signed-off-by: Mingshi Liu <mingshl@amazon.com>
(cherry picked from commit b22e61a)

* fix flaky test (#3598)

Signed-off-by: Mingshi Liu <mingshl@amazon.com>

---------

Signed-off-by: Mingshi Liu <mingshl@amazon.com>
Co-authored-by: Mingshi Liu <mingshl@amazon.com>
akolarkunnu pushed a commit to akolarkunnu/ml-commons that referenced this pull request Mar 11, 2025
Signed-off-by: Mingshi Liu <mingshl@amazon.com>