Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LuceneOnFaiss - Part2] Added IxMp section loading logic from FAISS index. #2590

Conversation

0ctopus13prime
Copy link
Collaborator

Description

[2/11] This PR adds a loading logic of FaissIdMapIndex section in FAISS section.
This section contains a mapping table to maps an internal vector id to external Lucene document id.
There are two cases

  1. Dense case : All documents have at least one KNN field, which has exactly one vector. Therefore, internal vector id is corresponding to Lucene document id.
  2. Sparse case : Not all documents having vectors. Or parent-child case. In which case, FaissIdMapIndex will have a mapping to maps an internal vector id to its parent Lucene document id.

Related Issues

RFC : #2401

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@0ctopus13prime
Copy link
Collaborator Author

Will continuously add unit tests to this PR.

@0ctopus13prime 0ctopus13prime changed the title Added IxMp section loading logic from FAISS index. [LuceneOnFaiss - Part2] Added IxMp section loading logic from FAISS index. Mar 10, 2025
@0ctopus13prime 0ctopus13prime force-pushed the lucene-on-faiss-part2 branch from b60e5d1 to b6bdf7e Compare March 10, 2025 03:39
@0ctopus13prime 0ctopus13prime force-pushed the lucene-on-faiss-part2 branch 2 times, most recently from 4995e37 to 4726205 Compare March 10, 2025 18:57
Copy link
Member

@jmazanec15 jmazanec15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments, but overall looks good. Would be good to add tests

Signed-off-by: Dooyong Kim <kdooyong@amazon.com>
@0ctopus13prime 0ctopus13prime force-pushed the lucene-on-faiss-part2 branch from 4726205 to 0d2170e Compare March 10, 2025 23:04
Copy link
Member

@jmazanec15 jmazanec15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@0ctopus13prime 0ctopus13prime merged commit 4e80b65 into opensearch-project:lucene-on-faiss Mar 11, 2025
34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants