
[LuceneOnFaiss - Part3] Added FaissHNSW graph. #2594

Conversation

0ctopus13prime
Collaborator

Description

In this PR, I've added the FaissHNSW graph and a bridge that conforms to Lucene's HNSW graph interface.
With the bridge, we can now use Lucene's HNSW graph searcher to perform vector search on a FAISS index.
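
For readers unfamiliar with Lucene's graph contract, here is a minimal sketch of what such a bridge can look like. It targets the abstract methods of Lucene 9.x's `org.apache.lucene.util.hnsw.HnswGraph`; `FaissHnswData` and its accessors are hypothetical stand-ins for illustration, not the classes merged in this PR.

```java
import java.io.IOException;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.hnsw.HnswGraph;

// Hypothetical bridge exposing a parsed FAISS HNSW graph through Lucene's
// HnswGraph contract, so Lucene's HnswGraphSearcher can traverse it.
final class FaissHnswBridge extends HnswGraph {

  // Hypothetical accessor over the parsed FAISS HNSW arrays (illustrative only).
  interface FaissHnswData {
    int totalVectors();
    int numLevels();
    int entryPoint();
    int[] neighborsAt(int level, int internalVectorId);
    int[] nodesOnLevel(int level);
  }

  private final FaissHnswData faiss;
  private int[] neighbors = new int[0]; // neighbor list positioned by seek()
  private int upto;                     // cursor into `neighbors`

  FaissHnswBridge(FaissHnswData faiss) {
    this.faiss = faiss;
  }

  @Override
  public void seek(int level, int target) throws IOException {
    // Position the iterator at `target`'s neighbor list on `level`.
    neighbors = faiss.neighborsAt(level, target);
    upto = 0;
  }

  @Override
  public int nextNeighbor() {
    // An exhausted list signals the searcher with NO_MORE_DOCS, per the contract.
    return upto < neighbors.length ? neighbors[upto++] : DocIdSetIterator.NO_MORE_DOCS;
  }

  @Override
  public int size() {
    return faiss.totalVectors();
  }

  @Override
  public int numLevels() {
    return faiss.numLevels();
  }

  @Override
  public int entryNode() {
    return faiss.entryPoint();
  }

  @Override
  public NodesIterator getNodesOnLevel(int level) {
    int[] nodes = faiss.nodesOnLevel(level);
    return new ArrayNodesIterator(nodes, nodes.length);
  }
}
```

Once a graph implements this contract, Lucene's searcher is agnostic to whether the underlying adjacency lists came from a Lucene segment or a FAISS index.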

Related Issues

RFC: #2401

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Collaborator

@shatejas shatejas left a comment

Will look at the unit tests soon.

Looks good otherwise, minor comments

Signed-off-by: Dooyong Kim <kdooyong@amazon.com>
@0ctopus13prime 0ctopus13prime force-pushed the lucene-on-faiss-part3 branch from 2e7e464 to 517e2c4 Compare March 12, 2025 00:50
Collaborator Author

0ctopus13prime commented Mar 12, 2025

@jmazanec15
Hi Jack, let me share the upper bound on the memory required for the long[] offsets under the current naive approach, and how much that space can be compressed with monotonic encoding.

Conclusion: with the encoding, the required memory shrinks from 7.45GB to 475KB for 1B vectors.

Current Approach
The number of vectors: 1B
Size of offsets required: 7.45GB (1B offsets × 8 bytes per long ≈ 7.45GiB)

With DirectMonotonicWriter
Meta bytes: 163KB
Data bytes: 312KB
Total: 475KB = (163KB + 312KB)

Will make sure to add this encoding in the next PR.
Thank you!
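
For reference, a minimal, self-contained sketch of how Lucene's `DirectMonotonicWriter`/`DirectMonotonicReader` pair can encode and randomly access such offsets. It is scaled down to 1M values, and the fixed 64-byte-per-vector stride and file names are illustrative assumptions, not this PR's actual layout:

```java
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.util.LongValues;
import org.apache.lucene.util.packed.DirectMonotonicReader;
import org.apache.lucene.util.packed.DirectMonotonicWriter;

public class OffsetEncodingSketch {
  public static void main(String[] args) throws Exception {
    final long numValues = 1_000_000;  // scaled down from 1B for the demo
    final int blockShift = 16;         // 2^16 values per block
    try (Directory dir = new ByteBuffersDirectory()) {
      // Write phase: feed monotonically increasing offsets; the writer stores
      // each block as a linear function plus small per-value deltas.
      try (IndexOutput meta = dir.createOutput("offsets.meta", IOContext.DEFAULT);
           IndexOutput data = dir.createOutput("offsets.data", IOContext.DEFAULT)) {
        DirectMonotonicWriter writer =
            DirectMonotonicWriter.getInstance(meta, data, numValues, blockShift);
        for (long i = 0; i < numValues; i++) {
          writer.add(i * 64);  // hypothetical fixed 64-byte neighbor list per vector
        }
        writer.finish();
      }
      // Read phase: random access to any offset without a heap-resident long[].
      try (IndexInput meta = dir.openInput("offsets.meta", IOContext.DEFAULT);
           IndexInput data = dir.openInput("offsets.data", IOContext.DEFAULT)) {
        DirectMonotonicReader.Meta loadedMeta =
            DirectMonotonicReader.loadMeta(meta, numValues, blockShift);
        LongValues offsets = DirectMonotonicReader.getInstance(
            loadedMeta, data.randomAccessSlice(0, data.length()));
        System.out.println(offsets.get(12_345));  // prints 790080 (12345 * 64)
      }
    }
  }
}
```

Near-linear offset sequences compress extremely well under this scheme because the per-block deltas stay tiny, which is consistent with the meta/data sizes measured above.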

Collaborator

@shatejas shatejas left a comment

Looks good, need one clarification and it's good to go

Comment on lines +135 to +137
// Note that maxLevel=3 indicates that a vector exists at level-0 (bottom), level-1 and level-2.
if (maxLevel > level) {
++numVectorsAtLevel;
Collaborator

Just want to make sure the vectors are not repeated at each level, so we don't count them more than once.

Collaborator Author

@0ctopus13prime 0ctopus13prime Mar 12, 2025

There will be no duplicate counts: the index into the levels array (e.g. i) represents an internal vector id, and we advance the index sequentially, so each vector is examined exactly once.
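
To make that concrete, a minimal sketch of the counting loop under assumed field shapes (levels[i] as the max level of internal vector id i; not the PR's actual fields):

```java
// Minimal sketch with assumed shapes: levels[i] holds the max level of internal
// vector id i, so scanning i = 0..n-1 visits each vector exactly once and no
// vector can be counted twice for a given level.
static int countVectorsAtLevel(int[] levels, int level) {
  int numVectorsAtLevel = 0;
  for (int i = 0; i < levels.length; i++) {
    // maxLevel=3 means vector i exists at level-0 (bottom), level-1 and level-2.
    if (levels[i] > level) {
      ++numVectorsAtLevel;
    }
  }
  return numVectorsAtLevel;
}
```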

@0ctopus13prime 0ctopus13prime merged commit a6beb8c into opensearch-project:lucene-on-faiss Mar 12, 2025
34 checks passed