-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Custom Faiss Base Image Build Code #7
base: main
Are you sure you want to change the base?
Add Custom Faiss Base Image Build Code #7
Conversation
Signed-off-by: Rajvaibhav Rahane <rrahane@amazon.com>
Fixed the build-docker-image.sh line 45 to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to have some more formalized and automated testing here, as well as clean up the formatting of everything since we are targeting main with this PR
@@ -0,0 +1,105 @@ | |||
import sys | |||
import os.path | |||
# If you faiss python installed you should skip this line |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're trying to merge to main can we clean up this file a bit? For example the dead comments, print statements, etc.
Separately it would be good to add a lint checking CI action to this repo + some auto formatting similar to the spotless checks in Java.
12. Now to test if everthing is build correctly or not run the below command | ||
``` | ||
python faiss-test.py | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be run automatically as a CI action
@@ -0,0 +1,14 @@ | |||
import numpy as np | |||
|
|||
if __name__ == "__main__": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we create a scripts package for files like this? Also it seems like this is a pre-requisite for running the faiss-test
tests but I don't see that mentioned anywhere
## Creating the image | ||
Below are the steps to create the image that can have a custom faiss python code build. | ||
1. The basic docker file is present in this folder named Dockerfile. | ||
2. Ensure that faiss is checked out before you trigger the build. The commit of faiss I have tested with is: 3c8dc4194907e9b911551d5a009468106f8b9c7f |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be a GH link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make the folder name: custom-faiss-image
@@ -0,0 +1,3 @@ | |||
[submodule "custom-faiss-installed-image/faiss"] | |||
path = custom-faiss-installed-image/faiss |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change this to faiss
+/* | ||
+Adding add function to support add_with_ids functionality from IndexIDMap index type. | ||
+*/ | ||
+void GpuIndexCagra::add(idx_t n, const float* x) { | ||
+ train(n, x); | ||
+} | ||
+ | ||
+ | ||
bool GpuIndexCagra::addImplRequiresIDs_() const { | ||
return false; | ||
}; | ||
diff --git a/faiss/gpu/GpuIndexCagra.h b/faiss/gpu/GpuIndexCagra.h | ||
index d6fae29..d28bc6d 100644 | ||
--- a/faiss/gpu/GpuIndexCagra.h | ||
+++ b/faiss/gpu/GpuIndexCagra.h | ||
@@ -248,6 +248,8 @@ struct GpuIndexCagra : public GpuIndex { | ||
/// Trains CAGRA based on the given vector data | ||
void train(idx_t n, const float* x) override; | ||
|
||
+ void add(idx_t n, const float* x) override; | ||
+ | ||
/// Initialize ourselves from the given CPU index; will overwrite | ||
/// all data in ourselves | ||
void copyFrom(const faiss::IndexHNSWCagra* index); | ||
-- | ||
2.25.1 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all this is not needed. as the changes are merged already in faiss
@@ -0,0 +1,45 @@ | |||
FROM nvidia/cuda:12.1.1-base-ubuntu22.04 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please check other image types rather than ubuntu. I think this base image has a lot of vulnerabilities. Better to pick the rockyLinux base image
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
if __name__ == "__main__": | ||
d = 768 | ||
np.random.seed(1234) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why we need these files
import numpy as np | ||
import time | ||
|
||
from numpy import dtype |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why we are adding these files
Signed-off-by: Rajvaibhav Rahane <rrahane@amazon.com>
Description
Adds the faiss submodule and build scripts to create a custom faiss installed base image to be used by the remote-vector-index-build-service-worker.
Issues Resolved
Resolves #4
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.