Use IndexInput#prefetch in Exact search #2423
Comments
A similar mechanism is being addressed here with searchable snapshots in core, where blocks can be read ahead based on file type. For exact search over flat vector files, access to those files could then be implicitly backed by the read-ahead functionality, which helps in sequential access cases. This can tie up well with
@sohami Thanks for the reference. I would be interested in the low-level RFC/implementation. Currently there are only specific cases where we want prefetch, since it affects search latencies for the Lucene engine (and with partial loading it might affect the Faiss engine as well). It's easy to add a prefetch API to float vector values that uses IndexInput#prefetch, and then call prefetch based on how many vectors you need instead of a predefined block of data.
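To make "prefetch based on how many vectors you need" concrete, here is a minimal sketch of how the byte ranges could be planned. It assumes a flat file in which vector `ord` starts at byte `ord * dimension * Float.BYTES`; the class `VectorPrefetchPlanner` and its `plan` method are hypothetical names, not part of Lucene or the k-NN plugin. Each returned `{offset, length}` pair is what one would pass to `IndexInput#prefetch(offset, length)`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: turns the vector ordinals a query needs into prefetch ranges. */
final class VectorPrefetchPlanner {

    /**
     * Coalesces runs of consecutive ordinals into {offset, length} byte ranges.
     * Assumes sortedOrds is ascending and distinct, and that the file is a flat
     * array of float vectors starting at byte 0 (an assumption of this sketch).
     */
    static List<long[]> plan(int[] sortedOrds, int dimension) {
        long vectorBytes = (long) dimension * Float.BYTES;
        List<long[]> ranges = new ArrayList<>();
        int i = 0;
        while (i < sortedOrds.length) {
            int start = sortedOrds[i];
            int end = start;
            // Extend the run while the next ordinal is adjacent to the current end.
            while (i + 1 < sortedOrds.length && sortedOrds[i + 1] == end + 1) {
                end = sortedOrds[++i];
            }
            ranges.add(new long[] {start * vectorBytes, (end - start + 1) * vectorBytes});
            i++;
        }
        return ranges;
    }
}
```

Coalescing adjacent vectors keeps the number of prefetch calls proportional to the number of contiguous runs rather than the number of vectors, which may matter for filtered search and rescoring where the candidate set is sparse. Whether that is a net win would need to be benchmarked.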
Just to clarify, the read-ahead mechanism is specific to the remote store directory, so that it can pre-download blocks for sequential access. It is not tied to prefetching data into the operating system file cache.
Can you shed more light on this?
Description
Exact search evaluates vectors linearly. Using IndexInput#prefetch to load the next vector into memory could reduce the read cost at runtime and thereby lower latency. Prefetch issues a madvise WILL_NEED system call, and the kernel may use this hint to prefetch a range of bytes asynchronously. We need to benchmark to see whether this yields improvements.
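A minimal sketch of the idea, under stated assumptions: the loop below hints the next vector before scoring the current one. `PrefetchableInput` is a hypothetical stand-in that mirrors only the `prefetch(offset, length)` shape of Lucene's `IndexInput`; `InMemoryInput` and `ExactSearchSketch` are likewise illustrative, and the in-memory implementation just records the hints instead of calling into the kernel.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the parts of an index input this sketch needs. */
interface PrefetchableInput {
    /** Advisory hint that [offset, offset + length) will be read soon. */
    void prefetch(long offset, long length);

    /** Reads one float vector of the given dimension starting at offset. */
    float[] readVector(long offset, int dimension);
}

/** In-memory fake that records prefetch hints so the access pattern is visible. */
final class InMemoryInput implements PrefetchableInput {
    private final float[][] vectors;
    final List<Long> prefetchedOffsets = new ArrayList<>();

    InMemoryInput(float[][] vectors) {
        this.vectors = vectors;
    }

    @Override
    public void prefetch(long offset, long length) {
        prefetchedOffsets.add(offset); // a real input would issue the madvise hint here
    }

    @Override
    public float[] readVector(long offset, int dimension) {
        return vectors[(int) (offset / ((long) dimension * Float.BYTES))];
    }
}

final class ExactSearchSketch {
    /** Returns the ordinal of the vector closest to query by squared L2 distance. */
    static int nearest(PrefetchableInput in, int numVectors, int dimension, float[] query) {
        long vectorBytes = (long) dimension * Float.BYTES;
        int best = -1;
        float bestDist = Float.POSITIVE_INFINITY;
        for (int ord = 0; ord < numVectors; ord++) {
            if (ord + 1 < numVectors) {
                // Hint the next vector so its read can overlap with scoring this one.
                in.prefetch((ord + 1) * vectorBytes, vectorBytes);
            }
            float[] v = in.readVector(ord * vectorBytes, dimension);
            float dist = 0f;
            for (int d = 0; d < dimension; d++) {
                float diff = query[d] - v[d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = ord;
            }
        }
        return best;
    }
}
```

Prefetching only one vector ahead is the simplest variant; the look-ahead distance is a tuning parameter that the proposed benchmark would have to settle.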
Pre-requisites
This can help speed up filtering queries, rescoring, and exact search scripting.