Add vector search documentation #9135

Open
wants to merge 43 commits into
base: main
Changes from 9 commits
Commits
43 commits
4331ce1
Add vector database section
kolchfa-aws Jan 15, 2025
db9e95e
More restructuring
kolchfa-aws Jan 16, 2025
3d93fc8
Layout update
kolchfa-aws Jan 16, 2025
47b235f
Add cards to search topics
kolchfa-aws Jan 21, 2025
2ac67d9
More restructuring
kolchfa-aws Jan 23, 2025
99ac17a
Add images and more rewrites
kolchfa-aws Jan 27, 2025
27107c8
Update settings
kolchfa-aws Jan 28, 2025
f009e68
Formatting update
kolchfa-aws Jan 28, 2025
8779a7c
Resolve merge conflicts
kolchfa-aws Jan 29, 2025
6cdb7b0
Add more explanations to getting started
kolchfa-aws Feb 10, 2025
f922fae
Change about page and remove k-NN terminology
kolchfa-aws Feb 11, 2025
1900e75
unify terminology
kolchfa-aws Feb 11, 2025
a5e8b8d
Review comments
kolchfa-aws Feb 11, 2025
3f5301a
More review comments
kolchfa-aws Feb 12, 2025
18b40af
Resolve merge conflicts
kolchfa-aws Feb 12, 2025
8b8d831
Resolve merge conflicts
kolchfa-aws Feb 12, 2025
d977108
More restructuring
kolchfa-aws Feb 14, 2025
f73a161
Remove table from exact search
kolchfa-aws Feb 14, 2025
8fde05d
Fix links
kolchfa-aws Feb 14, 2025
7f549c4
Merge branch 'main' into vector-restructure
kolchfa-aws Feb 14, 2025
73f4fd7
Remove text from top page
kolchfa-aws Feb 14, 2025
e2a1cf9
Merge branch 'vector-restructure' of https://github.com/opensearch-pr…
kolchfa-aws Feb 14, 2025
0be522f
More updates
kolchfa-aws Feb 17, 2025
63c0d97
Update _query-dsl/specialized/kNN.md
kolchfa-aws Feb 17, 2025
79eeff1
Review comments
kolchfa-aws Feb 17, 2025
9d42335
Fix links
kolchfa-aws Feb 17, 2025
a6a1cfb
Fix links
kolchfa-aws Feb 17, 2025
ad6fb3c
Rename query file
kolchfa-aws Feb 17, 2025
18e10df
Fix links
kolchfa-aws Feb 17, 2025
c007e46
Add sparse vector option
kolchfa-aws Feb 17, 2025
e2a7427
Compress requests
kolchfa-aws Feb 17, 2025
8d4ffe4
Apply suggestions from code review
kolchfa-aws Feb 19, 2025
e19af36
Apply suggestions from code review
kolchfa-aws Feb 19, 2025
8c49459
Apply suggestions from code review
kolchfa-aws Feb 19, 2025
0ca38f0
Editorial comments
kolchfa-aws Feb 20, 2025
e8595c3
Reformat concepts page
kolchfa-aws Feb 20, 2025
a10a187
Resolve merge conflicts
kolchfa-aws Feb 28, 2025
f0a80eb
Redesign landing page
kolchfa-aws Mar 1, 2025
bdefd99
More refactoring
kolchfa-aws Mar 1, 2025
97487f0
Editorial comments
kolchfa-aws Mar 3, 2025
403142b
Add auto workflow
kolchfa-aws Mar 3, 2025
f802024
Last editorial comments
kolchfa-aws Mar 3, 2025
dff4a5b
Resolve merge conflicts
kolchfa-aws Mar 3, 2025
6 changes: 6 additions & 0 deletions _config.yml
@@ -124,6 +124,9 @@ collections:
   workspace:
     permalink: /:collection/:path/
     output: true
+  vector-search:
+    permalink: /:collection/:path/
+    output: true

 opensearch_collection:
   # Define the collections used in the theme
@@ -173,6 +176,9 @@ opensearch_collection:
     search-plugins:
       name: Search features
       nav_fold: true
+    vector-search:
+      name: Vector search
+      nav_fold: true
     ml-commons-plugin:
       name: Machine learning
       nav_fold: true
14 changes: 7 additions & 7 deletions _field-types/supported-field-types/knn-vector.md
@@ -1,17 +1,17 @@
---
layout: default
title: k-NN vector
-nav_order: 58
+nav_order: 20
has_children: false
parent: Supported field types
has_math: true
---

-# k-NN vector field type
+# k-NN vector
**Introduced 1.0**
{: .label .label-purple }

-The [k-NN plugin]({{site.url}}{{site.baseurl}}/search-plugins/knn/index/) introduces a custom data type, the `knn_vector`, that allows users to ingest their k-NN vectors into an OpenSearch index and perform different kinds of k-NN search. The `knn_vector` field is highly configurable and can serve many different k-NN workloads. In general, a `knn_vector` field can be built either by providing a method definition or specifying a model id.
+The `knn_vector` data type allows you to ingest vectors into an OpenSearch index and perform different kinds of vector search. The `knn_vector` field is highly configurable and can serve many different vector workloads. In general, a `knn_vector` field can be built either by providing a method definition or specifying a model id.

## Example

@@ -53,7 +53,7 @@
| `in_memory` (Default) | `nmslib` | Prioritizes low-latency search. This mode uses the `nmslib` engine without any quantization applied. It is configured with the default parameter values for vector search in OpenSearch. |
| `on_disk` | `faiss` | Prioritizes low-cost vector search while maintaining strong recall. By default, the `on_disk` mode uses quantization and rescoring to execute a two-pass approach to retrieve the top neighbors. The `on_disk` mode supports only `float` vector types. |

-To create a k-NN index that uses the `on_disk` mode for low-cost search, send the following request:
+To create a vector index that uses the `on_disk` mode for low-cost search, send the following request:

```json
PUT test-index
@@ -130,7 +130,7 @@

## Method definitions
Review comment (Member):

Above method definitions, can we put mode/compression level example? Basically flow for users should be default > method/compression param tuning > method tuning > model ids


-[Method definitions]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#method-definitions) are used when the underlying [approximate k-NN]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/) algorithm does not require training. For example, the following `knn_vector` field specifies that *nmslib*'s implementation of *hnsw* should be used for approximate k-NN search. During indexing, *nmslib* will build the corresponding *hnsw* segment files.
+[Method definitions]({{site.url}}{{site.baseurl}}/vector-search/creating-vector-index/method/) are used when the underlying [approximate k-NN]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/) algorithm does not require training. For example, the following `knn_vector` field specifies that NMSLIB's implementation of HNSW should be used for approximate k-NN search. During indexing, NMSLIB will build the corresponding HNSW segment files.

Check failure on line 133 in _field-types/supported-field-types/knn-vector.md (GitHub Actions / vale)

[OpenSearch.Spelling] Error: NMSLIB's. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks. (line 133, column 308; severity: ERROR)

```json
"my_vector": {
@@ -150,7 +150,7 @@

## Model IDs

-Model IDs are used when the underlying Approximate k-NN algorithm requires a training step. As a prerequisite, the model must be created with the [Train API]({{site.url}}{{site.baseurl}}/search-plugins/knn/api#train-a-model). The
+Model IDs are used when the underlying approximate k-NN algorithm requires a training step. As a prerequisite, the model must be created with the [Train API]({{site.url}}{{site.baseurl}}/search-plugins/knn/api#train-a-model). The
model contains the information needed to initialize the native library segment files.

```json
@@ -180,7 +180,7 @@
When using `byte` vectors, expect some loss of precision in the recall compared to using `float` vectors. Byte vectors are useful in large-scale applications and use cases that prioritize a reduced memory footprint in exchange for a minimal loss of recall.
{: .important}

-When using `byte` vectors with the `faiss` engine, we recommend using [SIMD optimization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#simd-optimization-for-the-faiss-engine), which helps to significantly reduce search latencies and improve indexing throughput.
+When using `byte` vectors with the `faiss` engine, we recommend using [SIMD optimization]({{site.url}}{{site.baseurl}}/vector-search/creating-vector-index/vector-field/#simd-optimization-for-the-faiss-engine), which helps to significantly reduce search latencies and improve indexing throughput.
{: .important}

Introduced in k-NN plugin version 2.9, the optional `data_type` parameter defines the data type of a vector. The default value of this parameter is `float`.
56 changes: 15 additions & 41 deletions _includes/cards.html
@@ -1,43 +1,17 @@
<div class="card-container-wrapper">
<p class="heading-main">Explore OpenSearch documentation</p>
<div class="card-container">
<div class="card">
<a href="{{site.url}}{{site.baseurl}}/about/" class='card-link'></a>
<p class="heading">OpenSearch and OpenSearch Dashboards</p>
<p class="description">Build your OpenSearch solution using core tooling and visualizations</p>
<p class="last-link">Documentation &#x2192;</p>
</div>


<div class="card">
<a href="{{site.url}}/docs/latest/data-prepper/" class='card-link'></a>
<p class="heading">OpenSearch Data Prepper</p>
<p class="description">Filter, mutate, and sample your data for ingestion into OpenSearch</p>
<p class="last-link" >Documentation &#x2192;</p>
</div>

<div class="card">
<a href="{{site.url}}/docs/latest/clients/" class='card-link'></a>
<p class="heading">Clients</p>
<p class="description">Interact with OpenSearch from your application using language APIs</p>
<p class="last-link">Documentation &#x2192;</p>
</div>


<div class="card">
<a href="{{site.url}}/docs/latest/benchmark/" class='card-link'></a>
<p class="heading">OpenSearch Benchmark</p>
<p class="description">Measure performance metrics for your OpenSearch cluster</p>
<p class="last-link">Documentation &#x2192;</p>
</div>

<div class="card">
<a href="{{site.url}}/docs/latest/migration-assistant/" class='card-link'></a>
<p class="heading">Migration Assistant</p>
<p class="description">Migrate to OpenSearch</p>
<p class="last-link">Documentation &#x2192;</p>
</div>
<div class="card-container">
{% for card in include.cards %}
<div class="card">
<a href="{{ site.url }}{{ site.baseurl }}{{ card.link }}" class="card-link"></a>
<p class="heading">{{ card.heading }}</p>
{% if card.description %}
<p class="description">{{ card.description }}</p>
{% endif %}
{% if include.documentation_link %}
<p class="last-link">Documentation &#x2192;</p>
{% endif %}
</div>
{% endfor %}
</div>
</div>

</div>
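
The refactored `cards.html` include reads its data from `include.cards` and shows the "Documentation" link only when `include.documentation_link` is set. A minimal sketch of a calling page follows. The page title, the front matter key `cards`, and the two card entries are assumptions made for illustration and are not taken from this PR.

```liquid
---
layout: default
title: Example page            # hypothetical page
cards:                         # hypothetical key; any front matter list works
  - heading: "Vector search"
    description: "Use vector database capabilities for more relevant search results."
    link: "/vector-search/"
  - heading: "Machine learning"   # no description: the template skips that <p>
    link: "/ml-commons-plugin/"
---

{% comment %} Pass the list and opt in to the "Documentation" link {% endcomment %}
{% include cards.html cards=page.cards documentation_link=true %}
```

Because the template prepends `site.url` and `site.baseurl` to `card.link`, each `link` value in this sketch is a site-relative path rather than a full URL.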


72 changes: 72 additions & 0 deletions _includes/home_cards.html
@@ -0,0 +1,72 @@
<div class="home-card-container-wrapper">
<p class="heading-main">OpenSearch and OpenSearch Dashboards</p>
<div class="home-card-container">
<div class="home-card">
<a href="{{site.url}}{{site.baseurl}}/about/" class='card-link'></a>
<p class="heading">All documentation</p>
<p class="description">Build your OpenSearch solution using core tooling and visualizations.</p>
<p class="last-link">Documentation &#x2192;</p>
</div>


<div class="home-card">
<a href="{{site.url}}{{site.baseurl}}/vector-search/" class='card-link'></a>
<p class="heading">Vector search</p>
<p class="description">Use vector database capabilities for more relevant search results.</p>
<p class="last-link" >Documentation &#x2192;</p>
</div>

<div class="home-card">
<a href="{{site.url}}{{site.baseurl}}/ml-commons-plugin/" class='card-link'></a>
<p class="heading">Machine learning</p>
<p class="description">Power your applications with machine learning model integration.</p>
<p class="last-link">Documentation &#x2192;</p>
</div>


<div class="home-card">
<a href="{{site.url}}{{site.baseurl}}/dashboards/" class='card-link'></a>
<p class="heading">OpenSearch Dashboards</p>
<p class="description">Explore and visualize your data using interactive dashboards.</p>
<p class="last-link">Documentation &#x2192;</p>
</div>
</div>

</div>

<div class="home-card-container-wrapper">
<p class="heading-main">Supporting tools</p>
<div class="home-card-container">

<div class="home-card">
<a href="{{site.url}}/docs/latest/data-prepper/" class='card-link'></a>
<p class="heading">Data Prepper</p>
<p class="description">Filter, mutate, and sample your data for ingestion into OpenSearch.</p>
<p class="last-link" >Documentation &#x2192;</p>
</div>

<div class="home-card">
<a href="{{site.url}}/docs/latest/clients/" class='card-link'></a>
<p class="heading">Clients</p>
<p class="description">Interact with OpenSearch from your application using language APIs.</p>
<p class="last-link">Documentation &#x2192;</p>
</div>


<div class="home-card">
<a href="{{site.url}}/docs/latest/benchmark/" class='card-link'></a>
<p class="heading">OpenSearch Benchmark</p>
<p class="description">Measure performance metrics for your OpenSearch cluster.</p>
Review comment (Collaborator):

"Measure OpenSearch cluster performance metrics"?

<p class="last-link">Documentation &#x2192;</p>
</div>

<div class="home-card">
<a href="{{site.url}}/docs/latest/migration-assistant/" class='card-link'></a>
<p class="heading">Migration Assistant</p>
<p class="description">Migrate to OpenSearch.</p>
<p class="last-link">Documentation &#x2192;</p>
</div>
</div>

</div>

22 changes: 22 additions & 0 deletions _includes/list.html
@@ -0,0 +1,22 @@
<div class="numbered-list">
{% if include.list_title %}
<div class="heading">{{ include.list_title }}</div>
{% endif %}
{% assign counter = 0 %}
{% for item in include.list_items %}
{% assign counter = counter | plus: 1 %}
<div class="list-item">
<div class="number-circle">{{ counter }}</div>
<div class="list-content">
<div class="list-heading">
{% if item.link %}
<a href="{{ site.url }}{{ site.baseurl }}{{ item.link }}">{{ item.heading }}</a>
{% else %}
{{ item.heading }}
{% endif %}
</div>
<p class="description">{{ item.description | markdownify }}</p>
</div>
</div>
{% endfor %}
</div>
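
The new `list.html` include renders `include.list_items` as numbered steps, with an optional `include.list_title` above them, and passes each `description` through `markdownify`. A minimal sketch of a calling page follows. The front matter key `steps`, the title text, and the step contents are hypothetical and only illustrate the expected shape of the data.

```liquid
---
layout: default
title: Example page            # hypothetical page
steps:                         # hypothetical key; any front matter list works
  - heading: "Create an index"
    description: "Create an index that contains a `knn_vector` field."
    link: "/vector-search/creating-vector-index/"   # optional; heading becomes a link
  - heading: "Search your data"
    description: "Run a vector search query."       # no link: heading is plain text
---

{% comment %} Items are numbered 1, 2, ... by the counter inside list.html {% endcomment %}
{% include list.html list_title="Get started" list_items=page.steps %}
```

Since descriptions are run through `markdownify`, inline Markdown such as backticks in this sketch should render as formatted text inside the list item.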
2 changes: 1 addition & 1 deletion _ml-commons-plugin/custom-local-models.md
@@ -320,7 +320,7 @@ The response contains the tokens and weights:

## Step 5: Use the model for search

-To learn how to use the model for vector search, see [Using an ML model for neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/#using-an-ml-model-for-neural-search).
+To learn how to use the model for vector search, see [ML-powered search methods]({{site.url}}{{site.baseurl}}/vector-search/ml-powered-search/index/#ml-powered-search-methods).

## Question answering models

2 changes: 1 addition & 1 deletion _ml-commons-plugin/remote-models/index.md
@@ -323,7 +323,7 @@ To learn how to use the model for batch ingestion in order to improve ingestion

## Step 7: Use the model for search

-To learn how to use the model for vector search, see [Using an ML model for neural search]({{site.url}}{{site.baseurl}}/search-plugins/neural-search/#using-an-ml-model-for-neural-search).
+To learn how to use the model for vector search, see [ML-powered search methods]({{site.url}}{{site.baseurl}}/vector-search/ml-powered-search/index/#ml-powered-search-methods).

## Step 8 (Optional): Undeploy the model

81 changes: 77 additions & 4 deletions _sass/_home.scss
@@ -22,11 +22,16 @@

 // Card style

-.card-container-wrapper {
+.home-card-container-wrapper {
   @include gradient-open-sky;
+  margin-bottom: 2rem;
 }

-.card-container {
+.card-container-wrapper {
+  margin-bottom: 0;
+}
+
+.home-card-container {
   display: grid;
   grid-template-columns: 1fr;
   margin: 0 auto;
@@ -42,11 +47,27 @@
   }
 }

-.card {
+.card-container {
+  display: grid;
+  grid-template-columns: 1fr;
+  margin: 0 auto;
+  padding: 2rem 0;
+  grid-row-gap: 1rem;
+  grid-column-gap: 1rem;
+  grid-auto-rows: 1fr;
+  @include mq(md) {
+    grid-template-columns: repeat(1, 1fr);
+  }
+  @include mq(lg) {
+    grid-template-columns: repeat(2, 1fr);
+  }
+}
+
+.home-card {
   @extend .panel;
   @include thick-edge-left;
   padding: 1rem;
-  margin-bottom: 4rem;
+  margin-bottom: 2rem;
   text-align: left;
   background-color: white;
   display: flex;
@@ -67,6 +88,11 @@
   }
 }

+.card {
+  @extend .home-card;
+  margin-bottom: 0;
+}
+
 @mixin heading-font {
   @include heading-sans-serif;
   font-size: 1.5rem;
@@ -110,6 +136,53 @@
   width: 100%;
 }

+// List layout
+
+.numbered-list {
+  display: flex;
+  flex-direction: column;
+  gap: 2rem;
+  padding: 1rem;
+}
+
+.list-item {
+  display: flex;
+  align-items: flex-start;
+  gap: 1rem;
+}
+
+.number-circle {
+  width: 2.5rem;
+  height: 2.5rem;
+  border-radius: 50%;
+  background-color: $blue-lt-100;
+  color: $blue-dk-300;
+  display: flex;
+  align-items: center;
+  justify-content: center;
+  font-weight: bold;
+  font-size: 1.2rem;
+  flex-shrink: 0;
+}
+
+.list-content {
+  max-width: 100%;
+}
+
+.list-heading {
+  @include heading-font;
+  margin: 0 0 0.75rem 0;
+  font-size: 1.2rem;
+  color: $blue-dk-300;
+  font-weight: bold;
+}
+
+.list-content p {
+  margin: 0.5rem 0;
+  font-size: 1rem;
+  line-height: 1.5;
+}
+
 // Banner style

 .os-banner {