
Commit c9fd1a2

Merge pull request #2698 from owaiskazi19/flow-framework
[Blogpost] Configurable Automation for OpenSearch ML Use Cases
2 parents 2c43c6c + 0de3cff commit c9fd1a2

6 files changed, +153 -3 lines changed

_community_members/amitgalitz.md

+1-1
Original file line numberDiff line numberDiff line change
@@ -20,4 +20,4 @@ personas:
2020
permalink: '/community/members/amit-galitzky.html'
2121
---
2222

23-
**Amit Galitzky** is a software engineer at Amazon Web Services. He focuses mostly on the Anomaly Detection plugin for OpenSearch.
23+
**Amit Galitzky** is a software engineer at Amazon Web Services focusing mostly on the OpenSearch Anomaly Detection and Flow Framework plugins.

_community_members/hnyng.md

+1-1
@@ -20,4 +20,4 @@ personas:
2020
permalink: '/community/members/jackie-han.html'
2121
---
2222

23-
Jackie Han is a software engineer at AWS. focusing mostly on anomaly detection in OpenSearch.
23+
Jackie Han is a software engineer at Amazon Web Services focusing mostly on the OpenSearch Anomaly Detection and Flow Framework plugins.

_community_members/jpalis.md

+23
@@ -0,0 +1,23 @@
1+
---
2+
name: Josh Palis
3+
short_name: jpalis
4+
photo: '/assets/media/community/members/jpalis.jpg'
5+
title: 'OpenSearch Community Member: Josh Palis'
6+
primary_title: Josh Palis
7+
breadcrumbs:
8+
icon: community
9+
items:
10+
- title: Community
11+
url: /community/index.html
12+
- title: Members
13+
url: /community/members/index.html
14+
- title: "Josh Palis's Profile"
15+
url: '/community/members/josh-palis.html'
16+
github: joshpalis
17+
job_title_and_company: 'Software engineer at Amazon Web Services'
18+
personas:
19+
- author
20+
permalink: '/community/members/josh-palis.html'
21+
---
22+
23+
**Josh Palis** is a software engineer at Amazon Web Services focusing mostly on the OpenSearch Flow Framework plugin.

_community_members/ohltyler.md

+1-1
@@ -20,4 +20,4 @@ personas:
2020
permalink: '/community/members/tyler-ohlsen.html'
2121
---
2222

23-
**Tyler Ohlsen** is a Software Engineer at AWS, focusing on anomaly detection in OpenSearch.
23+
**Tyler Ohlsen** is a software engineer at Amazon Web Services focusing mostly on the OpenSearch Anomaly Detection and Flow Framework plugins.
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,127 @@
1+
---
2+
layout: post
3+
title: "Configurable automation for OpenSearch ML use cases"
4+
authors:
5+
- kazabdu
6+
- amitgalitz
7+
- dwiddis
8+
- jpalis
9+
- hnyng
10+
- ohltyler
11+
- minalsha
12+
date: 2024-04-08
13+
categories:
14+
- technical-posts
15+
meta_keywords: Flow Framework, OpenSearch plugins, Machine Learning
16+
meta_description: Explore the simplicity of integrating Machine Learning capabilities within OpenSearch through an innovative and groundbreaking framework designed to simplify complex setup tasks.
17+
---
18+
19+
In OpenSearch, to use machine learning (ML) offerings, such as semantic, hybrid, and multimodal search, you often have to grapple with complex setup and preprocessing tasks. Additionally, you must write verbose queries, which can be a time-consuming and error-prone process.
20+
21+
In this blog post, we introduce the OpenSearch Flow Framework plugin, [released in version 2.13](https://opensearch.org/blog/2.13-is-ready-for-download/) and designed to streamline this cumbersome process. With this plugin, you can complete a complex setup in a single API call. We've provided automated templates that create connectors, register and deploy models, and register agents and tools in one request. This eliminates the need to call multiple APIs and orchestrate the setup based on their responses.
22+
23+
## Before the Flow Framework plugin
24+
25+
Previously, setting up semantic search involved *four separate API calls*, outlined in the [semantic search documentation](https://opensearch.org/docs/latest/search-plugins/semantic-search/):
26+
27+
1. Create a connector for a remote model, specifying pre- and post-processing functions.
28+
2. Register an embedding model using the connector ID obtained in the previous step.
29+
3. Configure an ingest pipeline to generate vector embeddings using the model ID of the registered model.
30+
4. Create a k-NN index and add the pipeline created in the previous step.
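
To make the dependency chain in the list above concrete, here is a minimal Python sketch that lists the four calls in order as method and path pairs. It sends nothing to a cluster. The function name and the pipeline and index names are hypothetical, and the paths are the ML Commons, ingest, and index endpoints as we understand them from the linked semantic search documentation, so verify them against your OpenSearch version. Request bodies are omitted.

```python
# Sketch only: lists the manual setup calls in order; nothing is sent.
# The pipeline and index names passed in are hypothetical placeholders.

def manual_semantic_search_calls(pipeline_name, index_name):
    """Return the ordered (method, path) pairs for the manual setup."""
    return [
        # 1. Create a connector for the remote model.
        ("POST", "/_plugins/_ml/connectors/_create"),
        # 2. Register the embedding model; needs the connector ID from call 1.
        ("POST", "/_plugins/_ml/models/_register"),
        # 3. Create an ingest pipeline; needs the model ID from call 2.
        ("PUT", f"/_ingest/pipeline/{pipeline_name}"),
        # 4. Create the k-NN index; references the pipeline from call 3.
        ("PUT", f"/{index_name}"),
    ]

calls = manual_semantic_search_calls("nlp-ingest-pipeline", "my-nlp-index")
for method, path in calls:
    print(method, path)
```

Each call depends on an ID or name produced by the one before it, which is the orchestration burden described in this section.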
31+
32+
This complex setup required you to be familiar with the OpenSearch ML Commons APIs. However, we are simplifying this experience through the Flow Framework plugin. Let's demonstrate how the plugin simplifies this process using the preceding semantic search example.
33+
34+
## With the Flow Framework plugin
35+
36+
In this example, you will configure the `semantic_search_with_cohere_embedding_query_enricher` workflow template. The workflow created using this template performs the following configuration steps:
37+
38+
* Deploys an externally hosted Cohere model
39+
* Creates an ingest pipeline using the model
40+
* Creates a sample k-NN index and configures a search pipeline to define the default model ID for that index
41+
42+
### Step 1: Create and provision the workflow
43+
44+
Using the `semantic_search_with_cohere_embedding_query_enricher` workflow template, you provision the workflow with just one required field---the API key for the Cohere Embed model:
45+
46+
```json
47+
POST /_plugins/_flow_framework/workflow?use_case=semantic_search_with_cohere_embedding_query_enricher&provision=true
48+
{
49+
"create_connector.credential.key" : "<YOUR API KEY>"
50+
}
51+
```
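
If you script this call, the request can be assembled as in the following minimal Python sketch. The helper name `build_provision_request` is hypothetical and not part of the plugin. The sketch only builds the method, path, and JSON body shown above, so you can pass them to the HTTP client of your choice.

```python
import json
from urllib.parse import urlencode

def build_provision_request(use_case, api_key):
    """Build the create-and-provision request for a workflow template."""
    # provision=true asks the plugin to provision the workflow immediately.
    query = urlencode({"use_case": use_case, "provision": "true"})
    path = f"/_plugins/_flow_framework/workflow?{query}"
    # The API key is the only required field for this template.
    body = {"create_connector.credential.key": api_key}
    return "POST", path, json.dumps(body)

method, path, body = build_provision_request(
    "semantic_search_with_cohere_embedding_query_enricher",
    "<YOUR API KEY>",
)
print(method, path)
print(body)
```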
52+
53+
OpenSearch responds with a unique workflow ID, simplifying the tracking and management of the setup process:
54+
55+
```json
56+
{
57+
"workflow_id" : "8xL8bowB8y25Tqfenm50"
58+
}
59+
```
60+
61+
Note: The workflow in the previous step creates a default k-NN index. The default index name is `my-nlp-index`.
62+
63+
You can customize the template default values by providing the new values in the request body. For a comprehensive list of default parameter values for this workflow template, see [Cohere Embed semantic search defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-embedding-semantic-search-defaults.json).
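
As a rough illustration of overriding a default, the following Python sketch merges extra values into the request body. The helper name `provision_body` is hypothetical, and the key `create_index.name` is an assumption based on the naming pattern of `create_connector.credential.key`. Confirm the exact key names in the linked defaults file before relying on them.

```python
def provision_body(api_key, overrides=None):
    """Return the request body: the required API key plus any overrides."""
    body = {"create_connector.credential.key": api_key}
    # Values supplied here replace the template defaults with the same key.
    body.update(overrides or {})
    return body

# ASSUMPTION: "create_index.name" is the key for the index name default.
# Check the defaults file for the real key before using this.
body = provision_body("<YOUR API KEY>", {"create_index.name": "my-custom-index"})
print(body)
```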
64+
65+
### Step 2: Ingest documents into the index
66+
67+
Once the workflow is provisioned, you can ingest documents into the index created by the workflow:
68+
69+
```json
70+
POST /my-nlp-index/_doc
71+
{
72+
"passage_text": "Hello world",
73+
"id": "s1"
74+
}
75+
```
76+
77+
### Step 3: Perform vector search
78+
79+
Performing a vector search on the index is equally straightforward. Using a neural query clause, you can easily retrieve relevant results:
80+
81+
```json
82+
GET /my-nlp-index/_search
83+
{
84+
"_source": {
85+
"excludes": [
86+
"passage_embedding"
87+
]
88+
},
89+
"query": {
90+
"neural": {
91+
"passage_embedding": {
92+
"query_text": "Hi world",
93+
"k": 10
94+
}
95+
}
96+
}
97+
}
98+
```
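
If you build this query in application code, a small helper keeps the embedding field name in one place. The following Python sketch is illustrative only, and the function name `neural_query` is hypothetical. It produces the same request body as the example above. No model ID is passed because the workflow's search pipeline defines the default model ID for the index.

```python
import json

def neural_query(query_text, k=10, embedding_field="passage_embedding"):
    """Build a neural search body that omits the embedding from results."""
    return {
        # Exclude the vector itself from the returned documents.
        "_source": {"excludes": [embedding_field]},
        "query": {
            "neural": {
                embedding_field: {"query_text": query_text, "k": k}
            }
        },
    }

query = neural_query("Hi world")
print(json.dumps(query, indent=2))
```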
99+
100+
With the Flow Framework plugin, we've simplified this complex setup process, enabling you to focus on your tasks without the burden of navigating complex APIs. Our goal is for you to use OpenSearch seamlessly, uncovering new possibilities in your projects.
101+
102+
## Viewing workflow resources
103+
104+
The workflow you created provisioned all the necessary resources for semantic search. To view the provisioned resources, call the Get Workflow Status API and provide the `workflow_id` returned when you created the workflow:
105+
106+
```
107+
GET /_plugins/_flow_framework/workflow/8xL8bowB8y25Tqfenm50/_status
108+
```
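
In a script, you can derive the status path directly from the provisioning response. The following Python sketch assumes the response body is the JSON object with a `workflow_id` field shown in Step 1. The helper name `status_path` is hypothetical.

```python
import json

def status_path(provision_response_text):
    """Return the Get Workflow Status path for a provisioning response."""
    workflow_id = json.loads(provision_response_text)["workflow_id"]
    return f"/_plugins/_flow_framework/workflow/{workflow_id}/_status"

path = status_path('{"workflow_id": "8xL8bowB8y25Tqfenm50"}')
print("GET", path)
```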
109+
110+
## Additional default use cases
111+
112+
You can explore more default use cases by viewing [substitution templates](https://github.com/opensearch-project/flow-framework/tree/2.13/src/main/resources/substitutionTemplates) and their corresponding [defaults](https://github.com/opensearch-project/flow-framework/tree/2.13/src/main/resources/defaults).
113+
114+
## Creating custom use cases
115+
116+
You can tailor templates according to your requirements. For more information, see [these sample templates](https://github.com/opensearch-project/flow-framework/tree/main/sample-templates) and the [Automating configurations](https://opensearch.org/docs/latest/automating-configurations/index/) documentation.
117+
118+
## Next steps
119+
120+
We are continuing to streamline the process of provisioning OpenSearch ML offerings. Next on our roadmap is a user-friendly drag-and-drop frontend interface that will simplify the steps involved in provisioning ML features, allowing you to seamlessly configure and deploy your workflows. Stay tuned for updates!
121+
122+
If you have any comments or suggestions, you can comment on the following RFCs or open an issue in the following GitHub repositories:
123+
124+
- [Backend RFC](https://github.com/opensearch-project/OpenSearch/issues/9213)
125+
- [Frontend RFC](https://github.com/opensearch-project/OpenSearch-Dashboards/issues/4755)
126+
- [Flow Framework GitHub repository](https://github.com/opensearch-project/flow-framework)
127+
- [Flow Framework Dashboards GitHub repository](https://github.com/opensearch-project/dashboards-flow-framework)