Commit 93cdf77: add offline batch inference connector blueprints
Signed-off-by: Xun Zhang <xunzh@amazon.com>
1 parent 1c43be5

2 files changed: +303 -0
### OpenAI connector blueprint example for batch inference

Read more details on https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/

Integrate the OpenAI Batch API using the connector below with the new action type "batch_predict".
For more details about the OpenAI Batch API, refer to https://platform.openai.com/docs/guides/batch/overview.

#### 1. Create your Model connector and Model group

##### 1a. Register Model group
```json
POST /_plugins/_ml/model_groups/_register
{
  "name": "openAI_model_group",
  "description": "Your openAI model group"
}
```
The response returns the `model_group_id`; note it down.
Sample response:
```json
{
  "model_group_id": "IMobmY8B8aiZvtEZeO_i",
  "status": "CREATED"
}
```

##### 1b. Create Connector
```json
POST /_plugins/_ml/connectors/_create
{
  "name": "OpenAI Embedding model",
  "description": "OpenAI embedding model for testing offline batch",
  "version": "1",
  "protocol": "http",
  "parameters": {
    "model": "text-embedding-ada-002",
    "input_file_id": "file-YbowBByiyVJN89oSZo2Enu9W",
    "endpoint": "/v1/embeddings"
  },
  "credential": {
    "openAI_key": "<your openAI key>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://api.openai.com/v1/embeddings",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"input\": ${parameters.input}, \"model\": \"${parameters.model}\" }",
      "pre_process_function": "connector.pre_process.openai.embedding",
      "post_process_function": "connector.post_process.openai.embedding"
    },
    {
      "action_type": "batch_predict",
      "method": "POST",
      "url": "https://api.openai.com/v1/batches",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"input_file_id\": \"${parameters.input_file_id}\", \"endpoint\": \"${parameters.endpoint}\", \"completion_window\": \"24h\" }"
    }
  ]
}
```
To obtain the `input_file_id` used in the connector, prepare your batch file and upload it to the OpenAI service through the Files API. Refer to the [Files API documentation](https://platform.openai.com/docs/api-reference/files).

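For reference, here is a minimal upload sketch in Python, assuming the `requests` package; the file name `batch_input.jsonl` and the request line inside it are illustrative, not part of the blueprint:

```python
# Sketch: upload a batch input file to OpenAI and obtain its file id.
import requests

OPENAI_API_KEY = "<your openAI key>"

# Each line of the .jsonl file is one request, for example:
# {"custom_id": "doc-1", "method": "POST", "url": "/v1/embeddings",
#  "body": {"model": "text-embedding-ada-002", "input": "hello world"}}
with open("batch_input.jsonl", "rb") as f:
    resp = requests.post(
        "https://api.openai.com/v1/files",
        headers={"Authorization": f"Bearer {OPENAI_API_KEY}"},
        data={"purpose": "batch"},
        files={"file": f},
    )
resp.raise_for_status()
print(resp.json()["id"])  # "file-...", used as input_file_id in the connector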
#### Sample response
```json
{
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```

### 2. Register the model to the model group and link the created connector

Use the `model_group_id` from step 1a and the `connector_id` from step 1b:
```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "OpenAI model for realtime embedding and offline batch inference",
  "function_name": "remote",
  "model_group_id": "IMobmY8B8aiZvtEZeO_i",
  "description": "OpenAI text embedding model",
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```
Sample response:
```json
{
  "task_id": "rMormY8B8aiZvtEZIO_j",
  "status": "CREATED",
  "model_id": "lyjxwZABNrAVdFa9zrcZ"
}
```
### 3. Test offline batch inference using the connector

```json
POST /_plugins/_ml/models/lyjxwZABNrAVdFa9zrcZ/_batch_predict
{
  "parameters": {
    "model": "text-embedding-ada-002"
  }
}
```
Sample response:
```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "id": "batch_khFSJIzT0eev9PuxVDsIGxv6",
            "object": "batch",
            "endpoint": "/v1/embeddings",
            "errors": null,
            "input_file_id": "file-YbowBByiyVJN89oSZo2Enu9W",
            "completion_window": "24h",
            "status": "validating",
            "output_file_id": null,
            "error_file_id": null,
            "created_at": 1722037257,
            "in_progress_at": null,
            "expires_at": 1722123657,
            "finalizing_at": null,
            "completed_at": null,
            "failed_at": null,
            "expired_at": null,
            "cancelling_at": null,
            "cancelled_at": null,
            "request_counts": {
              "total": 0,
              "completed": 0,
              "failed": 0
            },
            "metadata": null
          }
        }
      ],
      "status_code": 200
    }
  ]
}
```
For the definition of each field in the result, refer to https://platform.openai.com/docs/guides/batch.
Once the batch is complete, retrieve the batch by its "id" field to find the "output_file_id", then download the output by making a request against the OpenAI Files API.
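A minimal sketch of that flow in Python, again assuming the `requests` package; the batch id is the one from the sample response above:

```python
# Sketch: poll the batch by id, then download its output file once complete.
import requests

OPENAI_API_KEY = "<your openAI key>"
headers = {"Authorization": f"Bearer {OPENAI_API_KEY}"}

batch = requests.get(
    "https://api.openai.com/v1/batches/batch_khFSJIzT0eev9PuxVDsIGxv6",
    headers=headers,
).json()

if batch["status"] == "completed":
    # Each line of the downloaded .jsonl file holds one embedding response.
    output = requests.get(
        f"https://api.openai.com/v1/files/{batch['output_file_id']}/content",
        headers=headers,
    )
    with open("batch_output.jsonl", "wb") as f:
        f.write(output.content)
```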

### SageMaker connector blueprint example for batch inference

Read more details on https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/

Integrate the SageMaker Batch Transform API using the connector below with the new action type "batch_predict".
For more details on using batch transform to run inference with Amazon SageMaker, refer to https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html.

#### 1. Create your Model connector and Model group

##### 1a. Register Model group
```json
POST /_plugins/_ml/model_groups/_register
{
  "name": "sagemaker_model_group",
  "description": "Your sagemaker model group"
}
```
The response returns the `model_group_id`; note it down.
Sample response:
```json
{
  "model_group_id": "IMobmY8B8aiZvtEZeO_i",
  "status": "CREATED"
}
```

##### 1b. Create Connector
```json
POST /_plugins/_ml/connectors/_create
{
  "name": "DJL Sagemaker Connector: all-MiniLM-L6-v2",
  "version": "1",
  "description": "The connector to sagemaker embedding model all-MiniLM-L6-v2",
  "protocol": "aws_sigv4",
  "credential": {
    "access_key": "<your access_key>",
    "secret_key": "<your secret_key>",
    "session_token": "<your session_token>"
  },
  "parameters": {
    "region": "us-east-1",
    "service_name": "sagemaker",
    "DataProcessing": {
      "InputFilter": "$.content",
      "JoinSource": "Input",
      "OutputFilter": "$"
    },
    "ModelName": "DJL-Text-Embedding-Model-imageforjsonlines",
    "TransformInput": {
      "ContentType": "application/json",
      "DataSource": {
        "S3DataSource": {
          "S3DataType": "S3Prefix",
          "S3Uri": "s3://offlinebatch/sagemaker_djl_batch_input.json"
        }
      },
      "SplitType": "Line"
    },
    "TransformJobName": "SM-offline-batch-transform-07-12-13-30",
    "TransformOutput": {
      "AssembleWith": "Line",
      "Accept": "application/json",
      "S3OutputPath": "s3://offlinebatch/output"
    },
    "TransformResources": {
      "InstanceCount": 1,
      "InstanceType": "ml.c5.xlarge"
    },
    "BatchStrategy": "SingleRecord"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "content-type": "application/json"
      },
      "url": "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/OpenSearch-sagemaker-060124023703/invocations",
      "request_body": "${parameters.input}",
      "pre_process_function": "connector.pre_process.default.embedding",
      "post_process_function": "connector.post_process.default.embedding"
    },
    {
      "action_type": "batch_predict",
      "method": "POST",
      "headers": {
        "content-type": "application/json"
      },
      "url": "https://api.sagemaker.us-east-1.amazonaws.com/CreateTransformJob",
      "request_body": "{ \"BatchStrategy\": \"${parameters.BatchStrategy}\", \"ModelName\": \"${parameters.ModelName}\", \"DataProcessing\" : ${parameters.DataProcessing}, \"TransformInput\": ${parameters.TransformInput}, \"TransformJobName\" : \"${parameters.TransformJobName}\", \"TransformOutput\" : ${parameters.TransformOutput}, \"TransformResources\" : ${parameters.TransformResources}}"
    }
  ]
}
```
SageMaker batch transform supports data processing through a subset of JSONPath operators and can associate inference results with their input records.
Refer to this [AWS doc](https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform-data-processing.html); a short illustration follows.

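As a rough illustration, the Python sketch below walks a hypothetical input record through the `DataProcessing` settings in this connector; per the AWS data-processing doc, with `JoinSource: Input` on JSON data the prediction is joined under a `SageMakerOutput` key:

```python
# Hypothetical walk-through of the DataProcessing settings in this connector.
input_record = {"id": 1, "content": "hello world"}

# InputFilter "$.content": only this field is sent to the model.
model_input = input_record["content"]

# Hypothetical embedding returned by the model for this record.
prediction = [0.12, -0.03, 0.57]

# JoinSource "Input": the prediction is joined back onto the input record;
# for JSON data it appears under a "SageMakerOutput" key.
joined = {**input_record, "SageMakerOutput": prediction}

# OutputFilter "$": keep the entire joined record in the output.
output_record = joined
print(output_record)
```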
#### Sample response
```json
{
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```

### 2. Register the model to the model group and link the created connector

Use the `model_group_id` from step 1a and the `connector_id` from step 1b:
```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "SageMaker model for realtime embedding and offline batch inference",
  "function_name": "remote",
  "model_group_id": "IMobmY8B8aiZvtEZeO_i",
  "description": "SageMaker hosted DJL model",
  "connector_id": "XU5UiokBpXT9icfOM0vt"
}
```
Sample response:
```json
{
  "task_id": "rMormY8B8aiZvtEZIO_j",
  "status": "CREATED",
  "model_id": "lyjxwZABNrAVdFa9zrcZ"
}
```
### 3. Test offline batch inference using the connector

Parameters passed in this request override the connector defaults, so you can set a fresh `TransformJobName` for each run:
```json
POST /_plugins/_ml/models/lyjxwZABNrAVdFa9zrcZ/_batch_predict
{
  "parameters": {
    "TransformJobName": "SM-offline-batch-transform-07-15-11-30"
  }
}
```
Sample response:
```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "job_arn": "arn:aws:sagemaker:us-east-1:802041417063:transform-job/SM-offline-batch-transform"
          }
        }
      ],
      "status_code": 200
    }
  ]
}
```
The "job_arn" is returned immediately from this request; you can use it to check the job status in the SageMaker service. Once the job is done, your batch inference results are available in the S3 location specified in the "S3OutputPath" field of your connector.
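A minimal status-check sketch in Python, assuming the `boto3` package and AWS credentials allowed to call SageMaker:

```python
# Sketch: poll the transform job by name until it reaches a terminal state.
import time

import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")

while True:
    job = sm.describe_transform_job(
        TransformJobName="SM-offline-batch-transform-07-15-11-30"
    )
    status = job["TransformJobStatus"]  # InProgress | Completed | Failed | Stopped
    print(status)
    if status in ("Completed", "Failed", "Stopped"):
        break
    time.sleep(30)
# On "Completed", the results are in the S3OutputPath of the connector,
# s3://offlinebatch/output in this blueprint.
```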
