| 1 | +### SageMaker connector blueprint example for batch inference |
| 2 | + |
| 3 | +See https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/ for more details on connector blueprints.
| 4 | + |
| 5 | +The connector below integrates the SageMaker Batch Transform API through a new action type, `batch_predict`.
| 6 | +See https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html for details on running inference with Amazon SageMaker batch transform.
| 7 | + |
| 8 | +#### 1. Create your model group and connector
| 9 | +
| 10 | +##### 1a. Register model group
| 11 | +```json |
| 12 | +POST /_plugins/_ml/model_groups/_register |
| 13 | +{ |
| 14 | + "name": "sagemaker_model_group", |
| 15 | + "description": "Your sagemaker model group" |
| 16 | +} |
| 17 | +``` |
| 18 | +The response returns the `model_group_id`. Note it down for use in step 2.
| 19 | +Sample response: |
| 20 | +```json |
| 21 | +{ |
| 22 | + "model_group_id": "IMobmY8B8aiZvtEZeO_i", |
| 23 | + "status": "CREATED" |
| 24 | +} |
| 25 | +``` |
| 26 | + |
| 27 | +##### 1b. Create Connector |
| 28 | +```json |
| 29 | +POST /_plugins/_ml/connectors/_create |
| 30 | +{ |
| 31 | + "name": "DJL Sagemaker Connector: all-MiniLM-L6-v2", |
| 32 | + "version": "1", |
| 33 | + "description": "The connector to sagemaker embedding model all-MiniLM-L6-v2", |
| 34 | + "protocol": "aws_sigv4", |
| 35 | + "credential": { |
| 36 | + "access_key": "<your access_key>", |
| 37 | + "secret_key": "<your secret_key>", |
| 38 | + "session_token": "<your session_token>" |
| 39 | + }, |
| 40 | + "parameters": { |
| 41 | + "region": "us-east-1", |
| 42 | + "service_name": "sagemaker", |
| 43 | + "DataProcessing": { |
| 44 | + "InputFilter": "$.content", |
| 45 | + "JoinSource": "Input", |
| 46 | + "OutputFilter": "$" |
| 47 | + }, |
| 48 | + "ModelName": "DJL-Text-Embedding-Model-imageforjsonlines", |
| 49 | + "TransformInput": { |
| 50 | + "ContentType": "application/json", |
| 51 | + "DataSource": { |
| 52 | + "S3DataSource": { |
| 53 | + "S3DataType": "S3Prefix", |
| 54 | + "S3Uri": "s3://offlinebatch/sagemaker_djl_batch_input.json" |
| 55 | + } |
| 56 | + }, |
| 57 | + "SplitType": "Line" |
| 58 | + }, |
| 59 | + "TransformJobName": "SM-offline-batch-transform-07-12-13-30", |
| 60 | + "TransformOutput": { |
| 61 | + "AssembleWith": "Line", |
| 62 | + "Accept": "application/json", |
| 63 | + "S3OutputPath": "s3://offlinebatch/output" |
| 64 | + }, |
| 65 | + "TransformResources": { |
| 66 | + "InstanceCount": 1, |
| 67 | + "InstanceType": "ml.c5.xlarge" |
| 68 | + }, |
| 69 | + "BatchStrategy": "SingleRecord" |
| 70 | + }, |
| 71 | + "actions": [ |
| 72 | + { |
| 73 | + "action_type": "predict", |
| 74 | + "method": "POST", |
| 75 | + "headers": { |
| 76 | + "content-type": "application/json" |
| 77 | + }, |
| 78 | + "url": "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/OpenSearch-sagemaker-060124023703/invocations", |
| 79 | + "request_body": "${parameters.input}", |
| 80 | + "pre_process_function": "connector.pre_process.default.embedding", |
| 81 | + "post_process_function": "connector.post_process.default.embedding" |
| 82 | + }, |
| 83 | + { |
| 84 | + "action_type": "batch_predict", |
| 85 | + "method": "POST", |
| 86 | + "headers": { |
| 87 | + "content-type": "application/json" |
| 88 | + }, |
| 89 | + "url": "https://api.sagemaker.us-east-1.amazonaws.com/CreateTransformJob", |
| 90 | + "request_body": "{ \"BatchStrategy\": \"${parameters.BatchStrategy}\", \"ModelName\": \"${parameters.ModelName}\", \"DataProcessing\" : ${parameters.DataProcessing}, \"TransformInput\": ${parameters.TransformInput}, \"TransformJobName\" : \"${parameters.TransformJobName}\", \"TransformOutput\" : ${parameters.TransformOutput}, \"TransformResources\" : ${parameters.TransformResources}}" |
| 91 | + } |
| 92 | + ] |
| 93 | +} |
| 94 | +``` |
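| 95 | +SageMaker batch transform supports data processing through a subset of the JSONPath operators, and it can associate inference results with their input records. In the connector above, `InputFilter` selects the `content` field of each record as the model input, `JoinSource` joins each inference result with its input record, and `OutputFilter` keeps the whole joined record.
| 96 | +For details, refer to this [AWS doc](https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform-data-processing.html).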
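
To make the `DataProcessing` settings concrete, here is a minimal local sketch of what they do to a single JSON Lines record. It is an illustration only, not SageMaker code: `apply_data_processing` and `fake_embedding` are hypothetical names, the sample records are made up, and the `SageMakerOutput` key follows the joined-output format described in the AWS doc linked above.

```python
import json


def apply_data_processing(record: dict, model_fn) -> dict:
    """Mimic InputFilter="$.content", JoinSource="Input", OutputFilter="$" for one record."""
    # InputFilter "$.content": only the "content" field is sent to the model.
    model_input = record["content"]
    inference = model_fn(model_input)
    # JoinSource "Input": the inference is attached to a copy of the input record.
    joined = dict(record)
    joined["SageMakerOutput"] = inference
    # OutputFilter "$": the whole joined record is written to the output.
    return joined


def fake_embedding(text: str) -> list:
    """Hypothetical stand-in for the embedding model (not a real embedding)."""
    return [float(len(text)), 0.0]


# Two made-up JSON Lines records, matching SplitType "Line".
input_lines = [
    '{"id": 1, "content": "hello world"}',
    '{"id": 2, "content": "batch inference"}',
]
output_lines = [
    json.dumps(apply_data_processing(json.loads(line), fake_embedding))
    for line in input_lines
]
print("\n".join(output_lines))
```

Because the input record is kept alongside the inference, each output line can be matched back to the record that produced it without relying on line order.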
| 97 | + |
| 98 | +Sample response:
| 99 | +```json |
| 100 | +{ |
| 101 | + "connector_id": "XU5UiokBpXT9icfOM0vt" |
| 102 | +} |
| 103 | +``` |
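
The `request_body` of the `batch_predict` action is a template whose `${parameters.*}` placeholders are filled from the connector's `parameters`. The sketch below shows one way such a substitution could work, to help verify that the template produces valid JSON for `CreateTransformJob`. It is not the ML Commons implementation: `render_request_body` is a hypothetical helper, and the assumption that parameters passed in a request override the connector defaults is inferred from step 3, which supplies a new `TransformJobName`.

```python
import json
import re

# The request_body template from the batch_predict action above.
TEMPLATE = (
    '{ "BatchStrategy": "${parameters.BatchStrategy}", '
    '"ModelName": "${parameters.ModelName}", '
    '"DataProcessing": ${parameters.DataProcessing}, '
    '"TransformInput": ${parameters.TransformInput}, '
    '"TransformJobName": "${parameters.TransformJobName}", '
    '"TransformOutput": ${parameters.TransformOutput}, '
    '"TransformResources": ${parameters.TransformResources} }'
)


def render_request_body(template: str, parameters: dict) -> str:
    """Replace each ${parameters.name} placeholder with the matching value."""
    def substitute(match):
        value = parameters[match.group(1)]
        # Strings are inserted as-is (the template supplies the quotes);
        # objects and numbers are serialized as JSON.
        return value if isinstance(value, str) else json.dumps(value)

    return re.sub(r"\$\{parameters\.(\w+)\}", substitute, template)


connector_parameters = {
    "BatchStrategy": "SingleRecord",
    "ModelName": "DJL-Text-Embedding-Model-imageforjsonlines",
    "DataProcessing": {"InputFilter": "$.content", "JoinSource": "Input", "OutputFilter": "$"},
    "TransformInput": {
        "ContentType": "application/json",
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://offlinebatch/sagemaker_djl_batch_input.json",
            }
        },
        "SplitType": "Line",
    },
    "TransformJobName": "SM-offline-batch-transform-07-12-13-30",
    "TransformOutput": {
        "AssembleWith": "Line",
        "Accept": "application/json",
        "S3OutputPath": "s3://offlinebatch/output",
    },
    "TransformResources": {"InstanceCount": 1, "InstanceType": "ml.c5.xlarge"},
}

# Assumption: parameters sent with the request take precedence over connector defaults.
request_parameters = {"TransformJobName": "SM-offline-batch-transform-07-15-11-30"}
merged = {**connector_parameters, **request_parameters}

# json.loads raises an error if the rendered body is not valid JSON.
body = json.loads(render_request_body(TEMPLATE, merged))
print(json.dumps(body, indent=2))
```

Note that string parameters are quoted in the template while object parameters are not, which is why the sketch inserts strings verbatim and serializes everything else.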
| 104 | + |
| 105 | +#### 2. Register the model to the model group and link the created connector
| 106 | + |
| 107 | +```json |
| 108 | +POST /_plugins/_ml/models/_register?deploy=true |
| 109 | +{ |
| 110 | + "name": "SageMaker model for realtime embedding and offline batch inference", |
| 111 | + "function_name": "remote", |
| 112 | + "model_group_id": "IMobmY8B8aiZvtEZeO_i", |
| 113 | + "description": "SageMaker hosted DJL model", |
| 114 | + "connector_id": "XU5UiokBpXT9icfOM0vt" |
| 115 | +} |
| 116 | +``` |
| 117 | +Sample response: |
| 118 | +```json |
| 119 | +{ |
| 120 | + "task_id": "rMormY8B8aiZvtEZIO_j", |
| 121 | + "status": "CREATED", |
| 122 | + "model_id": "lyjxwZABNrAVdFa9zrcZ" |
| 123 | +} |
| 124 | +``` |
| 125 | +#### 3. Test offline batch inference using the connector
| 126 | + |
| 127 | +```json |
| 128 | +POST /_plugins/_ml/models/lyjxwZABNrAVdFa9zrcZ/_batch_predict
| 129 | +{ |
| 130 | + "parameters": { |
| 131 | + "TransformJobName": "SM-offline-batch-transform-07-15-11-30" |
| 132 | + } |
| 133 | +} |
| 134 | +``` |
| 135 | +Sample response: |
| 136 | +```json |
| 137 | +{ |
| 138 | + "inference_results": [ |
| 139 | + { |
| 140 | + "output": [ |
| 141 | + { |
| 142 | + "name": "response", |
| 143 | + "dataAsMap": { |
| 144 | + "job_arn": "arn:aws:sagemaker:us-east-1:802041417063:transform-job/SM-offline-batch-transform" |
| 145 | + } |
| 146 | + } |
| 147 | + ], |
| 148 | + "status_code": 200 |
| 149 | + } |
| 150 | + ] |
| 151 | +} |
| 152 | +``` |
| 153 | +The `job_arn` is returned immediately from this request, and you can use it to check the job status
| 154 | +in the SageMaker service. Once the job completes, the batch inference results are available in the S3 location
| 155 | +specified by the `S3OutputPath` field in your connector.
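
One way to check the job status programmatically is to poll the SageMaker `DescribeTransformJob` API. The sketch below assumes a boto3 SageMaker client and uses the `TransformJobName` sent in the request above; `wait_for_transform_job` is a hypothetical helper, and the polling interval and limit are arbitrary choices.

```python
import time

# Statuses after which a transform job no longer changes state.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_transform_job(client, job_name, poll_seconds=30.0, max_polls=120):
    """Poll DescribeTransformJob until the job reaches a terminal status."""
    for _ in range(max_polls):
        response = client.describe_transform_job(TransformJobName=job_name)
        status = response["TransformJobStatus"]
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"Transform job {job_name} did not finish in time")


# Example usage (requires boto3 and AWS credentials):
#
#   import boto3
#   client = boto3.client("sagemaker", region_name="us-east-1")
#   status = wait_for_transform_job(client, "SM-offline-batch-transform-07-15-11-30")
#   print(status)
```

If the returned status is `Completed`, the results can be read from the `S3OutputPath` location. If it is `Failed`, the `FailureReason` field of the `DescribeTransformJob` response describes the cause.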