[RFC] Support Python in OS Scripting Service #17432

yuancu · 2025-02-24T10:04:14Z

Introduction

This RFC proposes adding Python 3 as a supported language in the OpenSearch Scripting Service, in particular to complement Painless scripts, which some users now consider ‘painful’.

Python is widely recognized as a simple yet powerful language, especially within the data science community. By integrating Python, OpenSearch aims to broaden its appeal to users who rely on Python for data processing and analytical tasks.

Background & Motivation

OpenSearch currently supports several scripting languages, such as Painless, Mustache, and Expressions. While each has merits, each also has its own learning curve and may feel unfamiliar to Python users. Python’s ecosystem offers extensive data processing, machine learning, and analytical libraries. Enabling Python scripts within OpenSearch will reduce adoption barriers and empower a broader segment of the community to write custom logic for tasks such as scoring documents, executing specialized aggregations, and customizing ingestion pipelines.

Proposed Solution

Overview

The proposal is to implement a Python script plugin that integrates with the existing Scripting Service. This plugin will allow users to evaluate Python scripts at runtime under various contexts.

Below is a high-level flowchart illustrating how Python scripting will interact with the existing OpenSearch architecture:

flowchart LR
    A[Client] --"`{#quot;source#quot;: #quot;sum(doc['ratings'])#quot;,<br>#quot;lang#quot;: #quot;python#quot;}`" --> B("ScriptService<br>(Coordinating Node)")
    B -- Dispatch --> D(PythonScriptPlugin)
    D -- Compile &cache --> H(Shard Execution)
    H --> I("Aggregation<br>(Coordinating Node)")
    I -- Return results --> A
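
The dispatch and compile-and-cache steps in the flowchart can be sketched in plain Python. This is an illustration only: `ScriptService`, `PythonEngine`, and the cache layout are hypothetical names rather than the OpenSearch API, and the actual plugin would run scripts through GraalVM instead of the built-in `compile` and `eval`.

```python
# Hypothetical sketch of "dispatch -> compile & cache -> execute".
# None of these class names come from OpenSearch.


class PythonEngine:
    """Compiles an expression once and reuses the resulting code object."""

    def __init__(self):
        self._cache = {}       # source text -> compiled code object
        self.compilations = 0  # number of cache misses so far

    def compile(self, source):
        code = self._cache.get(source)
        if code is None:
            code = compile(source, "<script>", "eval")
            self._cache[source] = code
            self.compilations += 1
        return code

    def execute(self, source, doc, params=None):
        code = self.compile(source)
        # Expose only the names the script needs (this is NOT a real sandbox).
        scope = {"__builtins__": {"sum": sum, "len": len},
                 "doc": doc, "params": params or {}}
        return eval(code, scope)


class ScriptService:
    """Routes a script request to the engine registered for its language."""

    def __init__(self):
        self._engines = {"python": PythonEngine()}

    def run(self, request, doc):
        engine = self._engines.get(request["lang"])
        if engine is None:
            raise ValueError(f"unsupported script language: {request['lang']}")
        return engine.execute(request["source"], doc, request.get("params"))


service = ScriptService()
request = {"source": "sum(doc['ratings'])", "lang": "python"}
# The same script runs against two documents but is compiled only once.
totals = [service.run(request, {"ratings": r}) for r in ([4, 3, 5], [5, 5, 5])]
print(totals)
```

Keying the cache on the script source means that repeated executions across documents pay the compilation cost only once, which is the point of the "Compile & cache" step.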

Implementation Approaches

We have identified two primary implementation strategies:

Parsing and Translating Python Code

The Python code could be parsed into an intermediate form (e.g. Calcite’s logical plan) that OpenSearch can convert into its native execution plan.
- Pros: The translation into a native representation is under the maintainers' full control, which naturally brings greater predictability and stronger security.
- Cons: This approach requires a more complex and extensive development effort, and only a limited subset of Python’s capabilities can be implemented.

Direct Execution as Guest Language (GraalVM)

Using GraalVM’s Polyglot API, Python code can run directly within the OpenSearch process, with a Python runtime hosted inside the same JVM. This solution embeds Python as a guest language, enabling execution of custom Python scripts.
- Pros: Python’s full power is available: users can leverage Python’s standard library and third-party packages, enabling more complex and more robust data processing. This approach is also more straightforward than implementing and maintaining a custom Python language translator.
- Cons: Running a Python runtime inside the JVM increases the attack surface and may introduce resource overhead. Strong sandboxing and resource usage limits are critical.

A PoC has demonstrated the feasibility of the second approach with GraalVM.
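
For comparison, the first approach (parse and translate) can be sketched with Python's standard `ast` module. The nested-tuple "plan" format, the operator names, and the `translate` function below are hypothetical stand-ins for a real intermediate representation such as a Calcite logical plan. Only a handful of node types are handled, which also illustrates why this route supports a limited subset of Python.

```python
import ast

# Hypothetical mapping from Python operators to plan operator names.
BIN_OPS = {ast.Add: "plus", ast.Sub: "minus", ast.Mult: "times", ast.Div: "divide"}
ALLOWED_CALLS = {"sum", "len", "min", "max"}


def translate(node):
    """Translate a small subset of Python expressions into nested tuples.

    Requires Python 3.9+, where a subscript index is stored directly in
    node.slice instead of being wrapped in ast.Index.
    """
    if isinstance(node, ast.Expression):
        return translate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return ("literal", node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
        return (BIN_OPS[type(node.op)], translate(node.left), translate(node.right))
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in ALLOWED_CALLS and not node.keywords):
        return ("call", node.func.id, *[translate(arg) for arg in node.args])
    if (isinstance(node, ast.Subscript) and isinstance(node.value, ast.Name)
            and node.value.id in ("doc", "params")
            and isinstance(node.slice, ast.Constant)
            and isinstance(node.slice.value, str)):
        kind = "field" if node.value.id == "doc" else "param"
        return (kind, node.slice.value)
    # Anything outside the supported subset is rejected rather than guessed at.
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


source = "sum(doc['ratings']) / len(doc['ratings']) * params['factor']"
plan = translate(ast.parse(source, mode="eval"))
print(plan)
```

Because every construct must be mapped explicitly, anything not on the list (imports, attribute access, arbitrary calls) fails at translation time, which is where the predictability and security benefits of this approach come from.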

Demo

Demo1: Custom Scoring with Python

This demo shows how to compute a score as the average of a book's ratings using a Python script.

Create an index called “books” and insert three books into it:

POST /_bulk
{"create": {"_index": "books", "_id": 1}}
{"name":"Beneath the Wheel", "ratings":[4,3,5]}
{"create": {"_index": "books", "_id": 2}}
{"name":"Faust", "ratings":[5,5,5]}
{"create": {"_index": "books", "_id": 3}}
{"name":"The Odyssey", "ratings":[2,1,5]}

Store a Python script called agg_ratings.
```
PUT /_scripts/agg_ratings
{
  "script": {
      "lang": "python",
      "source": "sum(doc['ratings']) / len(doc['ratings']) * params['factor']"
  }
}
```
The script takes the average of a book's ratings and multiplies it by a factor, which is passed in through the query parameters.

Execute the script under the score context. The score context runs a script as if the script were in a script_score function in a function_score query.

POST /books/_search
{
  "query": {
    "function_score": {
      "script_score": {
        "script": {
          "id": "agg_ratings",
          "params": {
            "factor": 2.0
          }
        }
      }
    }
  }
}

The params object specifies the factor as 2.0, which will scale the average ratings to a 0–10 range.

A sample response might look as follows:

{
    "took": 330,
    "timed_out": false,
    "_shards": {...},
    "hits": {
        "total": {"value": 3, "relation": "eq"},
        "max_score": 10.0,
        "hits": [
            {
                "_index": "books",
                "_id": "2",
                "_score": 10.0,
                "_source": {
                    "name": "Faust",
                    "ratings": [5, 5, 5]
                }
            },
            {
                "_index": "books",
                "_id": "1",
                "_score": 8.0,
                "_source": {
                    "name": "Beneath the Wheel",
                    "ratings": [4, 3, 5]
                }
            },
            {
                "_index": "books",
                "_id": "3",
                "_score": 5.3333335,
                "_source": {
                    "name": "The Odyssey",
                    "ratings": [2, 1, 5]
                }
            }
        ]
    }
}

Here, _score is the average of the document’s ratings multiplied by the specified factor of 2.0. This confirms that the Python script correctly evaluates the provided documents and parameters.
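
The scores above can be reproduced outside OpenSearch as a quick sanity check. The sketch below evaluates the stored expression against plain dictionaries standing in for `doc` and `params`. The `score` helper is an illustrative name, and a real score context exposes document values through OpenSearch rather than through a dict.

```python
# The three documents from the bulk request above.
books = [
    {"name": "Beneath the Wheel", "ratings": [4, 3, 5]},
    {"name": "Faust", "ratings": [5, 5, 5]},
    {"name": "The Odyssey", "ratings": [2, 1, 5]},
]
params = {"factor": 2.0}


def score(doc, params):
    # Same expression as the stored agg_ratings script.
    return sum(doc["ratings"]) / len(doc["ratings"]) * params["factor"]


# Sort by descending score, as the search response does.
ranked = sorted(((score(b, params), b["name"]) for b in books), reverse=True)
for value, name in ranked:
    print(f"{name}: {value:.4f}")
```

The ordering and values match the sample response. The last score evaluates to 16/3 here, while the response shows 5.3333335, which is consistent with scores being returned as 32-bit floats.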

Demo2: Post-processing tensor output in neural search

Neural search uses language models to transform document text into vector embeddings, which improves semantic search quality. It supports externally hosted models for embedding documents; this tutorial explains the process in more detail. However, different model vendors return tensors wrapped in different formats. Historically, users have had to write Painless scripts to transform the data into a unified format that the document ingestion pipeline recognizes. In this demonstration, we use Python to process responses from the Bedrock Cohere embed-english model. The steps below follow the standard way of connecting to externally hosted models and are adapted from this blueprint; we alter only the post-processing part to use a custom Python script. Irrelevant parts are omitted for brevity.

Create a connector for Amazon Bedrock

POST /_plugins/_ml/connectors/_create
{
    "name": "Amazon Bedrock Connector: Cohere embed-english-v3",
    ...
    "parameters": {
        "region": "us-east-1",
        "service_name": "bedrock",
        "truncate": "END",
        "input_type": "search_document",
        "model": "cohere.embed-english-v3"
    },
    "actions": [
        {
            "action_type": "predict",
            ...
            "url": "https://bedrock-runtime.${parameters.region}.amazonaws.com/model/${parameters.model}/invoke",
            "request_body": "{ \"texts\": ${parameters.texts}, \"truncate\": \"${parameters.truncate}\", \"input_type\": \"${parameters.input_type}\" }",
            "pre_process_function": "\n    StringBuilder builder = new StringBuilder();\n    builder.append(\"[\");\n    for (int i=0; i< params.text_docs.length; i++) {\n        builder.append(\"\\\"\");\n        builder.append(params.text_docs[i]);\n        builder.append(\"\\\"\");\n        if (i < params.text_docs.length - 1) {\n          builder.append(\",\")\n        }\n    }\n    builder.append(\"]\");\n    def parameters = \"{\" +\"\\\"prompt\\\":\" + builder + \"}\";\n    return  \"{\" +\"\\\"parameters\\\":\" + parameters + \"}\";",
            "pre_process_lang": "painless",
            "post_process_function": "import json\nNone if not doc['embeddings'] else json.dumps([{'name':'sentence_embedding','data_type':'FLOAT32','shape':[len(x)],'data':x} for x in doc['embeddings']])",
            "post_process_lang": "python"
        }
    ]
}

In the above example:

pre_process_function: Utilizes a Painless script to prepare the request payload for the model.
post_process_function: Uses a Python script to transform the returned embeddings into JSON objects that include metadata such as name, data_type, and shape.

The Python script is shown below:

import json
None if not doc['embeddings'] else json.dumps([{'name':'sentence_embedding', 'data_type':'FLOAT32', 'shape':[len(x)], 'data':x} for x in doc['embeddings']])

This script unpacks the returned list of tensors into JSON objects with their corresponding metadata, which can then be used by downstream components in the ingestion pipeline.
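
To make the transformation concrete, the same logic can be run locally against a mocked response. The `doc` dictionary below is a made-up stand-in with two tiny vectors, not real Bedrock output, and `post_process` is an illustrative wrapper. In the connector, the script appears to yield the value of its final expression instead of returning from a function.

```python
import json


def post_process(doc):
    """Wrap each embedding vector with name, data_type, and shape metadata."""
    if not doc["embeddings"]:
        return None
    return json.dumps([
        {"name": "sentence_embedding", "data_type": "FLOAT32",
         "shape": [len(x)], "data": x}
        for x in doc["embeddings"]
    ])


# Mocked model response; the sample response in this RFC has 1024-dimensional
# vectors, shortened here to three values each for readability.
doc = {"embeddings": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]}
result = json.loads(post_process(doc))
print(len(result), result[0]["shape"])
```

Each input vector becomes one JSON object whose `shape` is derived from the vector length, so the script does not need to know the model's embedding dimension in advance.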

Note:

The ml-commons plugin has been modified to support the optional pre_process_lang and post_process_lang parameters for this proof-of-concept.
In the current release of ml-commons, built-in support for certain Cohere models is available via connector.pre_process.cohere.embedding and connector.post_process.cohere.embedding. This demonstration uses custom scripts for illustrative purposes and to verify correctness.

Generate embeddings with custom post-processing

POST /_plugins/_ml/models/<MODEL_ID>/_predict
{
  "parameters": {
    "texts" : ["Hello world", "This is a test"]
  }
}

The <MODEL_ID> is an identifier for the external model generated from previous steps. The response is as follows:

{
    "inference_results": [
        {
            "output": [
                {
                    "name": "sentence_embedding",
                    "data_type": "FLOAT32",
                    "shape": [1024],
                    "data": [-0.029205322, -0.02357483, ...]
                },
                {
                    "name": "sentence_embedding",
                    "data_type": "FLOAT32",
                    "shape": [1024],
                    "data": [-0.013885498, 0.009994507,...]
                }
            ],
            "status_code": 200
        }
    ]
}

The embeddings here have been post-processed by the Python script to provide standardized metadata alongside the raw tensor data.

Python packages

Python’s built-in libraries are bundled with GraalVM’s Polyglot Python runtime. Third-party Python packages can be configured with the GraalPy Gradle plugin by specifying package names and versions in build.gradle:

// An example of including numpy as a dependency
graalPy {
  packages = ["numpy==1.26.4"]
  ...
}

GraalPy is compatible with common Python packages such as NumPy and pandas. Please consult GraalPy package compatibility for the list of supported Python packages.

Security and Compatibility

Security (varies based on implementation)

Sandboxing: GraalVM offers security mechanisms like sandboxing and host access control out of the box. We will need to scrutinize them to ensure the extended capability aligns with the security guidelines of OpenSearch.
Malicious scripts: Multiple approaches have been discussed to reduce the risk of malicious scripts
- Import restrictions: only whitelisted packages may be imported
- No I/O access: access to files and the network is forbidden
- Fine-grained syntactical / behavioral whitelist: only whitelisted syntax or behaviors are allowed
- Please feel free to propose more measures to enhance security
Resource management: Python scripts should be subject to resource usage limits (e.g., CPU and memory) to ensure they do not disrupt cluster stability.
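
As one illustration of the import-restriction idea above, a script could be statically checked before it is compiled. The sketch below uses Python's standard `ast` module. The `ALLOWED_MODULES` set and the `check_imports` function are hypothetical, and the RFC does not specify which packages would qualify. A static check like this is not a sandbox on its own (for example, it does not catch dynamic imports through `__import__`), so it would complement rather than replace GraalVM's host access controls.

```python
import ast

# Hypothetical allowlist; the RFC does not specify which packages qualify.
ALLOWED_MODULES = {"json", "math"}


def check_imports(source):
    """Return the sorted top-level module names that the script may not import."""
    rejected = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) are rejected outright.
            names = ["<relative>"] if node.level else [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # "os.path" is judged by "os"
            if root not in ALLOWED_MODULES:
                rejected.add(root)
    return sorted(rejected)


ok_script = "import json\njson.dumps([1, 2, 3])"
bad_script = "import os, json\nfrom subprocess import run\nrun(['ls'])"
print(check_imports(ok_script))
print(check_imports(bad_script))
```

Walking the whole tree rather than only the top-level statements also catches imports nested inside functions or conditionals.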

GraalVM Compatibility

GraalVM’s polyglot API can run on various Java runtimes, including OpenJDK, GraalVM Community Edition, and Oracle JDK. This should cover most use cases. On runtimes other than GraalVM, execution can be optimized further by enabling the following experimental VM options:

-XX:+UnlockExperimentalVMOptions
-XX:+EnableJVMCI

model-collapse · 2025-02-24T13:18:26Z

This is a fantastic proposal. We all agree that Python can definitely help OpenSearch break into lots of potential areas. Scripts are considered a lightweight interface for our users to do customizations, and Python, as a popular language, will change users' impression that Painless scripts are 'painful'.

msfroh · 2025-02-28T00:58:42Z

I think this is a really interesting idea.

I'm glad that you're considering the security implications! I see that as the biggest obstacle that we'll need to overcome to make this a reality. The GraalVM sandboxing could be a promising start, but we'll definitely need to be very careful that we don't introduce a new vector for attackers.

smacrakis · 2025-02-28T22:10:23Z

I love the idea of using Python rather than Painless -- as you say, it is better-known and has far more capabilities.
Besides using Python in the core engine, I've been thinking that it might be a great language for the UI (dashboard), where I can imagine it giving a huge boost to user-written extensions.

epugh · 2025-03-04T21:47:49Z

From my perspective, the real value of adding Python is when we go big. My gut feeling is that adding Python just as a bare bones syntax replacement for Painless, while worthwhile in itself isn't the big win. We'll just be fielding requests to "support this Python library" and "can I make this call out?"

The big win is more meta... It's when we can tell our Data Scientist colleagues that "We Appreciate YOU and Care About YOU", and we're expressing that by bringing your number 1 tool to the table: Python.

The next great search engine is the one that the Data Scientist community embraces wholeheartedly, and we want that to be OpenSearch. We want OpenSearch to be a tool they naturally reach for, just like Jupyter Notebooks, Pandas etc. That means supporting Python as a first class citizen up and down the stack.

Yes, there are engineering challenges and we need to embrace them, not shy away from them. Someone will embrace them, why not us?

model-collapse · 2025-03-05T01:51:57Z

@epugh @smacrakis we will definitely raise another RFC on that, after we finish a short demo.

model-collapse · 2025-03-13T09:19:40Z

We also want to save users the effort of writing import xxx. @yuancu, let's think about that, for example by moving the import clauses into some configuration.

github-actions bot added the untriaged label Feb 24, 2025

yyfamazon mentioned this issue Mar 13, 2025

[RFC] Introducing JupyterNotebook into OpenSearch Dashboards opensearch-project/OpenSearch-Dashboards#9537

Open

gaobinlong assigned yuancu Mar 13, 2025

gaobinlong added the enhancement Enhancement or improvement to existing feature or request label Mar 13, 2025

gaobinlong removed the untriaged label Mar 13, 2025
