This repository serves as a final report summarizing my contributions during Google Summer of Code 2023 with OpenVINO. The project involved exposing the OpenVINO Runtime API in Java using the JNI framework and adding support for OpenVINO Runtime in John Snow Labs Spark NLP, a high-performance NLP library.
Spark NLP is an open-source NLP library, widely used in production, that offers simple, performant, and accurate NLP annotations for machine learning pipelines and scales easily in distributed environments. It provides an enterprise-grade, unified solution with thousands of pretrained models and pipelines and a broad set of NLP features for building end-to-end pipelines, and it fits seamlessly into existing data processing workflows by extending Apache Spark natively. Written in Scala, Spark NLP supports Python, R, and the JVM ecosystem (Java, Scala, and Kotlin). Currently, it offers CPU optimization via Intel-optimized TensorFlow and ONNX Runtime, and supports importing custom models in the TensorFlow SavedModel and ONNX formats.
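To make the "extends Apache Spark natively" point concrete, here is a minimal Scala sketch (not taken from this repository) of a Spark NLP pipeline; the pretrained model and column names are illustrative.

```scala
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.Tokenizer
import com.johnsnowlabs.nlp.embeddings.BertEmbeddings
import org.apache.spark.ml.Pipeline
import org.apache.spark.sql.SparkSession

object PipelineSketch extends App {
  val spark = SparkSession.builder()
    .appName("spark-nlp-pipeline-sketch")
    .master("local[*]")
    .getOrCreate()

  // Spark NLP annotators are ordinary Spark ML stages, so they compose into a
  // standard Pipeline and scale on any Spark cluster.
  val documentAssembler = new DocumentAssembler()
    .setInputCol("text")
    .setOutputCol("document")

  val tokenizer = new Tokenizer()
    .setInputCols("document")
    .setOutputCol("token")

  // Downloads a default pretrained BERT model from the Spark NLP model hub.
  val embeddings = BertEmbeddings.pretrained()
    .setInputCols("document", "token")
    .setOutputCol("embeddings")

  val pipeline = new Pipeline()
    .setStages(Array(documentAssembler, tokenizer, embeddings))

  val data = spark
    .createDataFrame(Seq(Tuple1("Spark NLP runs natively on Apache Spark.")))
    .toDF("text")

  pipeline.fit(data).transform(data).select("embeddings.embeddings").show()
}
```

Because the annotator is just another pipeline stage, swapping the inference backend beneath it (TensorFlow, ONNX Runtime, or OpenVINO) leaves this user-facing code unchanged.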
This project aims to enhance the capabilities of Spark NLP by adding support for OpenVINO Runtime, providing significant out-of-the-box improvements for large language models such as BERT, especially on CPU and integrated GPU-based systems, and extending support to additional model formats such as ONNX, PaddlePaddle, TensorFlow, TensorFlow Lite, and OpenVINO IR. Combined with the further optimization and quantization capabilities offered by the OpenVINO Toolkit ecosystem when exporting models, OpenVINO Runtime will serve as a unified, high-performance inference engine capable of delivering accelerated inference for NLP pipelines on a variety of Intel hardware platforms. Furthermore, exposing the OpenVINO API bindings in Java will open up avenues for a large community of Java developers to benefit from OpenVINO's rich feature set as an inference and deployment solution for JVM-based projects in the future.
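As a rough illustration of what the exposed bindings enable on the JVM, the sketch below (written in Scala but calling the Java API) loads and runs a model through OpenVINO Runtime. It is an assumption-laden sketch, not code from the PRs: it assumes the `org.intel.openvino` package with snake_case methods mirroring the C++ API 2.0 (`read_model`, `compile_model`, `create_infer_request`, ...), and the model path, input shape, and tensor layout are placeholders.

```scala
import org.intel.openvino._

object OpenVinoSketch extends App {
  // Assumes the OpenVINO native libraries are discoverable at runtime
  // (e.g. via java.library.path).
  val core = new Core()

  // read_model accepts OpenVINO IR as well as other formats supported by the
  // Runtime frontends (ONNX, TensorFlow, PaddlePaddle, ...).
  val model: Model = core.read_model("model.xml")

  // Compile for a target device, e.g. "CPU" or "GPU" (integrated graphics).
  val compiled: CompiledModel = core.compile_model(model, "CPU")
  val request: InferRequest = compiled.create_infer_request()

  // Hypothetical single input of shape [1, 128]; real NLP models typically
  // take several inputs (token ids, attention mask, ...).
  val input = new Tensor(Array(1, 128), new Array[Float](128))
  request.set_input_tensor(input)
  request.infer()

  val output: Tensor = request.get_output_tensor()
  println(s"Output shape: ${output.get_shape().mkString("x")}")
}
```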
- Add the required JNI bindings to the OpenVINO Java module
- Enable Spark NLP to import and run models through the OpenVINO Runtime API (see the sketch after this list)
- Benchmark models running on the new OpenVINO backend
- Provide sample scripts demonstrating the usage of this feature
- Provide sample notebooks demonstrating how to export and prepare models
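The sketch below gives a flavour of the second goal: importing an externally exported model into Spark NLP so that it runs on the OpenVINO backend. It is illustrative rather than the exact API landed in the PRs; it assumes the integration reuses Spark NLP's existing `loadSavedModel` import flow, and the export directory path is hypothetical.

```scala
import com.johnsnowlabs.nlp.embeddings.BertEmbeddings
import org.apache.spark.sql.SparkSession

object ImportOpenVinoModel extends App {
  val spark = SparkSession.builder()
    .appName("spark-nlp-openvino-import")
    .master("local[*]")
    .getOrCreate()

  // Hypothetical directory containing a BERT model exported to OpenVINO IR
  // (e.g. by the export notebooks listed below).
  val exportPath = "/tmp/bert-base-cased-openvino"

  // loadSavedModel is Spark NLP's existing entry point for importing external
  // models; with the OpenVINO backend, the exported model is executed through
  // OpenVINO Runtime instead of TensorFlow or ONNX Runtime.
  val embeddings = BertEmbeddings
    .loadSavedModel(exportPath, spark)
    .setInputCols("document", "token")
    .setOutputCol("embeddings")

  // The imported annotator can be saved and reused like any other Spark NLP model.
  embeddings.write.overwrite().save("/tmp/bert_openvino_spark_nlp")
}
```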
PR Link | Description |
---|---|
PR #668 | Reorganize project structure and improve documentation |
PR #709 | Add Java API bindings |
PR #13947 | Integrate OpenVINO Runtime in Spark NLP |
Spark NLP Annotator | Notebook | Sample |
---|---|---|
BertEmbeddings | Export BERT HuggingFace | |
RoBertaEmbeddings | Export RoBerta HuggingFace | |
XlmRoBertaEmbeddings | Export XLM RoBerta HuggingFace | |
- The Need for Speed: Accelerating NLP Inferencing in Spark NLP with OpenVINO™ Runtime
- Deep Learning Inference in Java with OpenVINO™ Runtime
- Spark NLP-OpenVINO Integration Architecture
- OpenVINO Java Setup (Linux)
- OpenVINO Java Setup (Windows)
- Spark NLP Dev Setup
- Build Spark NLP Jar
- Spark NLP-OpenVINO Dockerfile
- Benchmarks