
Commit 15f6714

Agentic kit: Fix issues, improve performance, and add shopping cart feature (#212)
* Moved files, updated readme
* Moved code to main function
* Agentic LLM RAG: Fix issues, improve performance, and add shopping cart feature
* Removing config files
* Adding missing requirements.txt file
* Changing app.py to main.py to comply with GitHub Actions
* remove personality arg
* change default model for convert and optimize
* Improving documentation and public arg in main.py
* Address PR feedback
* fix hf_token arg
* Adding main.py wrapper for CI/CDs
* Rename app.py
* Adding device arg
* Add model mapping for bge models
* Update args in main.py
* fixing main.py
* Set AUTO:GPU,CPU to main.py
* Update README
* Update gif in README and add queue() to Gradio demo
* test llama 3B for CICDs
* update image

Signed-off-by: Antonio Martinez <jose.antonio.martinez.torres@intel.com>
Co-authored-by: Adrian Boguszewski <adrian.boguszewski@intel.com>
1 parent ef92f4b commit 15f6714

11 files changed (+1332, −1 lines)

ai_ref_kits/README.md (+14, −1)
@@ -15,6 +15,7 @@
 - [🔦 Explainable AI](#-explainable-ai)
 - [🖼️ Multimodal AI Visual Generator](#%EF%B8%8F-multimodal-ai-visual-generator)
 - [💬 Conversational AI Chatbot](#-conversational-ai-chatbot)
+- [🛒 AI Insight Agent with RAG](#-ai-insight-agent-with-rag)
 
 - [Troubleshooting and Resources](#troubleshooting-and-resources)

@@ -115,7 +116,19 @@ An in-depth demo of how the Multimodal AI Visual Generator Kit creates a real-ti
 | Example industries | Tourism |
 | Demo | |
 
-The Conversational AI Chatbot is an open-source, voice-driven chat agent that answers spoken questions with meaningful, spoken responses. It can be configured to respond in any type of scenario or context. This kit demonstrates the AI Chatbot’s capabilities by simulating the experience of talking to a hotel concierge.
+The Conversational AI Chatbot is an open-source, voice-driven chat agent that answers spoken questions with meaningful, spoken responses. It can be configured to respond in any type of scenario or context.
+This kit demonstrates the AI Chatbot’s capabilities by simulating the experience of talking to a hotel concierge.
+
+### 🛒 AI Insight Agent with RAG
+[![agentic_llm_rag](https://github.com/user-attachments/assets/0471ab91-ded5-4a5f-8d8e-5432f1b4b45c)](agentic_llm_rag)
+
+| [AI Insight Agent with RAG](agentic_llm_rag) | |
+|----------------------------------------------|--|
+| Related AI concepts | Natural Language Understanding, Large Language Models (LLMs), Retrieval Augmented Generation (RAG), Agentic AI, Generative AI |
+| Example industries | Retail |
+| Demo | |
+
+The AI Insight Agent with RAG uses Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) to interpret user prompts, engage in meaningful dialogue, perform calculations, improve its knowledge with RAG techniques, and interact with the user to add items to a virtual shopping cart.
 
 ## Troubleshooting and Resources
 - Open a [discussion topic](https://github.com/openvinotoolkit/openvino_build_deploy/discussions)

ai_ref_kits/agentic_llm_rag/README.md (+203)
@@ -0,0 +1,203 @@
<div id="top" align="center">
  <h1>AI Insight Agent with RAG</h1>
  <h4>
    <a href="https://www.intel.com/content/www/us/en/developer/topic-technology/edge-5g/open-potential.html">🏠&nbsp;About&nbsp;the&nbsp;Kits&nbsp;</a>
    <!-- <a href="">👨‍💻&nbsp;Code&nbsp;Demo&nbsp;Video</a> -->
  </h4>
</div>

[![Apache License Version 2.0](https://img.shields.io/badge/license-Apache_2.0-green.svg)](https://github.com/openvinotoolkit/openvino_build_deploy/blob/master/LICENSE.txt)

<p align="center">
  <img src="https://github.com/user-attachments/assets/3dedf848-cc4a-4b1c-b83e-dad29e3e1657" width="500">
</p>

The AI Insight Agent with RAG uses Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) to interpret user prompts, engage in meaningful dialogue, perform calculations, improve its knowledge with RAG techniques, and interact with the user to add items to a virtual shopping cart. This solution uses the OpenVINO™ toolkit to power the AI models at the edge. Designed for both consumers and employees, it functions as a smart, personalized retail assistant, offering an interactive and user-friendly experience similar to an advanced digital kiosk.
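Conceptually, the RAG loop retrieves the knowledge-base entry most similar to the user's question and feeds it to the LLM as extra context. The following is a minimal, library-free sketch of that idea; the toy bag-of-words `embed` function and the sample documents are illustrative stand-ins, not the kit's actual code or embedding model:

```python
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model such as bge-large-en-v1.5:
    # a bag-of-words vector keyed by lowercase tokens.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str]) -> str:
    # Return the knowledge-base entry most similar to the query.
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

docs = [
    "Interior paint: one gallon covers about 400 square feet.",
    "Brushes: synthetic bristles work best with latex paint.",
]
context = retrieve("how much area does a gallon of paint cover", docs)
# The retrieved passage is prepended to the prompt sent to the LLM.
prompt = f"Context: {context}\nQuestion: how much area does a gallon of paint cover?"
```

A real deployment replaces `embed` with a neural embedding model and the linear scan with a vector index, but the retrieve-then-prompt structure is the same.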

This kit uses the following technology stack:
- [OpenVINO Toolkit](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/overview.html) ([docs](https://docs.openvino.ai/))
- [Qwen2-7B-Instruct](https://huggingface.co/Qwen/Qwen2-7B-Instruct)
- [bge-large-en-v1.5](https://huggingface.co/BAAI/bge-large-en-v1.5)
- [Gradio interface](https://www.gradio.app/docs/gradio/chatinterface)

Check out our [AI Reference Kits repository](/) for other kits.

![agentic_llm_rag](https://github.com/user-attachments/assets/0471ab91-ded5-4a5f-8d8e-5432f1b4b45c)

<details open><summary><b>Table of Contents</b></summary>

- [Getting Started](#getting-started)
  - [Installing Prerequisites](#installing-prerequisites)
  - [Setting Up Your Environment](#setting-up-your-environment)
- [Converting and Optimizing the Model](#converting-and-optimizing-the-model)
- [Running the Application](#running-the-application-gradio-interface)
- [Additional Resources](#additional-resources)

</details>

# Getting Started

To get started with the AI Insight Agent with RAG, you install the prerequisites, set up your environment, and then run the application. We recommend using Ubuntu 24.04 to set up and run this project.

## Installing Prerequisites

This project requires Python 3.8 or higher and a few libraries. If you don't already have Python installed on your machine, go to [https://www.python.org/downloads/](https://www.python.org/downloads/) and download the latest version for your operating system. Follow the prompts to install Python, and make sure to select the option to add Python to your PATH environment variable.

To install the system libraries and tools, run this command:

```shell
sudo apt install git gcc python3-venv python3-dev
```

_NOTE: If you are using Windows, you might also have to install [Microsoft Visual C++ Redistributable](https://aka.ms/vs/16/release/vc_redist.x64.exe)._

## Setting Up Your Environment

To set up your environment, you first clone the repository, then create a virtual environment, activate the environment, and install the packages.

### Clone the Repository

To clone the repository, run this command:

```shell
git clone https://github.com/openvinotoolkit/openvino_build_deploy.git
```

This command clones the repository into a directory named "openvino_build_deploy" in the current directory. After the directory is cloned, run the following command to go to that directory:

```shell
cd openvino_build_deploy/ai_ref_kits/agentic_llm_rag
```

### Create a Virtual Environment

To create a virtual environment, open your terminal or command prompt, and go to the directory where you want to create the environment.

Run the following command:

```shell
python3 -m venv venv
```

This creates a new virtual environment named "venv" in the current directory.

### Activate the Environment

The command you run to activate the virtual environment you created depends on whether you have a Unix-based operating system (Linux or macOS) or a Windows operating system.

To activate the virtual environment for a **Unix-based** operating system, run:

```shell
source venv/bin/activate  # for Unix-based operating systems such as Linux or macOS
```

To activate the virtual environment for a **Windows** operating system, run:

```shell
venv\Scripts\activate  # for Windows operating systems
```

This activates the virtual environment and changes your shell's prompt to indicate that you are now working in that environment.

### Install the Packages

To install the required packages, run the following commands:

```shell
python -m pip install --upgrade pip
pip install -r requirements.txt
```

## Converting and Optimizing the Model

The application uses two separate models. Each model requires conversion and optimization for use with OpenVINO™. The following process includes a step to convert and optimize each model.

_NOTE: This reference kit downloads more than 8 GB of model data, so make sure you have sufficient bandwidth and disk space. Because of the large model size, when you run the kit for the first time, the conversion can take more than two hours and require more than 32 GB of memory. After the first run, the subsequent runs should finish much faster._

### Chat Model and Embedding Model Conversion

The _chat model_ is the core of the chatbot's ability to generate meaningful and context-aware responses.

The _embedding model_ represents text data (both user queries and potential responses or knowledge base entries) as numerical vectors. These vectors are essential for tasks such as semantic search and similarity matching.

The conversion script handles the conversion and optimization of:

- The chat model (`qwen2-7B`) with `int4` precision.
- The embedding model (`bge-large`) with `FP32` precision.
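To illustrate what `int4` precision means, here is a toy sketch of symmetric 4-bit weight quantization. This is not the algorithm OpenVINO or NNCF actually uses; it only shows the core idea of mapping each float weight to one of 16 integer levels, trading a small accuracy loss for a large reduction in model size:

```python
# Toy illustration of 4-bit weight quantization (NOT the OpenVINO/NNCF
# implementation). Each weight is mapped to an integer in -8..7 plus a
# shared scale factor used for dequantization.

def quantize_int4(weights: list[float]) -> tuple[list[int], float]:
    # Symmetric quantization: pick the scale from the largest magnitude.
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    return [v * scale for v in q]

weights = [0.61, -0.35, 0.02, 0.98]
q, scale = quantize_int4(weights)
restored = dequantize(q, scale)

# Each restored weight is close to, but not exactly, the original value.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

A 4-bit integer needs one eighth of the storage of a 32-bit float, which is why the 7B chat model is converted to `int4` while the much smaller embedding model can stay at `FP32`.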

After the models are converted, they’re saved to the model directory you specify when you run the script.

_The conversion can take up to one hour to complete._

To convert the chat and embedding models, run:

```shell
python convert_and_optimize_llm.py --chat_model_type qwen2-7B --embedding_model_type bge-large --precision int4 --model_dir model
```

If you are using gated models from Hugging Face, pass the `--hf_token` argument with your Hugging Face token. Remember to request access to gated models if needed.

After you run the conversion script, you can run `main.py` to launch the application.

## Running the Application (Gradio Interface)

To run the AI Insight Agent with RAG application, execute the `main.py` Python script.

_NOTE: This application requires more than 16 GB of memory because the models are very large (especially the chatbot model). If you have a less powerful device, the application might also run slowly._

You can run the application with default values:

```shell
python main.py
```

To change the settings, use the following arguments:

- `--chat_model`: The path to your chat model directory (for example, `model/qwen2-7B-INT4`) that drives conversation flow and response generation.
- `--rag_pdf`: The path to the document (for example, `data/test_painting_llm_rag.pdf`) that contains additional knowledge for Retrieval-Augmented Generation (RAG).
- `--embedding_model`: The path to your embedding model directory (for example, `model/bge-small-FP32`) for understanding and matching text inputs.
- `--device`: The inference device for both models (for example, `CPU`). If you have access to a dedicated GPU (Arc, Flex), you can change the value to `GPU.1`. Possible values: `CPU`, `GPU`, `GPU.1`, `NPU`.
- `--public`: Include this flag to make the Gradio interface publicly accessible over the network. Without this flag, the interface will only be available on your local machine.
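For illustration, flags like these are typically wired up with Python's `argparse` module roughly as follows. This is a hypothetical sketch, not the kit's actual `main.py`; the defaults shown are assumptions taken from the examples above:

```python
import argparse

# Hypothetical sketch of how main.py's command-line flags could be defined.
# Path defaults are assumptions based on the README examples above.
parser = argparse.ArgumentParser(description="AI Insight Agent with RAG")
parser.add_argument("--chat_model", default="model/qwen2-7B-INT4",
                    help="path to the chat model directory")
parser.add_argument("--embedding_model", default="model/bge-small-FP32",
                    help="path to the embedding model directory")
parser.add_argument("--rag_pdf", default="data/test_painting_llm_rag.pdf",
                    help="document providing extra knowledge for RAG")
parser.add_argument("--device", default="AUTO:GPU,CPU",
                    help="inference device for both models, e.g. CPU or GPU.1")
# Boolean switches such as --public use store_true: absent means False.
parser.add_argument("--public", action="store_true",
                    help="expose the Gradio interface publicly")

# Parsing an explicit argument list for demonstration:
args = parser.parse_args(["--device", "CPU", "--public"])
```

Flags not supplied on the command line fall back to their defaults, which is why `python main.py` with no arguments works after conversion.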

For example, to run the application with all arguments specified, use this command:

```shell
python main.py \
  --chat_model model/qwen2-7B-INT4 \
  --embedding_model model/bge-small-FP32 \
  --rag_pdf data/test_painting_llm_rag.pdf \
  --device GPU.1 \
  --public
```

### System Prompt Usage in LlamaIndex ReActAgent

The LlamaIndex ReActAgent library relies on a default system prompt that provides essential instructions to the LLM for correctly interacting with available tools. This prompt is fundamental for enabling both tool usage and RAG (Retrieval-Augmented Generation) queries.

#### Important

Do not override or modify the default system prompt. Altering it may prevent the LLM from using the tools or executing RAG queries properly.

#### Customizing the Prompt

If you need to add extra rules or custom behavior, modify the "Additional Rules" section located in the `system_prompt.py` file.
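The safe customization pattern described above can be sketched as follows. The variable and function names here are assumptions for illustration only, not the actual contents of `system_prompt.py`:

```python
# Hypothetical sketch: keep the library's default ReAct system prompt intact
# and only append extra rules, rather than replacing the prompt wholesale.
DEFAULT_REACT_PROMPT = (
    "You are an assistant with access to tools. "
    "Use a tool when it helps answer the question."
)

ADDITIONAL_RULES = """
Additional Rules:
- Always quote prices in USD.
- Confirm with the user before adding items to the cart.
"""

def build_system_prompt(extra_rules: str = ADDITIONAL_RULES) -> str:
    # Appending preserves the tool-use and RAG instructions the agent
    # depends on; overwriting them is what breaks tool calls.
    return DEFAULT_REACT_PROMPT + "\n" + extra_rules

prompt = build_system_prompt()
```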

### Use the Web Interface

After the script runs, Gradio provides a local URL (typically `http://127.0.0.1:XXXX`) that you can open in your web browser to interact with the assistant. If you configured the application to be accessible publicly, Gradio also provides a public URL.

#### Test the Application

When you test the AI Insight Agent with RAG application, you can test both the interaction with the agent and the product selection capabilities.

1. Open a web browser and go to the Gradio-provided URL.
   _For example, `http://127.0.0.1:XXXX`._
2. Test text interaction with the application.
   - Type your question in the text box and press **Enter**.
   _The assistant responds to your question in text form._

For further testing of the AI Insight Agent with RAG application, you can engage with the chatbot assistant by asking it questions or giving it commands that align with the assistant's capabilities. This hands-on experience can help you to understand the assistant's interactive quality and performance.

Enjoy exploring the capabilities of your AI Insight Agent with RAG application!

# Additional Resources

- Learn more about [OpenVINO](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/overview.html)
- Explore [OpenVINO’s documentation](https://docs.openvino.ai/2024/home.html)

<p align="right"><a href="#top">Back to top ⬆️</a></p>
