Retrieval-Augmented Generation (RAG) Setup Guide

Overview

The Retrieval-Augmented Generation (RAG) feature helps improve the responses from your application by combining the power of large language models (LLMs) with extra context retrieved from an external data source. Simply put, when you ask a question, the application first searches through a set of relevant documents (stored as embeddings) and then uses this context to provide a more accurate and relevant response. If no relevant context is found, the application returns the LLM response directly. This RAG feature is optional and is disabled by default. If you prefer to use it, simply set the environment variable USE_AZURE_AI_SEARCH_SERVICE to true. Doing so will also trigger the deployment of Azure AI Search resources.

How does RAG work in this application?

In our provided example, the application includes a sample dataset containing information about hiking products. This data was split into sentences, and each sentence was transformed into numerical representations called embeddings. These embeddings were created using OpenAI's text-embedding-3-small model with dimensions=100. The resulting embeddings file (embeddings.csv) is located in the api/data folder.

When you ask a question, the application:

Searches these embeddings for information relevant to your query.
Identifies relevant context if available.
Combines the retrieved context with the LLM to provide a better answer.

If you want to use your own dataset

To create a custom embeddings file with your own data, you can use the provided helper class SearchIndexManager. Below is a straightforward way to build your own embeddings:

from .api.search_index_manager import SearchIndexManager

search_index_manager = SearchIndexManager (
    endpoint=self.search_endpoint,
    credential=your_credentials,
    index_name=index_name,
    dimensions=100,
    model="text-embedding-3-small",
    embeddings_client=embedding_client,
)
await search_index_manager.build_embeddings_file(
    input_directory='data',
    output_file='data/embeddings.csv'
    sentences_per_embedding=4
)

Make sure to replace your_search_endpoint, your_credentials, your_index_name, and embedding_client with your own Azure service details.
Your input data should be placed in the folder specified by input_directory.
sentences_per_embedding parameter, specifies the number of sentences used to construct the embedding. The larger this number, the broader the context that will be identified during the similarity search.

Deploying the Application with RAG enabled

To deploy your application using the RAG feature, set the following environment variables locally: In power shell:

$env:USE_SEARCH_SERVICE="true"
$env:AZURE_AI_SEARCH_INDEX_NAME="index_sample"
$env:AZURE_AI_EMBED_DEPLOYMENT_NAME="text-embedding-3-small"

In bash:

export USE_SEARCH_SERVICE="true"
export AZURE_AI_SEARCH_INDEX_NAME="index_sample"
export AZURE_AI_EMBED_DEPLOYMENT_NAME="text-embedding-3-small"

In cmd:

set USE_SEARCH_SERVICE=true
set AZURE_AI_SEARCH_INDEX_NAME=index_sample
set AZURE_AI_EMBED_DEPLOYMENT_NAME=text-embedding-3-small

USE_SEARCH_SERVICE: Enables (default) or disables RAG.
AZURE_AI_SEARCH_INDEX_NAME: The Azure Search Index the application will use.
AZURE_AI_EMBED_DEPLOYMENT_NAME: The Azure embedding deployment used to create embeddings.

Note: If either AZURE_AI_SEARCH_INDEX_NAME or AZURE_AI_EMBED_DEPLOYMENT_NAME is not provided, or the Azure AI Search service connection is unavailable, the application will run without using the RAG feature.

Creating the Azure Search Index

To utilize RAG, you must have an Azure search index. By default, the application uses index_sample as the index name. You can create an index either by following these official Azure instructions, or programmatically with the provided helper methods:

# Create Azure Search Index (if it does not yet exist)
search_index_manager.ensure_index_created(vector_index_dimensions)

# Upload embeddings to the index
search_index_manager.upload_documents(embeddings_path)

Important: If you have already created the index before deploying your application, the system will skip this step and directly use your existing Azure Search Index. The parameter vector_index_dimensions is only required if dimension information was not already provided when initially constructing the SearchIndexManager object.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Files

RAG.md

RAG.md

Retrieval-Augmented Generation (RAG) Setup Guide

Overview

How does RAG work in this application?

If you want to use your own dataset

Deploying the Application with RAG enabled

Creating the Azure Search Index

Files

RAG.md

Latest commit

History

RAG.md

File metadata and controls

Retrieval-Augmented Generation (RAG) Setup Guide

Overview

How does RAG work in this application?

If you want to use your own dataset

Deploying the Application with RAG enabled

Creating the Azure Search Index