generic-RAG-demo
A generic Retrieval Augmented Generation (RAG) demo from Sogeti Netherlands built in Python. This project demonstrates how to integrate and run different backends, from cloud providers to local models, to parse and process your PDFs, web data, or other text sources.
Features
- Multi-backend Support: Easily switch between cloud-based and local LLMs.
- Flexible Data Input: Supports both PDFs and web data ingestion.
- Configurable Workflows: Customize settings via a central config.yaml file.
Getting started
Project Environment Setup
This project leverages a modern packaging method defined in pyproject.toml. After cloning the repository, you can install the project along with its dependencies. You have two options:
- Using uv
If you're using uv, simply run:
uv sync
- Using a Python Virtual Environment
Alternatively, set up a virtual environment and install the project:
python -m venv .venv # Create a new virtual environment named ".venv"
source .venv/bin/activate # Activate the virtual environment (use ".venv\Scripts\activate" on Windows)
pip install . # Install the project and its dependencies
Installation of system dependencies
Some optional features require additional system applications to be installed.
Unstructured PDF loader (optional)
To run the application with the unstructured PDF loader (the pdf.unstructured setting), install these two system dependencies:
sudo apt install poppler-utils tesseract-ocr
For more information, please refer to the LangChain documentation.
Local LLM (optional)
If you would like to run the application using a local LLM backend (local settings), you need to install Ollama.
curl -fsSL https://ollama.com/install.sh | sh # install Ollama
ollama pull llama3.1:8b # download the model
Include the downloaded model in the config.yaml file:
local:
chat_model: "llama3.1:8b"
emb_model: "llama3.1:8b"
For more information on installing Ollama, please refer to the Langchain Local LLM documentation, specifically the Quickstart section.
Running generic RAG demo
Note that because the app parses its command-line options with argparse, it cannot be launched the way the Chainlit documentation recommends:
chainlit run generic_rag/app.py # will not work
Instead, the app can be launched and debugged the usual way.
python generic_rag/app.py -p data # will work and parses all PDF files in ./data
python generic_rag/app.py --help # will work and prints command line options
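Because the options are read with argparse, the app has to be started as a regular Python script. The sketch below shows what such an entry point could look like. Only the -p flag and --help are documented above, so the long option name --pdf-data and the helpers build_parser and collect_pdfs are assumptions made for illustration, not the project's actual code.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only "-p" is documented; "--pdf-data" is an assumed long name.
    parser = argparse.ArgumentParser(description="Generic RAG demo (illustrative sketch)")
    parser.add_argument(
        "-p",
        "--pdf-data",
        type=Path,
        nargs="+",
        default=[],
        help="PDF files or directories to parse",
    )
    return parser


def collect_pdfs(paths: list[Path]) -> list[Path]:
    # Expand directories into the PDF files they contain; keep explicit files as-is.
    pdfs: list[Path] = []
    for path in paths:
        if path.is_dir():
            pdfs.extend(sorted(path.glob("*.pdf")))
        else:
            pdfs.append(path)
    return pdfs
```

With this sketch, build_parser().parse_args(["-p", "data"]) yields a namespace whose pdf_data attribute holds the path data, which collect_pdfs then expands into the PDF files inside that directory.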
Please configure your config.yaml and .env file with your cloud provider (backend) of choice. See the sections below for more details.
config.yaml file
A config.yaml file is required to specify your API endpoints and local backends. Use the provided config.yaml.example as a starting point. Update the file according to your backend settings and project requirements.
Key configuration points include:
- Chat Backend: Choose among azure, openai, google_vertex, aws, or local.
- Embedding Backend: Configure the embedding models similarly.
- Data Processing Settings: Define PDF and web data sources, chunk sizes, and overlap.
- Vector Database: Customize the path and reset behavior.
For more information on configuring LangChain endpoints and models, please see the LangChain documentation. For local models, Ollama is currently used.
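To make the chunk size and overlap settings concrete, here is a small sketch of character-based chunking with overlap. It is not the project's splitter, which this README does not describe. The function name chunk_text and its parameter names are assumptions chosen for illustration.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

For example, chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1) returns ["abcd", "defg", "ghij"]: each chunk starts with the last character of the previous one. A larger overlap keeps more context across chunk boundaries at the cost of more chunks to embed.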
.env file
Set the API keys for your chosen cloud provider (backend). This ensures that your application can authenticate and interact with the services.
AZURE_OPENAI_API_KEY=your_azure_api_key
OPENAI_API_KEY=your_openai_api_key
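A quick startup check can catch a missing key before the first request fails. The sketch below covers only the two variables shown above and assumes the .env values have already been loaded into the process environment. The helper name require_api_key is hypothetical, and variable names for the other backends are deliberately left out because they are not listed here.

```python
import os

# Only the variables shown in the .env example; other backends are not covered.
API_KEY_VARS = {
    "azure": "AZURE_OPENAI_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def require_api_key(backend: str, env=None) -> str:
    """Return the API key for the given backend or raise a descriptive error."""
    env = os.environ if env is None else env
    try:
        var_name = API_KEY_VARS[backend]
    except KeyError:
        raise ValueError(f"No API key variable known for backend {backend!r}") from None
    key = env.get(var_name, "")
    if not key:
        raise RuntimeError(f"{var_name} is not set; add it to your .env file")
    return key
```

Passing a plain dictionary as env makes the check easy to try without touching the real environment.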
Chainlit starters
Chainlit suggestions (starters) can be set with the CHAINLIT_STARTERS environment variable.
The variable should be a JSON array of objects with label and message properties.
For example:
CHAINLIT_STARTERS=[{"label":"Label 1","message":"Message one."},{"label":"Label 2","message":"Message two."},{"label":"Label 3","message":"Message three."}]
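Since the value has to be valid JSON, it can help to validate it before launching the app. The following is a minimal sketch using only the standard library. The helper parse_starters is hypothetical and is not part of Chainlit or of this project.

```python
import json


def parse_starters(raw: str) -> list[dict]:
    """Parse a CHAINLIT_STARTERS value and check the shape of each entry."""
    starters = json.loads(raw)  # raises a ValueError subclass on malformed JSON
    if not isinstance(starters, list):
        raise ValueError("CHAINLIT_STARTERS must be a JSON array")
    for index, item in enumerate(starters):
        if not isinstance(item, dict):
            raise ValueError(f"Entry {index} is not a JSON object")
        for field in ("label", "message"):
            if not isinstance(item.get(field), str):
                raise ValueError(f"Entry {index} needs a string {field!r}")
    return starters
```

Calling parse_starters(os.environ["CHAINLIT_STARTERS"]) after importing os would raise a ValueError that names the offending entry instead of failing later in the UI.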
Dev details
Linting
Ruff is currently used as the Python linter. It is listed in pyproject.toml as a dev dependency in case your IDE needs it. For VS Code, a Ruff extension is also available.