Ruben Lucas 74dd3b6947 🎨 Add .env only for API keys

2025-04-18 11:42:40 +02:00

5.6 KiB

generic-RAG-demo

A generic Retrieval Augmented Generation (RAG) demo from Sogeti Netherlands built in Python. This project demonstrates how to integrate and run different backends, from cloud providers to local models, to parse and process your PDFs, web data, or other text sources.

Features

Multi-backend Support: Easily switch between cloud-based and local LLMs.
Flexible Data Input: Supports both PDFs and web data ingestion.
Configurable Workflows: Customize settings via a central config.yaml file.

Getting started

Project Environment Setup

This project leverages a modern packaging method defined in pyproject.toml. After cloning the repository, you can install the project along with its dependencies. You have two options:

Using uv

If you're using uv, simply run:

uv sync

Using a Python Virtual Environment

Alternatively, set up a virtual environment and install the project:

python -m venv .venv        # Create a new virtual environment named ".venv"
source .venv/bin/activate   # Activate the virtual environment (use ".venv\Scripts\activate" on Windows)
pip install .              # Install the project and its dependencies

Installation of system dependencies

Some optional features require additional system applications to be installed.

Unstructured PDF loader (optional)

If you would like to run the application using the unstructured PDF loader (pdf.unstructured setting), you need to install two system dependencies:

sudo apt install poppler-utils tesseract-ocr

For more information, please refer to the LangChain documentation.
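
Before enabling the unstructured loader, it can help to confirm that the system tools are actually on your PATH. The following is a minimal sketch, not part of the project: it assumes that poppler-utils provides the pdfinfo and pdftoppm executables and that tesseract-ocr provides tesseract, and the helper name missing_tools is hypothetical.

```python
import shutil

# Executables assumed to come from the two apt packages:
# poppler-utils -> pdfinfo, pdftoppm; tesseract-ocr -> tesseract.
REQUIRED_TOOLS = ("pdfinfo", "pdftoppm", "tesseract")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = missing_tools()
if missing:
    print("Missing system tools:", ", ".join(missing))
else:
    print("All system tools for the unstructured PDF loader were found.")
```

If anything is reported missing, install the packages above and run the check again.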

Local LLM (optional)

If you would like to run the application using a local LLM backend (local settings), you need to install Ollama.

curl -fsSL https://ollama.com/install.sh | sh  # install Ollama
ollama pull llama3.1:8b  # download the model

Include the downloaded model in the config.yaml file:

local:
    chat_model: "llama3.1:8b"
    emb_model: "llama3.1:8b"

For more information on installing Ollama, please refer to the LangChain Local LLM documentation, specifically the Quickstart section.

Running generic RAG demo

Note: because the app parses its own command-line options with argparse, it cannot be launched the way the Chainlit documentation recommends:

chainlit run generic_rag/app.py  # will not work

Instead, launch (and debug) the app like any other Python script:

python generic_rag/app.py -p data  # will work and parses all PDF files in ./data
python generic_rag/app.py --help  # will work and prints command line options
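
To illustrate the argparse-based entry point, here is a minimal sketch. It is an assumption about the general shape, not the project's actual code: only the -p flag comes from this README, while the long name --pdf-dir and the helpers build_parser and find_pdfs are hypothetical. Whether the real app also searches subdirectories is not stated here, so the sketch only looks at files directly inside the directory.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a command-line parser with a single PDF directory option."""
    parser = argparse.ArgumentParser(description="Generic RAG demo (illustrative sketch)")
    # "-p" is taken from the README; "--pdf-dir" is a hypothetical long name.
    parser.add_argument(
        "-p",
        "--pdf-dir",
        type=Path,
        default=None,
        help="directory whose PDF files should be parsed",
    )
    return parser


def find_pdfs(directory):
    """Return the PDF files directly inside `directory`, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["-p", "data"])
print(args.pdf_dir)
```

Run python generic_rag/app.py --help to see the options the real application accepts.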

Please configure your config.yaml and .env file with your cloud provider (backend) of choice. See the sections below for more details.

config.yaml file

A config.yaml file is required to specify your API endpoints and local backends. Use the provided config.yaml.example as a starting point. Update the file according to your backend settings and project requirements.

Key configuration points include:

Chat Backend: Choose among azure, openai, google_vertex, aws, or local.
Embedding Backend: Configure the embedding models similarly.
Data Processing Settings: Define PDF and web data sources, chunk sizes, and overlap.
Vector Database: Customize the path and reset behavior.
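
To make the chunk size and overlap settings more concrete, the sketch below splits a string into fixed-size character windows that share some characters with their neighbours. It is a simplified illustration only: the function name and the chunk_size and chunk_overlap parameter names are assumptions, and real text splitters (including those in LangChain) usually also try to break on separators such as paragraphs or sentences.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split `text` into windows of `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

A larger overlap keeps more context across chunk boundaries at the cost of storing and embedding more chunks.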

For more information on configuring LangChain endpoints and models, please see:

For local models we currently use Ollama.

.env file

Set the API keys for your chosen cloud provider (backend). This ensures that your application can authenticate and interact with the services.

AZURE_OPENAI_API_KEY=your_azure_api_key
OPENAI_API_KEY=your_openai_api_key
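
For readers unfamiliar with .env files, the sketch below shows roughly how such KEY=value lines end up as environment variables. It uses only the standard library and is an assumption for illustration: the project may well rely on a dedicated library instead, and the helper names parse_env_lines and load_env_file are hypothetical.

```python
import os


def parse_env_lines(lines):
    """Parse KEY=value lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_env_file(path=".env", environ=os.environ):
    """Copy variables from `path` into `environ`, keeping existing values."""
    with open(path, encoding="utf-8") as handle:
        for key, value in parse_env_lines(handle).items():
            environ.setdefault(key, value)


example = parse_env_lines([
    "# API keys",
    "AZURE_OPENAI_API_KEY=your_azure_api_key",
    "OPENAI_API_KEY=your_openai_api_key",
])
print(sorted(example))
```

Keep the .env file out of version control, since it contains secrets.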

Chainlit starters

Chainlit suggestions (starters) can be set with the CHAINLIT_STARTERS environment variable. The variable should be a JSON array of objects with label and message properties. For example:

CHAINLIT_STARTERS=[{"label":"Label 1","message":"Message one."},{"label":"Label 2","message":"Message two."},{"label":"Label 3","message":"Message three."}]
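
The expected format can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the function name load_starters is hypothetical, and how the application hands the resulting entries to Chainlit is not shown here.

```python
import json
import os


def load_starters(environ=os.environ):
    """Read CHAINLIT_STARTERS and return a list of label/message dicts."""
    raw = environ.get("CHAINLIT_STARTERS", "")
    if not raw.strip():
        return []  # variable unset or empty: no starters

    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("CHAINLIT_STARTERS must be a JSON array")

    starters = []
    for item in data:
        if not isinstance(item, dict) or "label" not in item or "message" not in item:
            raise ValueError("each starter needs 'label' and 'message' properties")
        starters.append({"label": str(item["label"]), "message": str(item["message"])})
    return starters


# Use a plain dict instead of the real environment for the demonstration.
demo_env = {"CHAINLIT_STARTERS": '[{"label":"Label 1","message":"Message one."}]'}
print(load_starters(demo_env))
```

Invalid JSON raises json.JSONDecodeError, which makes a malformed variable easy to spot at startup.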

Dev details

Linting

Ruff is currently used as the Python linter. It is listed as a dev dependency in pyproject.toml in case your IDE needs it; for VS Code, a Ruff extension is also available.
