generic-RAG-demo
A Sogeti Nederland generic RAG demo
Getting started
Installation of system dependencies
Unstructured PDF loader (optional)
If you would like to run the application with the unstructured PDF loader, two system dependencies are required. Install them with:
sudo apt install poppler-utils tesseract-ocr
Then run the generic RAG demo with the --unstructured-pdf flag.
For more information, please refer to the LangChain docs.
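For context, this flag switches the app to LangChain's unstructured-based PDF loader. A minimal sketch of using that loader directly, assuming the langchain-community and unstructured packages are installed (the file path is hypothetical):
from langchain_community.document_loaders import UnstructuredPDFLoader

loader = UnstructuredPDFLoader("data/example.pdf")  # hypothetical PDF path
docs = loader.load()  # returns a list of LangChain Document objects
print(docs[0].page_content[:200])  # preview the extracted text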
Local LLM (optional)
The application supports running a local LLM, using Ollama.
To install Ollama, please run the following commands:
curl -fsSL https://ollama.com/install.sh | sh # install Ollama
ollama pull llama3.1:8b # fetch and download the specified model
Include the model in the .env file:
LOCAL_CHAT_MODEL="llama3.1:8b"
LOCAL_EMB_MODEL="llama3.1:8b"
Then run the generic RAG demo with the -b local flag.
For more information on installing Ollama, please refer to the LangChain Local LLM documentation, specifically the Quickstart section.
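As a rough sketch, the local backend presumably wires these variables into LangChain's Ollama integrations along the following lines (the actual wiring lives in backend/models.py; this assumes the langchain-ollama package is installed):
import os
from langchain_ollama import ChatOllama, OllamaEmbeddings

# Model names come from the .env entries shown above
chat_model = ChatOllama(model=os.environ["LOCAL_CHAT_MODEL"])
embeddings = OllamaEmbeddings(model=os.environ["LOCAL_EMB_MODEL"])

print(chat_model.invoke("Hello!").content)  # quick smoke test against Ollama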
Running generic RAG demo
Please note that, due to the use of argparse, the generic RAG demo cannot be launched the way the Chainlit documentation recommends:
chainlit run generic_rag/app.py # will not work
Instead, the app can be launched and debugged in the usual way:
python generic_rag/app.py -p data # will work and parses all PDF files in ./data
python generic_rag/app.py --help # will work and prints the command-line options
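The likely reason python generic_rag/app.py works is that the app parses its own flags first and then starts Chainlit programmatically. A minimal sketch of that pattern, with hypothetical flag definitions (not necessarily the repo's actual code):
import argparse
from chainlit.cli import run_chainlit

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--path", help="directory of PDF files to parse")  # hypothetical names
    parser.add_argument("-b", "--backend", default="azure")
    parser.parse_args()
    run_chainlit(__file__)  # boot the Chainlit server on this module
Because the argparse call is guarded by __name__ == "__main__", it does not run again when Chainlit re-imports the module as the app target.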
Please configure your .env file with your cloud provider (backend) of choice and set the --backend flag accordingly.
.env file
A .env file needs to be populated to configure API endpoints or local backends using environment variables. Currently, all required environment variables are defined in code in backend/models.py, with the exception of the API key variables themselves. More information about configuring API endpoints for LangChain can be found at the following locations.
For local models we currently use Ollama.
An example .env file is as follows:
# only one backend (azure, google, local, etc.) is required. Please adjust the --backend flag accordingly
AZURE_OPENAI_API_KEY="<secret_key>"
AZURE_LLM_ENDPOINT="https://<project_hub>.openai.azure.com"
AZURE_LLM_DEPLOYMENT_NAME="gpt-4"
AZURE_LLM_API_VERSION="2025-01-01-preview"
AZURE_EMB_ENDPOINT="https://<project_hub>.openai.azure.com"
AZURE_EMB_DEPLOYMENT_NAME="text-embedding-3-large"
AZURE_EMB_API_VERSION="2023-05-15"
LOCAL_CHAT_MODEL="llama3.1:8b"
LOCAL_EMB_MODEL="llama3.1:8b"
# Google Vertex AI does not use API keys but a separate authentication method
GOOGLE_GENAI_CHAT_MODEL="gemini-2.0-flash"
GOOGLE_GENAI_EMB_MODEL="models/text-embedding-004"
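For illustration, a minimal sketch of how these variables are typically consumed, assuming the python-dotenv and langchain-openai packages (the repo's actual loading logic lives in backend/models.py):
import os
from dotenv import load_dotenv
from langchain_openai import AzureChatOpenAI

load_dotenv()  # read key=value pairs from .env into os.environ

llm = AzureChatOpenAI(
    azure_endpoint=os.environ["AZURE_LLM_ENDPOINT"],
    azure_deployment=os.environ["AZURE_LLM_DEPLOYMENT_NAME"],
    api_version=os.environ["AZURE_LLM_API_VERSION"],
    # AZURE_OPENAI_API_KEY is picked up from the environment automatically
)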
Chainlit starters
Chainlit suggestions (starters) can be set with the CHAINLIT_STARTERS environment variable.
The variable should be a JSON array of objects with label and message properties.
An example is as follows:
CHAINLIT_STARTERS=[{"label":"Label 1","message":"Message one."},{"label":"Label 2","message":"Message two."},{"label":"Label 3","message":"Message three."}]
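Under the hood, such a variable is presumably parsed into Chainlit starter objects, roughly like this (a sketch rather than the repo's actual code; cl.Starter and cl.set_starters are standard Chainlit APIs):
import json
import os
import chainlit as cl

@cl.set_starters
async def set_starters():
    # Turn the JSON array from the environment into Starter objects
    raw = json.loads(os.environ.get("CHAINLIT_STARTERS", "[]"))
    return [cl.Starter(label=s["label"], message=s["message"]) for s in raw]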
Dev details
Linting
Currently, Ruff is used as the Python linter. It is included in pyproject.toml as a dev dependency in case your IDE needs that. However, for VS Code a Ruff extension exists.