generic-RAG-demo
A Sogeti Nederland generic RAG demo
Getting started
Installation of system dependencies
Unstructured PDF loader (optional)
If you would like to run the application with the unstructured PDF loader, two system dependencies are required. Install them with:
sudo apt install poppler-utils tesseract-ocr
Then run the generic RAG demo with the --unstructured-pdf flag.
For more information, please refer to the LangChain docs.
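For context, this flag switches the app to LangChain's unstructured-based PDF loader. A minimal sketch of using that loader directly, assuming the langchain-community and unstructured packages are installed (the file path is hypothetical):
from langchain_community.document_loaders import UnstructuredPDFLoader

loader = UnstructuredPDFLoader("data/example.pdf")  # hypothetical PDF path
docs = loader.load()  # returns a list of LangChain Document objects
print(docs[0].page_content[:200])  # preview the extracted text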
Local LLM (optional)
The application supports running a local LLM, using Ollama.
To install Ollama, please run the following commands:
curl -fsSL https://ollama.com/install.sh | sh # install Ollama
ollama pull llama3.1:8b # fetch and download the specified model
Include the model in the .env file:
LOCAL_CHAT_MODEL="llama3.1:8b"
LOCAL_EMB_MODEL="llama3.1:8b"
Then run the generic RAG demo with the -b local flag.
For more information on installing Ollama, please refer to the LangChain Local LLM documentation, specifically the Quickstart section.
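As a rough sketch, the local backend presumably wires these variables into LangChain's Ollama integrations along the following lines (the actual wiring lives in backend/models.py; this assumes the langchain-ollama package is installed):
import os
from langchain_ollama import ChatOllama, OllamaEmbeddings

# Model names come from the .env entries shown above
chat_model = ChatOllama(model=os.environ["LOCAL_CHAT_MODEL"])
embeddings = OllamaEmbeddings(model=os.environ["LOCAL_EMB_MODEL"])

print(chat_model.invoke("Hello!").content)  # quick smoke test against Ollama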
Running generic RAG demo
Please note that, due to the use of argparse, the generic RAG demo cannot be launched the way the Chainlit documentation recommends:
chainlit run generic_rag/app.py # will not work
Instead, the app can be launched and debugged in the usual way:
python generic_rag/app.py -p data # will work and parses all PDF files in ./data
python generic_rag/app.py --help # will work and prints the command-line options
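The likely reason python generic_rag/app.py works is that the app parses its own flags first and then starts Chainlit programmatically. A minimal sketch of that pattern, with hypothetical flag definitions (not necessarily the repo's actual code):
import argparse
from chainlit.cli import run_chainlit

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--path", help="directory of PDF files to parse")  # hypothetical names
    parser.add_argument("-b", "--backend", default="azure")
    parser.parse_args()
    run_chainlit(__file__)  # boot the Chainlit server on this module
Because the argparse call is guarded by __name__ == "__main__", it does not run again when Chainlit re-imports the module as the app target.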
Please configure your .env file with your cloud provider (backend) of choice and set the --backend flag accordingly.
.env file
A .env file needs to be populated to configure API endpoints or local backends using environment variables. Currently, all required environment variables are defined in code in backend/models.py, with the exception of the API key variables themselves. More information about configuring API endpoints for LangChain can be found at the following locations.
For local models we currently use Ollama.
An example .env file is as follows:
# only one backend (azure, google, local, etc.) is required. Please adjust the --backend flag accordingly
AZURE_OPENAI_API_KEY="<secret_key>"
AZURE_LLM_ENDPOINT="https://<project_hub>.openai.azure.com"
AZURE_LLM_DEPLOYMENT_NAME="gpt-4"
AZURE_LLM_API_VERSION="2025-01-01-preview"
AZURE_EMB_ENDPOINT="https://<project_hub>.openai.azure.com"
AZURE_EMB_DEPLOYMENT_NAME="text-embedding-3-large"
AZURE_EMB_API_VERSION="2023-05-15"
LOCAL_CHAT_MODEL="llama3.1:8b"
LOCAL_EMB_MODEL="llama3.1:8b"
# Google Vertex AI does not use API keys but a separate authentication method
GOOGLE_GENAI_CHAT_MODEL="gemini-2.0-flash"
GOOGLE_GENAI_EMB_MODEL="models/text-embedding-004"
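For illustration, a minimal sketch of how these variables are typically consumed, assuming the python-dotenv and langchain-openai packages (the repo's actual loading logic lives in backend/models.py):
import os
from dotenv import load_dotenv
from langchain_openai import AzureChatOpenAI

load_dotenv()  # read key=value pairs from .env into os.environ

llm = AzureChatOpenAI(
    azure_endpoint=os.environ["AZURE_LLM_ENDPOINT"],
    azure_deployment=os.environ["AZURE_LLM_DEPLOYMENT_NAME"],
    api_version=os.environ["AZURE_LLM_API_VERSION"],
    # AZURE_OPENAI_API_KEY is picked up from the environment automatically
)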
Chainlit starters
Chainlit suggestions (starters) can be set with the CHAINLIT_STARTERS environment variable.
The variable should be a JSON array of objects with label and message properties.
An example is as follows:
CHAINLIT_STARTERS=[{"label":"Label 1","message":"Message one."},{"label":"Label 2","message":"Message two."},{"label":"Label 3","message":"Message three."}]
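Under the hood, such a variable is presumably parsed into Chainlit starter objects, roughly like this (a sketch rather than the repo's actual code; cl.Starter and cl.set_starters are standard Chainlit APIs):
import json
import os
import chainlit as cl

@cl.set_starters
async def set_starters():
    # Turn the JSON array from the environment into Starter objects
    raw = json.loads(os.environ.get("CHAINLIT_STARTERS", "[]"))
    return [cl.Starter(label=s["label"], message=s["message"]) for s in raw]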
Dev details
Linting
Currently, Ruff is used as the Python linter. It is included in pyproject.toml as a dev dependency in case your IDE needs that. However, for VS Code a Ruff extension exists.