# generic-RAG-demo

A generic Retrieval Augmented Generation (RAG) demo from Sogeti Netherlands, built in Python. This project demonstrates how to integrate and run different backends, from cloud providers to local models, to parse and process your PDFs, web data, or other text sources.

## Table of Contents

- [generic-RAG-demo](#generic-rag-demo)
  - [Table of Contents](#table-of-contents)
  - [Features](#features)
  - [Getting started](#getting-started)
    - [Project Environment Setup](#project-environment-setup)
    - [Installation of system dependencies](#installation-of-system-dependencies)
      - [Unstructured PDF loader (optional)](#unstructured-pdf-loader-optional)
      - [Local LLM (optional)](#local-llm-optional)
    - [Running generic RAG demo](#running-generic-rag-demo)
    - [config.yaml file](#configyaml-file)
    - [Chainlit starters](#chainlit-starters)
  - [Dev details](#dev-details)
    - [Linting](#linting)
## Features

- **Multi-backend Support:** Easily switch between cloud-based and local LLMs.
- **Flexible Data Input:** Supports both PDF and web data ingestion.
- **Configurable Workflows:** Customize settings via a central `config.yaml` file.
## Getting started

### Project Environment Setup

This project leverages a modern packaging method defined in `pyproject.toml`. After cloning the repository, you can install the project along with its dependencies. You have two options:

1. Using uv

   If you're using uv, simply run:

   ```bash
   uv sync  # creates a virtual environment (if needed) and installs the project dependencies
   ```

2. Using a Python Virtual Environment

   Alternatively, set up a virtual environment and install the project:

   ```bash
   python -m venv .venv       # Create a new virtual environment named ".venv"
   source .venv/bin/activate  # Activate the virtual environment (use ".venv\Scripts\activate" on Windows)
   pip install .              # Install the project and its dependencies
   ```
### Installation of system dependencies

Some optional features require additional system applications to be installed.

#### Unstructured PDF loader (optional)

If you would like to run the application using the unstructured PDF loader (`pdf.unstructured` setting), you need to install two system dependencies:

- [poppler-utils](https://launchpad.net/ubuntu/jammy/amd64/poppler-utils)
- [tesseract-ocr](https://github.com/tesseract-ocr/tesseract?tab=readme-ov-file#installing-tesseract)

```bash
sudo apt install poppler-utils tesseract-ocr
```
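As a rough illustration of what the `pdf.unstructured` setting switches on, the sketch below loads a single PDF with LangChain's community `UnstructuredPDFLoader`, which relies on the system dependencies above. This is a minimal sketch of the loader only, not the demo's actual ingestion code; the file path is a placeholder.

```python
# Minimal sketch: parse one PDF with the unstructured loader (not the demo's own ingestion code).
# Assumes `langchain-community` and `unstructured[pdf]` are installed alongside the system packages above.
from langchain_community.document_loaders import UnstructuredPDFLoader

loader = UnstructuredPDFLoader("data/example.pdf", mode="elements")  # hypothetical path
docs = loader.load()

print(f"Loaded {len(docs)} elements")
print(docs[0].page_content[:200])  # peek at the first parsed element
```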
> For more information, please refer to the [langchain docs](https://python.langchain.com/docs/integrations/providers/unstructured/).
#### Local LLM (optional)

If you would like to run the application using a local LLM backend (`local` settings), you need to install Ollama.

```bash
curl -fsSL https://ollama.com/install.sh | sh  # install Ollama
ollama pull llama3.1:8b                        # fetch and download a model
```

Include the downloaded model in the `config.yaml` file:

```yaml
local:
  chat_model: "llama3.1:8b"
  emb_model: "llama3.1:8b"
```
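For reference, the `local` settings above roughly correspond to LangChain's Ollama integrations. The snippet below is a hedged sketch assuming the `langchain-ollama` package and a running Ollama daemon; it is not the demo's own backend wiring.

```python
# Rough sketch of mapping the `local` config values onto LangChain's Ollama integrations.
# Assumes the `langchain-ollama` package and a locally running Ollama daemon;
# illustrative only, not the demo's actual backend code.
from langchain_ollama import ChatOllama, OllamaEmbeddings

chat_model = ChatOllama(model="llama3.1:8b")       # local.chat_model
emb_model = OllamaEmbeddings(model="llama3.1:8b")  # local.emb_model

reply = chat_model.invoke("Summarise what a RAG pipeline does in one sentence.")
print(reply.content)

vector = emb_model.embed_query("retrieval augmented generation")
print(len(vector))  # embedding dimensionality
```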
> For more information on installing Ollama, please refer to the Langchain Local LLM documentation, specifically the [Quickstart section](https://python.langchain.com/docs/how_to/local_llms/#quickstart).
### Running generic RAG demo

Please note that, due to the use of `argparse`, the generic RAG demo cannot be launched the way the `chainlit` documentation recommends.

```bash
chainlit run generic_rag/app.py  # will not work
```

Instead, the app can be launched and debugged the usual way.

```bash
python generic_rag/app.py -p data  # will work and parses all PDF files in ./data
python generic_rag/app.py --help   # will work and prints command line options
```
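For readers wondering how such a command line is parsed, here is a minimal, hypothetical `argparse` sketch with a `-p`/`--path` option mirroring the invocation above; the authoritative list of options for `generic_rag/app.py` is whatever `--help` prints.

```python
# Hypothetical sketch of an argparse CLI with a -p/--path option, mirroring the
# invocation shown above; not the actual argument parser used by generic_rag/app.py.
import argparse
from pathlib import Path

parser = argparse.ArgumentParser(description="Generic RAG demo (illustrative CLI sketch)")
parser.add_argument("-p", "--path", type=Path, help="directory containing PDF files to parse")
args = parser.parse_args()

if args.path is not None:
    pdf_files = sorted(args.path.glob("*.pdf"))
    print(f"Found {len(pdf_files)} PDF file(s) in {args.path}")
```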
Please configure your `config.yaml` file with your cloud provider (backend) of choice. See the `config.example.yaml` file as a starting point; it holds all possible options.
### config.yaml file

A `config.yaml` file is required to specify your API endpoints, local backends, and environment variables. Use the provided `config.example.yaml` as a starting point and update it according to your backend settings and project requirements.

Key configuration points include (see the sketch after this list):

- **Chat Backend:** Choose among `azure`, `openai`, `google_vertex`, `aws`, or `local`.
- **Embedding Backend:** Configure the embedding models similarly.
- **Data Processing Settings:** Define PDF and web data sources, chunk sizes, and overlap.
- **Vector Database:** Customize the path and reset behavior.
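> The sketch below shows one way these settings could be read in Python. Only `local.chat_model` and `local.emb_model` appear earlier in this README; the other key names are hypothetical placeholders, so check `config.example.yaml` for the authoritative layout.

```python
# Minimal sketch of loading config.yaml with PyYAML and picking a chat backend.
# Only local.chat_model / local.emb_model are documented above; the other key
# names (e.g. "chat_backend") are hypothetical placeholders.
import yaml

with open("config.yaml", "r", encoding="utf-8") as fh:
    config = yaml.safe_load(fh)

chat_backend = config.get("chat_backend", "local")  # hypothetical key name
if chat_backend == "local":
    print("chat model:", config["local"]["chat_model"])
    print("embedding model:", config["local"]["emb_model"])
else:
    print("using cloud backend:", chat_backend)
```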
For more information on configuring Langchain endpoints and models, please see:

- [langchain cloud chat model doc](https://python.langchain.com/docs/integrations/chat/)
- [langchain local chat model doc](https://python.langchain.com/docs/how_to/local_llms/)
- [langchain cloud/local emb model doc](https://python.langchain.com/docs/integrations/text_embedding/)

> For local models we currently use Ollama.
### Chainlit starters

Chainlit suggestions (starters) can be set with the `CHAINLIT_STARTERS` environment variable.
The variable should be a JSON array of objects with `label` and `message` properties.
An example is as follows.

```text
CHAINLIT_STARTERS=[{"label":"Label 1","message":"Message one."},{"label":"Label 2","message":"Message two."},{"label":"Label 3","message":"Message three."}]
```
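As a rough sketch of how such a variable can be turned into starter objects (assuming a recent Chainlit release that exposes `cl.Starter` and `@cl.set_starters`; the demo's own handling may differ):

```python
# Illustrative sketch: parse CHAINLIT_STARTERS into Chainlit starter objects.
# Assumes a Chainlit version that exposes cl.Starter and @cl.set_starters;
# not necessarily how generic_rag/app.py consumes the variable.
import json
import os

import chainlit as cl


@cl.set_starters
async def set_starters():
    raw = os.environ.get("CHAINLIT_STARTERS", "[]")
    entries = json.loads(raw)  # JSON array of {"label": ..., "message": ...} objects
    return [cl.Starter(label=e["label"], message=e["message"]) for e in entries]
```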
## Dev details

### Linting

Currently, [Ruff](https://github.com/astral-sh/ruff) is used as the Python linter. It is included in [pyproject.toml](pyproject.toml) as a `dev` dependency in case your IDE needs it. For VS Code, a [Ruff extension](https://marketplace.visualstudio.com/items?itemName=charliermarsh.ruff) is also available.