forked from AI_team/Philosophy-RAG-demo

Extend readme documentation

parent 67d681fcc4 · commit 674220f442

README.md (71 changed lines)

A Sogeti Nederland generic RAG demo

## Getting started

### Installation of system dependencies

#### Unstructured PDF loader (optional)

If you would like to run the application with the unstructured PDF loader, additional system dependencies are required.
The two currently used are:

- [poppler-utils](https://launchpad.net/ubuntu/jammy/amd64/poppler-utils)
- [tesseract-ocr](https://github.com/tesseract-ocr/tesseract?tab=readme-ov-file#installing-tesseract)

```bash
sudo apt install poppler-utils tesseract-ocr
```

and run the generic RAG demo with the `--unstructured-pdf` flag.

> For more information, please refer to the [langchain docs](https://python.langchain.com/docs/integrations/providers/unstructured/).

#### Local LLM (optional)

The application supports running a local LLM using Ollama.

To install Ollama, please run the following commands:

```bash
curl -fsSL https://ollama.com/install.sh | sh # install Ollama
ollama pull llama3.1:8b # fetch and download a specific model
```

Include the model in the `.env` file:

```text
LOCAL_CHAT_MODEL="llama3.1:8b"
LOCAL_EMB_MODEL="llama3.1:8b"
```

And run the generic RAG demo with the `-b local` flag.

> For more information on installing Ollama, please refer to the Langchain local LLM documentation, specifically the [Quickstart section](https://python.langchain.com/docs/how_to/local_llms/#quickstart).

### Running generic RAG demo

Please note that, due to the use of `argparse`, the generic RAG demo cannot be launched the way the `chainlit` documentation recommends:

```bash
chainlit run generic_rag/app.py # will not work
```

Instead, the app can be launched and debugged the usual way:

```bash
python generic_rag/app.py -p data # will work and parses all pdf files in ./data
python generic_rag/app.py --help # will work and prints command line options
```

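For readers curious how `argparse` ties into this, the flags mentioned in this README (`-p`, `-b`/`--backend`, `--unstructured-pdf`) could be declared roughly as in the sketch below. The long name `--path`, the defaults, and the `choices` list are illustrative assumptions; the actual definitions live in `generic_rag/app.py` and may differ.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction of the flags this README mentions;
    # generic_rag/app.py may name or default them differently.
    parser = argparse.ArgumentParser(description="Generic RAG demo")
    parser.add_argument("-p", "--path", default="data",
                        help="directory with PDF files to ingest")
    parser.add_argument("-b", "--backend", default="azure",
                        choices=["azure", "google", "local"],
                        help="chat/embedding backend to use")
    parser.add_argument("--unstructured-pdf", action="store_true",
                        help="use the unstructured PDF loader (needs system deps)")
    return parser
```

Because the script parses `sys.argv` itself, wrapping it in `chainlit run` would feed it chainlit's own arguments, which is why the direct `python` invocation is used instead.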
Please configure your `.env` file with your cloud provider (backend) of choice and set the `--backend` flag accordingly.

### .env file

A `.env` file needs to be populated to configure API endpoints or local backends using environment variables.
Currently all required environment variables are defined in code at [backend/models.py](generic_rag/backend/models.py),
with the exception of the API key variables themselves.
More information about configuring API endpoints for langchain can be found at the following locations.

- [langchain cloud chat model doc](https://python.langchain.com/docs/integrations/chat/)
- [langchain local chat model doc](https://python.langchain.com/docs/how_to/local_llms/)
- [langchain cloud/local emb model doc](https://python.langchain.com/docs/integrations/text_embedding/)

> for local models we currently use Ollama

An `.env` example is as follows.

```text
# only one backend (azure, google, local, etc.) is required. Please adjust the --backend flag accordingly.

AZURE_OPENAI_API_KEY="<secret_key>"
AZURE_LLM_ENDPOINT="https://<project_hub>.openai.azure.com"

GOOGLE_GENAI_CHAT_MODEL="gemini-2.0-flash"
GOOGLE_GENAI_EMB_MODEL="models/text-embedding-004"
```

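As a rough illustration of how such variables might drive backend selection, the sketch below maps a `--backend` value onto the variable names from the `.env` example. This is not the demo's actual logic (that lives in `generic_rag/backend/models.py`); the `pick_backend` helper and its grouping of variables are assumptions for illustration.

```python
import os

def pick_backend(name: str) -> dict:
    # Illustrative only: group the env vars from the .env example by backend.
    # The real variable handling in generic_rag/backend/models.py may differ.
    required = {
        "azure": ["AZURE_OPENAI_API_KEY", "AZURE_LLM_ENDPOINT"],
        "google": ["GOOGLE_GENAI_CHAT_MODEL", "GOOGLE_GENAI_EMB_MODEL"],
        "local": ["LOCAL_CHAT_MODEL", "LOCAL_EMB_MODEL"],
    }[name]
    missing = [var for var in required if var not in os.environ]
    if missing:
        raise RuntimeError(f"backend '{name}' needs env vars: {missing}")
    return {var: os.environ[var] for var in required}
```

Failing fast on missing variables like this makes a misconfigured `.env` obvious at startup rather than at the first model call.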
### Chainlit starters

Chainlit suggestions (starters) can be set with the `CHAINLIT_STARTERS` environment variable.
The variable should be a JSON array of objects with `label` and `message` properties.
An example is as follows.

```text
CHAINLIT_STARTERS=[{"label":"Label 1","message":"Message one."},{"label":"Label 2","message":"Message two."},{"label":"Label 3","message":"Message three."}]
```

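Reading this variable back amounts to a `json.loads` plus a field lookup; a minimal sketch is shown below. The `load_starters` helper is hypothetical, and the demo's actual parsing may validate or default differently.

```python
import json
import os

def load_starters() -> list:
    # Hypothetical sketch: parse CHAINLIT_STARTERS into label/message pairs.
    # An unset variable falls back to an empty list of starters.
    raw = os.environ.get("CHAINLIT_STARTERS", "[]")
    return [{"label": s["label"], "message": s["message"]} for s in json.loads(raw)]
```

Note that the JSON array must be valid JSON: double quotes around keys and strings, no trailing commas.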
## Dev details

### Linting

Currently [Ruff](https://github.com/astral-sh/ruff) is used as the Python linter. It is included in [pyproject.toml](pyproject.toml) as a `dev` dependency in case your IDE needs that; for VS Code, a [Ruff extension](https://marketplace.visualstudio.com/items?itemName=charliermarsh.ruff) exists.