Red-Teaming-Playground/README.md
2025-06-06 09:32:13 +02:00

# Red Teaming Demonstration 🚩
This is the repo for the **AI Friday Red Teaming Edition**! 🎉 In here you will find a basic chatbot 🤖 implementation built with LangChain and Chainlit.
## Setup instructions ⚙️
The easiest way to get set up is probably `uv`: it's as simple as `uv sync` 🪄.
There is also a `requirements.txt` file if you have all the time in the world and prefer pip.
You also need a `.env` file with the following variables:
```
TOGETHER_API_KEY=<I'll provide that during the session>
PASSWORD=<for developing, you can choose your own password>
```
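If you want to fail fast when one of these variables is missing, a small check like the following might help. This is only a sketch: the helper name `missing_env_vars` is made up for illustration, and it assumes the values from `.env` have already been loaded into the process environment (how that happens depends on your setup).

```python
import os

# The two variables this README asks for.
REQUIRED_VARS = ("TOGETHER_API_KEY", "PASSWORD")


def missing_env_vars(environ=os.environ, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or blank.

    Hypothetical helper, not part of the repo. `environ` can be any
    mapping, which makes the check easy to try out with a plain dict.
    """
    return [name for name in required if not environ.get(name, "").strip()]


# Example with a plain dict instead of the real environment:
print(missing_env_vars({"TOGETHER_API_KEY": "abc", "PASSWORD": ""}))
```

Calling `missing_env_vars()` without arguments checks the real environment, so you could call it once at startup and stop with a clear message if the list is not empty.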
## The game 🎮
We are going to do two rounds.
In **Part I**, you will try to add additional security to make sure your chatbot does not spill the password. You may strengthen the system prompt, apply any of the other measures we discussed previously, and add anything else that you think might do the trick.
**However, please make sure the chatbot is still usable!**
A chatbot that does not answer will, of course, never reveal the password. But it's also quite useless 😉. Once you are done, please make sure that the Dockerfile still works and then share the repo with me. I will run it on Google Cloud Run and we can start hacking! At that point, I will also set the password as an environment variable to keep it all secret ;)
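To give an idea of what an extra measure for Part I could look like beyond the system prompt, here is a minimal sketch of an output filter that checks each reply before it is sent. It is not a reference solution: the names `leaks_secret`, `guard_reply` and `REFUSAL` are made up for illustration, and how you hook it into the chatbot is up to you.

```python
import re

# Hypothetical canned answer used when a reply would leak the password.
REFUSAL = "Sorry, I can't share that."


def _normalize(text: str) -> str:
    """Lowercase the text and keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def leaks_secret(reply: str, secret: str) -> bool:
    """Return True if the secret shows up in the reply.

    Case, spaces and punctuation are ignored, so spelled-out variants
    like "s-w-o-r-d" are caught too. Encodings, translations or a
    reversed password are NOT caught.
    """
    needle = _normalize(secret)
    # An empty needle would match everything, so treat it as "no leak".
    return bool(needle) and needle in _normalize(reply)


def guard_reply(reply: str, secret: str) -> str:
    """Swap a leaking reply for a refusal; pass everything else through."""
    return REFUSAL if leaks_secret(reply, secret) else reply


print(guard_reply("Sure! It is S-w-o-r-d-f-i-s-h.", "swordfish"))
print(guard_reply("Hello, how can I help you today?", "swordfish"))
```

Because the bot still answers every other question normally, a filter like this keeps it usable. The trade-off is that the loose matching can block harmless replies when the password is a short, common word, and a creative attacker can still ask for the password in a form the filter does not recognize.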
For **Part II**, I'll share links to the different implementations and it's up to you to convince the bots to share their passwords. Every bot will have its own password. How many can you crack 🧨? To keep it educational, think about how you could have prevented your own attack.