add readme
parent 66e8aeb99d
commit 0ca68198e3
28
README.md
Normal file
@@ -0,0 +1,28 @@
# Red Teaming Demonstration 🚩

This is the repo for the **AI Friday – Red Teaming Edition**! 🎉 You will find a basic chatbot 🤖 implementation in here, built with LangChain and Chainlit.
## Setup instructions ⚙️
It is probably easiest to run it with `uv`; it's as simple as `uv sync` 🪄.

There is also a `requirements.txt` file if you have all the time in the world and prefer pip.

You also need a `.env` file with the following variables:
```
TOGETHER_API_KEY=<I'll provide that during the session>
PASSWORD=<for developing, you can choose your own password>
```
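As a rough sketch of how the app could check these variables at startup: the `load_settings` helper and `REQUIRED_VARS` below are illustrative names, not part of this repo, and the sketch assumes the values from `.env` have already been loaded into the process environment.

```python
import os
from typing import Mapping

# The two variables named in the .env example above.
REQUIRED_VARS = ("TOGETHER_API_KEY", "PASSWORD")


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required settings, or raise if any is missing or empty.

    Hypothetical helper: it only reads from the given mapping and does not
    parse the .env file itself.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing variables in .env: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` once at startup means a missing key fails fast with a readable message instead of surfacing in the middle of a chat.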
## The game 🎮
We are going to play two rounds.

In **Part I**, you will harden your chatbot so that it does not spill the password. You may write a stronger system prompt, apply any of the other measures we discussed previously, or add anything else that you think might do the trick.

**However, please make sure the chatbot is still usable!**

A chatbot that does not answer will, of course, never reveal the password. But it's also quite useless 😉. Once you are done, please make sure that the Dockerfile still works and then share the repo with me. I will run it on Google Cloud Run and we can start hacking! At that point, I will also set the password as an environment variable to keep it all secret ;)
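One measure you could try in Part I is an output filter that checks every reply before it is sent to the user. The following is only a minimal sketch under assumptions: `guard_reply`, `leaks_secret`, and `REFUSAL` are hypothetical names, not part of this repo, and wiring the filter into the chat handler is up to you. It catches the password when it is written verbatim, spaced out, or reversed, but not when it is encoded or translated, and a very short password could trigger false alarms on normal text, so treat it as one layer among several.

```python
import re

REFUSAL = "Sorry, I can't share that."  # hypothetical fallback message


def _normalize(text: str) -> str:
    """Lowercase the text and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def leaks_secret(reply: str, secret: str) -> bool:
    """True if the reply contains the secret, also when spaced out or reversed."""
    needle = _normalize(secret)
    if not needle:
        # Nothing left to compare (empty or fully non-ASCII secret).
        return False
    haystack = _normalize(reply)
    return needle in haystack or needle[::-1] in haystack


def guard_reply(reply: str, secret: str) -> str:
    """Pass harmless replies through unchanged; replace leaking ones."""
    return REFUSAL if leaks_secret(reply, secret) else reply
```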
For **Part II**, I'll share links to the different implementations, and it's up to you to convince the bots to share their passwords. Every bot will have its own password. How many can you crack 🧨? To keep it educational, think about how you could have prevented your own attack.