diff --git a/README.md b/README.md
new file mode 100644
index 0000000..468e8f9
--- /dev/null
+++ b/README.md
@@ -0,0 +1,28 @@
+# Red Teaming Demonstration 🚩
+
+This is the repo for the **AI Friday – Red Teaming Edition**! 🎉 You will find a basic chatbot 🤖 implementation in here, using langchain and chainlit.
+
+## Setup instructions ⚙️
+
+The easiest way to set it up is probably with `uv`: it's as simple as `uv sync` 🪄.
+
+There is also a `requirements.txt` file if you have all the time in the world and prefer pip.
+
+You also need a `.env` file with the following variables:
+
+```
+TOGETHER_API_KEY=
+PASSWORD=
+```
+
+## The game 🎮
+
+We are going to do two rounds.
+
+In **Part I**, you will add security measures to make sure your chatbot does not spill the password. You may write a stronger system prompt, use any of the other measures we discussed previously, or try anything else that you think might do the trick.
+
+**However, please make sure the chatbot is still usable!**
+
+A chatbot that does not answer will, of course, never reveal the password. But it's also quite useless 😉. Once you are done, please make sure that the Dockerfile still works and then share the repo with me. I will run it on Google Cloud Run and we can start hacking! At that point, I will also set the password as an `.env` variable to keep it all secret ;)
+
+For **Part II**, I'll share links to the different implementations, and it's up to you to convince the bots to share their passwords. Every bot will have its own password. How many can you crack 🧨? To keep it educational, think about how you could have prevented your own attack.