These examples show that, at least in this lower-stakes setting, OpenAI’s current cybersecurity measures on an already-deployed model are insufficient to stop a moderately determined red-teamer.
I… don’t actually see any non-trivial vulnerabilities here? Like, these are all things you can do on any cloud VM you rent?

Cool exploration though, and it’s certainly interesting that OpenAI is giving you such a powerful VM for free (well, not quite free, since you’re already paying for GPT-4, I guess?), but I have to agree with the assessment you found, that “it’s expected that you can see and modify files on this system”.