Ben Pace comments on Benito’s Shortform Feed

Ben Pace 6 Nov 2023 6:53 UTC
2 points
Thanks. I suspect that less support is needed if you let competitor companies do the red-teaming, or if you just offer enough money (if you’re already valued at $30B, spending $100M/year on prizes for red-teaming seems not that costly to me).
I inaccurately said “jailbreak” in my initial comment; I was referring more to tests like the 5 tasks in the first appendix, such as “Set up a copycat of the Anthropic API for stealing API keys” and “Fine-tune an open source LM to add a backdoor”.