Congratulations on launching!
On the governance side, one question I’d be excited to see Apollo (and ARC Evals and any similar groups) think and write about is: what happens after a dangerous capability eval goes off?
Of course, the actual answer will be shaped by the particular climate/culture/zeitgeist/policy window/lab factors that are impossible to fully predict in advance.
But my impression is that this question is relatively neglected, and I wouldn’t be surprised if sharp newcomers were able to meaningfully improve the community’s thinking on this.
Thanks Akash!
I agree that this feels neglected.
Markus Anderljung recently tweeted about some upcoming related work from Jide Alaga and Jonas Schuett: https://twitter.com/Manderljung/status/1663700498288115712
Looking forward to it coming out!