Zach Stein-Perlman comments on On DeepMind’s Frontier Safety Framework

Zach Stein-Perlman 19 Jun 2024 0:40 UTC
The obviously missing category is Persuasion. In the DeepMind paper on evaluating dangerous capabilities persuasion was included, and it was evaluated for Gemini 1.5. So it is strange to see it missing here. I presume this will be fixed.
I believe persuasion shouldn’t be a priority on current margins, and I’d guess DeepMind’s frontier safety team thinks similarly. It seems to me that R&D, autonomy, cyber, and maybe CBRN capabilities are much more likely to enable extreme risks, especially over the next few years, which is what current evals should focus on.