rajathsalegame comments on Model evals for dangerous capabilities

rajathsalegame 24 Sep 2024 4:47 UTC
1 point
0
Out of curiosity, do you have any thoughts on the importance or feasibility of formal verification, or other mathematically “provable” safety approaches, in the evals you mention?
- Zach Stein-Perlman 24 Sep 2024 4:51 UTC
  8 points
  7
  No. But I’m skeptical: it seems hard to imagine provable safety at all, much less provable safety that is competitive with the default path to powerful AI, and harder still to see how post-hoc evals would be relevant.