Out of curiosity, do you have any thoughts on the importance / feasibility of formal verification / mathematically “provable” safety based approaches in these evals you mention?
No. But I’m skeptical: it seems hard to imagine provable safety at all, much less provable safety that is competitive with the default path to powerful AI, and it’s even less clear how post-hoc evals would be relevant to it.