Daniel_Eth comments on Twitter thread on AI safety evals

Daniel_Eth 1 Aug 2024 5:25 UTC
7 points
“evals for things like automated ML R&D are only worrying for people who already believe in AI xrisk”
I don’t think this is true. More specifically, I think a lot of people will start to worry about AI xrisk if things like automated ML R&D pick up. I don’t think most people who dismiss AI xrisk do so because they think intelligence is inherently good; they dismiss it because AI xrisk just seems too “scifi.” But if AI is automating ML R&D, then the idea of things getting out of hand won’t feel as scifi. In principle, people should be able to separate the question of “will AI soon be able to automate ML R&D?” from the question of “if AI could automate ML R&D, would it pose an xrisk?”, but I think most low-decouplers struggle to make this separation. For a sense of how a “normal” person will react to automated ML R&D, I think this reaction from a CBS host interviewing Hinton is informative.
(I agree with your general point that it’s better to focus on worrying capabilities, and I also agree with some of your other points, such as that demos might be more useful than evals.)