Thanks for sharing the link to ARC. It seems to me the kinds of things they are testing for and worried about are analogous to the risks of self-driving cars: when you incorporate ML systems into a range of human activities, their behaviour is unpredictable and can be dangerous. I am glad ARC is doing the work they are doing. People are using unpredictable tools and ARC is investigating the risks. That’s great.
I don’t think the capabilities ARC is looking at are “similar” to runaway intelligence, as you suggest. They clearly do not require it. They are far more mundane (though dangerous nonetheless, as you rightly point out).
At one point in the ARC post, they hint vaguely at being motivated by Yudkowsky-like worries: “As AI systems improve, it is becoming increasingly difficult to rule out that models might be able to autonomously gain resources and evade human oversight – so rigorous evaluation is essential.” They seem to be imagining a system giving itself goals, such that it is motivated to engage in tactical deception to carry out its goals—a behaviour we find in a range of problem-solving non-human animals. It strikes me as a worry that is extraneous to the good work ARC is doing. And the end of the quote is odd, since rigorous evaluation is clearly essential regardless of autonomous resource gains or oversight evasion.