I agree that AI absolutely needs to be regulated ASAP to mitigate the many potential harms that could arise from its use. So, even though the FLI letter is flimsy and vague, I appreciate its performance of concern.
Yudkowsky’s worry about runaway intelligence is, I think, an ungrounded distraction. It is ungrounded because Yudkowsky does not have a coherent theory of intentionality that makes sense of the idea of an algorithm gaining a capacity to engage in its own goal-directed activity. It is a distraction from the public discourse about the very real, immediate, tangible risks of harms caused by the AI systems we have today.
ARC Evals, the independent red-teaming organization that OpenAI partnered with to evaluate GPT-4, seems to disagree with this. While they don’t use the term “runaway intelligence”, they have flagged similar dangerous capabilities that they think may be within reach of the models that come after GPT-4:
We think that, for systems more capable than Claude and GPT-4, we are now at the point where we need to check carefully that new models do not have sufficient capabilities to replicate autonomously or cause catastrophic harm – it’s no longer obvious that they won’t be able to.
Thanks for sharing the link to ARC. It seems to me the kinds of things they are testing for and worried about are analogous to the risks of self-driving cars: when you incorporate ML systems into a range of human activities, their behaviour is unpredictable and can be dangerous. I am glad ARC is doing the work they are doing. People are using unpredictable tools and ARC is investigating the risks. That’s great.
I don’t think the capabilities ARC is looking at are “similar” to runaway intelligence, as you suggest. They clearly do not require it. They are far more mundane (but dangerous nonetheless, as you rightly point out).
At one point in the ARC post, they hint vaguely at being motivated by Yudkowsky-like worries: “As AI systems improve, it is becoming increasingly difficult to rule out that models might be able to autonomously gain resources and evade human oversight – so rigorous evaluation is essential.” They seem to be imagining a system giving itself goals, such that it is motivated to engage in tactical deception to carry out its goals—a behaviour we find in a range of problem-solving non-human animals. It strikes me as a worry that is extraneous to the good work ARC is doing. And the end of the quote is odd, since rigorous evaluation is clearly essential regardless of autonomous resource gains or oversight evasion.