This is an interesting historical perspective… But it’s not really what the fundamental case for AGI doom routes through. In particular: AGI doom is not about “AI systems”, as such.
AGI doom is, specifically, about artificial generally intelligent systems capable of autonomously optimizing the world the way humans can, and that are more powerful at this task than humans. The AGI-doom arguments do not necessarily have anything to do with the current SoTA ML models.
Case in point: A manually written FPS bot is technically “an AI system”. However, I think you’d agree that the AGI-doom arguments were never about this type of system, despite it falling under the broad umbrella of “an AI system”.
Similarly, if a given SoTA ML model architecture fails to meet the definition of “a generally intelligent system capable of autonomously optimizing the world the way humans can”, then the AGI doom is not about it. The details of its workings therefore tell us little, one way or another, about the AGI doom.
Why are the AGI-doom concerns extended to the current AI-capabilities research, then, if the SoTA models don’t fall under said concerns? Well, because building artificial generally intelligent systems is something the AGI labs are specifically and deliberately trying to do. Inasmuch as the SoTA models are not the generally intelligent systems within the remit of the AGI-doom arguments, and are instead some other type of system, the AGI labs view this as a failure that they’re doing their best to “fix”.
And this is where the fundamental AGI-doom arguments – all these coherence theorems, utility-maximization frameworks, et cetera – come in. At their core, they’re claims that any “artificial generally intelligent system capable of autonomously optimizing the world the way humans can” would necessarily be well-approximated as a game-theoretic agent. This, in turn, means that any system with the set of capabilities AI researchers ultimately want their models to have would inevitably come with a set of potentially omnicidal failure modes.
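To make the coherence-theorem style of reasoning concrete, here is a minimal toy sketch (my own illustration, not any particular theorem’s proof): an agent with cyclic preferences can be “money-pumped”, i.e. charged a fee for a sequence of trades that leaves it holding exactly what it started with. The preference table and function names below are invented for the example.

```python
# Toy money pump: an agent with cyclic (incoherent) preferences A > B > C > A
# can be charged a small fee for each "upgrade" trade and cycled forever.
# Coherence theorems run this logic in reverse: agents that can't be
# exploited this way behave as if maximizing a consistent utility function.

# Illustrative cyclic preference relation: (x, y) means "prefers x over y".
PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}

def accepts_trade(offered: str, held: str) -> bool:
    """The agent accepts a trade iff it strictly prefers the offered item."""
    return (offered, held) in PREFERS

def money_pump(start: str, fee: float, rounds: int) -> float:
    """Cycle the agent through its preference loop, collecting a fee per trade."""
    held, extracted = start, 0.0
    cycle = {"A": "C", "B": "A", "C": "B"}  # which item to offer for what's held
    for _ in range(rounds):
        offer = cycle[held]
        if accepts_trade(offer, held):
            held = offer
            extracted += fee  # the agent pays to "improve" its position
    return extracted

# After 300 trades the agent holds the same item it started with,
# yet has paid 300 fees: incoherent preferences are exploitable.
print(money_pump("A", fee=1.0, rounds=300))  # -> 300.0
```

The claim, then, is that any agent competent enough to optimize the world at a human-or-better level cannot be leaking resources like this, and so must act, in aggregate, like an expected-utility maximizer.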
In other words: the set of AI systems defined by “a generally intelligent world-optimization-capable agent” and the set of AI systems defined by “the subject of fundamental AGI-doom arguments” are one and the same set. You can’t have the former without the latter. And the AI industry wants the former; therefore, the arguments go, it will unleash the latter on the world.
While, yes, the current SoTA models are not subjects of the AGI-doom arguments, that doesn’t matter, because they are incidental research artefacts produced on the AI industry’s path to building an AGI. The AGI-doom arguments apply to the endpoint of that process, not to the messy byproducts.
So any evidence we uncover showing that the current models are not dangerous in the way the AGI-doom arguments predict AGIs to be dangerous is just evidence that they’re not AGI yet. It’s not evidence that AGI would not be dangerous. (Again: FPS bots’ non-dangerousness isn’t evidence that AGI would be non-dangerous.)
(I’ve written some more about this topic here. See also gwern’s Why Tool AIs Want to Be Agent AIs for more arguments regarding why AI research’s endpoint would be an AI agent, rather than something as harmless and compliant as the contemporary models.)
Counterarguments to the AGI-doom arguments that focus on pointing to the SoTA models, as such, miss the point. Actual counterarguments would instead find some way to argue that “generally intelligent world-optimizing agents” and “subjects of AGI-doom arguments” are not the exact same type of system; that you can, in theory, have the former without the latter. I have not seen any such argument, and the mathematical noose is slowly tightening (uh, by which I mean: the impossibility of such a separation may turn out to be formally provable).