Point 1 is overstated: the strong default is that unaligned AGI will be indifferent to human survival as an end. The leap to wanting to kill humans relies on a much stronger assumption than STEM-level AGI. It requires the AGI to be confident that it can replace all human activity with its own, to the extent that human activity supports or enables the AGI's functioning. Perhaps call this World-System-Designing-level AGI: an AGI that can design an alternate way for the world to function, one not based on human activity and human industrial production.
Point 3 is a weird amalgamation. It talks about two very different kinds of capability levels: (i) capability levels that allow the AGI to understand its situation, and (ii) capability levels that allow the AGI to kill all humans.
On the physical-world capabilities side, anyone who works in technology knows that making things that work in the real world requires a lot of iteration and experimentation. It's implausible that AGI will jump from purely digital influence to designing and controlling physical systems without that kind of iterative feedback.
Earth's evolution has had billions of years of experimentation with different self-replicating systems, each trying to convert the world's resources to its own ends. I also find it implausible that AGI will be wildly more effective at that game than evolution operating over billions of years.
On the evolutionary-optimization-over-billions-of-years point: consider that humans managed it in mere millennia, taking over many environments and niches despite lacking the relevant physical adaptations, and then, over centuries, also accessing large quantities and many types of resources that no organism had previously used. Add that, if nothing else, digital computers operate several OOMs faster than humans and with larger working memories.