I agree that points iii and iv are the relevant ones. Just to clarify: no, I don’t think it can kill most of humanity, and I think that people who believe they can come up with valid plans themselves (and by extension that an AGI could too) are overestimating what can be known, predicted, or planned in a highly complex system. I do think it could kill millions of humans, but that is not what is being claimed. I think that what is being said is alarmist, and that it will eventually have a cost.
Civilization is a highly complex and fragile system, without which most of humanity would die and the rest would be rendered defenseless. If you want to destroy it, you don’t have to predict or plan what will happen; you just have to hit it hard and fast, preferably from a couple of different directions.
There is an implicit norm here against providing detailed plans to destroy civilization, so I won’t, but it is not hard to come up with one (or four), and you have likely thought of some yourself. The key thing is that if you get to hit again (and the AGI will), you only need to achieve a portion of your objective with each try.
The problem is that you not only have to hit hard and first, you also have to prevent any possible retaliation, because hitting means that you run the risk of being hit yourself. Are you telling me that you can conceive of different ways to derail humanity, but you can’t imagine a machine concluding that the risk is too high to play that game?
I can certainly imagine a machine concluding that the risk is too high to want to play that game. And I can imagine other reasons a machine might decide not to end humanity. That is why I wind up at maybe instead of definitely (i.e. p(doom) < 99%).
But that ultimately becomes a question of the machine’s goals, motivation, understanding, agency, and risk tolerance. I think there is a wide distribution over these, and therefore an unknown but significant chance that the AGI decides not to destroy humanity.
That is very different from the question of whether the AGI could achieve the destruction of humanity. If the AGI couldn’t destroy humanity in practice, p(doom) would be close to 0.
In other words, I think the AGI can kill humanity but may choose not to. You seemed above to think the AGI can’t, but now seem to think it might be able to but may choose not to.
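To make the distinction explicit, here is a rough way to factor it (a sketch only, ignoring partial outcomes and doom from other causes):

$$
p(\mathrm{doom}) \approx P(\text{AGI could destroy humanity}) \times P(\text{AGI chooses to try} \mid \text{it could})
$$

My position is that the first factor is high and the real uncertainty lives in the second; your earlier comment reads as putting the first factor near zero.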