What prevents AIs from owning and disassembling the entire planet once humans are outcompeted and can’t offer anything worth the planet’s resources?
I was in the middle of writing a frustrated reply to Matthew’s comment when I realized he isn’t making very strong claims. I don’t think he’s claiming your scenario is impossible, just that not all power seeking is socially destructive, and this is true just because most power seeking is only partially effective. Presumably he agrees that in the limit of perfect power acquisition, most power seeking would indeed be socially destructive.
I claim that my scenario is not just possible, it’s the default outcome (conditional on “there are multiple misaligned AIs which for some reason don’t just foom”).
I agree with this claim (that in the limit of perfect power acquisition, most power seeking would indeed be socially destructive) in some limits, depending on the details. In particular, if the cost of trade is non-negligible, and the cost of taking over the world is negligible, then I expect an agent to attempt world takeover. However, this scenario doesn’t seem very realistic to me for most agents who are remotely near human-level intelligence, and potentially even for superintelligent agents.
The claim that takeover is instrumentally beneficial is more plausible for superintelligent agents, who might have the ability to take over the world from humans. But I expect that by the time superintelligent agents exist, they will be in competition with other agents (including humans, human-level AIs, slightly-sub-superintelligent AIs, and other superintelligent AIs, etc.). This raises the bar for what’s needed to perform a world takeover, since “the world” is not identical to “humanity”.
The important point here is just that a predatory world takeover isn’t necessarily preferred to trade, as long as the costs of trade are smaller than the costs of theft. You can just have a situation in which the most powerful agents in the world accumulate 99.999% of the wealth through trade. There’s really no theorem that says that you need to steal the last 0.001%, if the costs of stealing it would outweigh the benefits of obtaining it. Since both the costs of theft and the benefits of theft in this case are small, world takeover is not at all guaranteed to be rational (although it is possibly rational in some situations).
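To spell out the cost-benefit comparison being made here, a minimal sketch (the symbols W, s, and c are mine for illustration, not anything from the thread): write W for total world wealth, s for the share held by the weaker party, and c for the cost of seizing that share by force rather than trading.

```latex
% Minimal sketch of the takeover decision described above.
% Illustrative symbols (not from the original comments):
%   W = total world wealth, s = weaker party's share, c = cost of seizing it by force.
\[
  \text{takeover beats trade only if} \quad s\,W - c > 0 .
\]
% With s = 10^{-5} (the 0.001% case), the gain sW is tiny, so even a modest
% conflict or coordination cost c makes theft a net loss -- the "no theorem says
% you must steal the last 0.001%" point.
```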
Leaving an unaligned force (humans, here) in control of 0.001% of resources seems risky. There is a chance that you’ve underestimated how large a share of resources the unaligned force actually controls, and, probably more importantly, there is a chance that the unaligned force could use its tiny share in some super-effective way that captures a much higher fraction of resources in the future. Beyond the possibility that it is larger than thought or could be used as a springboard to gain more control, the unaligned force’s actual effect on the economy seems negligible, so one should still expect full extermination unless there’s some positive reason for the strong force to leave the weak force intact.
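One way to make this worry explicit, as a rough expected-value sketch (the symbols below are mine, purely illustrative): let p be the probability that the weak force later parlays its tiny share into a much larger fraction, L the loss to the strong force if that happens, and c_elim the cost of eliminating the weak force now.

```latex
% Rough expected-value form of the "leaving 0.001% is risky" argument.
% Illustrative symbols (mine, not the commenter's):
%   p = probability the weak force later captures a much larger fraction,
%   L = loss to the strong force if that happens,
%   c_elim = cost of eliminating the weak force now.
\[
  \text{eliminate if} \quad p \cdot L > c_{\mathrm{elim}} .
\]
% The argument above is that with a large power gap, c_elim is also tiny, so even
% a small p*L tips the decision toward extermination unless the strong force has
% some positive reason to preserve the weak one.
```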
Humans do have such reasons in some cases (we like seeing animals, at least in zoos, and being able to study them, etc.; same for the Amish; plus we also, at least sometimes, place real value on the independence and self-determination of such beings and cultures), but an argument would need to be made that AIs will have such positive reasons (and a further argument for why the AIs wouldn’t just “put whatever humans they wanted to preserve” in “zoos”, if one thinks that being in a zoo isn’t a great future). Otherwise, exterminating humans would be trivially easy with that large a power gap. Even if there are multiple ASIs that aren’t fully aligned with one another, offense is probably easier than defense; if one AI perceives weak benefits to keeping humans around, but another AI perceives weak benefits to exterminating us, I’d assume we get exterminated and then the second AI pays some trivial amount to the first for the inconvenience. Getting AI to strongly care about keeping humans around is, of course, one way to frame the alignment problem. I haven’t seen an argument that this will happen by default or that we have any idea how to do it; this seems more like an attempt to say it isn’t necessary.
Completely as an aside: coordination problems among ASIs don’t go away, so this is a highly non-trivial claim.
The share of income going to humans could simply tend towards zero if humans have no real wealth to offer in the economy. If humans own 0.001% of all wealth, for takeover to be rational, it needs to be the case that the benefit of taking that last 0.001% outweighs the costs. However, since both the costs and benefits are small, takeover is not necessarily rationally justified.
In the human world, we already see analogous situations in which groups could “take over” and yet choose not to because the (small) benefits of doing so do not outweigh the (similarly small) costs of doing so. Consider a small sub-unit of the economy, such as an individual person, a small town, or a small country. Given that these small sub-units are small, the rest of the world could—if they wanted to—coordinate to steal all the property from the sub-unit, i.e., they could “take over the world” from that person/town/country. This would be a takeover event because the rest of the world would go from owning <100% of the world prior to the theft, to owning 100% of the world, after the theft.
In the real world, various legal, social, and moral constraints generally prevent people from predating on small sub-units in the way I’ve described. But it’s not just morality: even if we assume agents are perfectly rational and self-interested, theft is not always worth it. Probably the biggest cost is simply coordinating to perform the theft. Even if the cost of coordination is small, to steal someone’s stuff, you might have to fight them. And if they don’t own lots of stuff, the cost of fighting them could easily outweigh the benefits you’d get from taking their stuff, even if you won the fight.
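As a toy numeric instance of this fight-cost point (all numbers here are invented purely for illustration): suppose the sub-unit’s holdings are worth v = 10^6 units, while coordinating and fighting to take them costs c = 10^7 units.

```latex
% Toy arithmetic for the fight-cost point; the numbers are invented for illustration.
%   v = value of the sub-unit's holdings, c = cost of coordinating and fighting.
\[
  v - c = 10^{6} - 10^{7} = -9 \times 10^{6} < 0 ,
\]
% so even a guaranteed victory leaves the attackers worse off than simply trading.
```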
You are conflating “what humans own” with “what you can get from a process whose side effect is killing humans”. Humans are not going to own any significant chunk of the Earth in the end; they are just going to live on its surface and die when that surface evaporates during disassembly into a Dyson swarm, and all of those 6*10^24 kg of silicon, hydrogen, oxygen, and carbon are quite valuable. What, exactly, prevents this scenario?
The environment in which digital minds thrive seems very different from the environment in which humans thrive. I don’t see a way to convert the mass of the Earth into computronium without killing all the humans, short of doing a lot more economic work than the humans are likely capable of producing.
All it takes is for humans to have enough wealth in absolute (not relative) terms to afford their own habitable shelter and environment, which doesn’t seem implausible?
Anyway, my main objection here is that I expect we’re far away (in economic time) from anything like the Earth being disassembled. As a result, this seems like a long-run consideration, from the perspective of how different the world will be by the time it starts becoming relevant. My guess is that this risk could become significant if humans haven’t already migrated onto computers by that time, have lost all their capital ownership, lack any social support networks willing to bear these costs (including from potential ems living on computers at that time), and if NIMBY political forces have become irrelevant. But in most scenarios that I think are realistic, there are simply a lot of ways for the costs of killing humans in order to disassemble the Earth to be far greater than the benefits.
I’d love to see a scenario by you btw! Your own equivalent of What 2026 Looks Like, or failing that the shorter scenarios here. You’ve clearly thought about this in a decent amount of detail.
Okay, we have wildly different models of the tech tree. In my understanding, to make mind uploads you need Awesome Nanotech, and if you have misaligned AIs and not-so-awesome nanotech, that’s sufficient to kill all humans and start disassembling the Earth. The only coherent scenario I can imagine in which misaligned AIs actually participate in the human economy in meaningful amounts is one where you can’t design nanotech without continent-sized supercomputers.