It’s true that taking over the world might arguably get you power over the entire future, but this doesn’t seem discontinuously different from smaller fractions, whereas I think people often reason as if it is. Taking over 1% of the world might get you something like 1% of the future in expectation.
I agree with this point, along with the general logic of the post. Indeed, I suspect you aren’t taking this logic far enough. In particular, I think it’s actually very normal for humans in our current world to “take over” small fractions of the world: it’s just called earning income, and owning property.
“Taking over 1% of the world” doesn’t necessarily involve doing anything violent or abnormal. You don’t need to do any public advocacy, take down 1% of the world’s institutions, or overthrow a country. It could just look like becoming very rich, via ordinary mechanisms of trade and wealth accumulation.
In our current world, higher-skill people can earn more income, thereby becoming richer and better able to achieve their goals. This plausibly scales to much higher levels of skill, of the type smart AIs might have. And as far as we can tell, there don’t appear to be any sharp discontinuities here, such that above a certain skill level it’s beneficial to take things by force rather than through negotiation and trade. It’s plausible that very smart power-seeking AIs would just become extremely rich, rather than trying to kill everyone.
Not all power-seeking behavior is socially destructive.
In the current era, the economics are such that war and violence tend to pay relatively badly: countries get rich by having well-developed infrastructure, and war tends to destroy that, so conquest gets you something that won’t be of much value. This is argued to be one of the reasons why we have less war today, compared to the past, when land was the scarce resource and military conquest made more sense.
However, if we were to shift to a situation where matter could be converted into computronium… then there are two ways that things could go. One possibility is that it would be an extension of current trends, as computronium is a type of infrastructure and going to war would risk destroying it.
But the other possibility is that if you are good enough at rebuilding something that has been destroyed, then this is going back to the old trend where land/raw matter was a valuable resource—taking over more territory allows you to convert it into computronium (or recycle and rebuild the ruins of the computronium you took over). Also, an important part of “infrastructure” is educated people who are willing and capable of running it—war isn’t bad just because it destroys physical facilities, it’s also bad because it kills some of the experts who could run those facilities for you. This cost is reduced if you can just take your best workers and copy as many of them as you want to. All of that could shift us back to a situation where the return on investment for violence and conquest becomes higher than for peaceful trade.
As Azar Gat notes in War in Human Civilization (2006), for most of human history, war ‘paid,’ at least for the elites who made decisions. In pre-industrial societies, returns to capital investment were very low. They could – and did – build roads and infrastructure, irrigation systems and the like, but the production multiplier for such investments was fairly low. For antiquity, the Roman Empire probably represents close to the best that could be achieved with such capital investments and one estimate, by Richard Saller, puts the total gains per capita at perhaps 25% over three centuries (a very rough estimate, but focus on the implied scale here; the real number could be 15% or 30%, but it absolutely isn’t 1000% or 100% or even probably 50%).
But returns to violent land acquisition were very, very high. In those same three centuries, the Romans probably increased the productive capacity of their empire by conquest 1,200% (note that’s a comma, not a dot!), going from an Italian empire of perhaps 5,000,000 to a Mediterranean empire in excess of 60,000,000 (and because productivity per capita was so relatively insensitive to infrastructure investments, we can on some level extrapolate production straight out of population here in a way that we couldn’t discussing the modern world). Consequently, the ‘returns to warfare’ – if you won – were much higher than returns to peace. The largest and most prosperous states tended to become the largest and most prosperous states through lots of warfare and they tended to stay that way through even more of it.
This naturally produced a lot of very powerful incentives towards militarism in societies. Indeed, Gat argues (and I agree) that the state itself appears to have emerged as a stage in this competitive-militarism contest where the societies which were best at militarizing themselves and coordinating those resources survived and aggregated new resources to themselves in conflict; everyone else could imitate or die (technically ‘or suffer state-extinction’ with most of the actual people being subjugated to the new states and later empires). [...]
And this makes a lot of sense if you think about the really basic energy economy of these societies: nearly all of the energy they are using comes from the land, in the form of crops grown to feed either humans or animals who then do work with that energy. Of course small amounts of wind and water power were used, but only small amounts.
As Gat notes, the industrial revolution changed this, breaking the agricultural energy economy. Suddenly it was possible, with steam power and machines, to use other kinds of energy (initially, burning coal) to do work (more than just heating things) – for the first time, societies could radically increase the amount of energy they could dispose of without expanding. Consequently – as we’ve seen – returns to infrastructure and other capital development suddenly became much higher. At the same time, these new industrial technologies made warfare much more destructive precisely because the societies doing the warfare now had at their disposal far larger amounts of energy. Industrial processes not only made explosives possible, they also enabled such explosives to be produced in tremendous quantities, creating massive, hyper-destructive armies. Those armies were so destructive, they tended to destroy the sort of now-very-valuable mechanical infrastructure of these new industrial economies; they made the land they acquired less valuable by acquiring it. So even as what we might term ‘returns to capital’ were going wildly up, the costs of war were also increasing, which meant that ‘returns to warfare’ were going down for the first time in history.
It’s not clear exactly where the two lines cross, but it seems abundantly clear that for the most developed economies, this happened sometime before 1914 because it is almost impossible to argue that anything that could have possibly been won in the First World War could have ever – even on the cynical terms of the competitive militarism of the pre-industrial world – been worth the expenditure in blood and treasure.
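As a rough sanity check on the growth figures in the quoted passage, the two returns can be put on a common annualized scale (using the quote’s own approximate numbers as inputs, not independent estimates):

```python
# Annualize the quoted Roman growth figures: ~25% per-capita gain from
# capital investment over three centuries, versus ~1,200% growth in
# productive capacity from conquest over the same period.
capital_gain = 0.25      # +25% over the period (the quote's rough figure)
conquest_gain = 12.0     # +1,200% over the period (the quote's rough figure)
years = 300

annual_capital = (1 + capital_gain) ** (1 / years) - 1
annual_conquest = (1 + conquest_gain) ** (1 / years) - 1

print(f"capital investment: ~{annual_capital:.3%} per year")
print(f"conquest:           ~{annual_conquest:.3%} per year")
```

Even with generous error bars on the inputs, the annualized return to conquest comes out roughly an order of magnitude above the return to capital, which is the quoted argument’s point.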
What prevents AIs from owning and disassembling the entire planet once humans, at some point, are outcompeted and can’t offer anything worth the resources of the entire planet?
I was in the middle of writing a frustrated reply to Matthew’s comment when I realized he isn’t making very strong claims. I don’t think he’s claiming your scenario is not possible. Just that not all power seeking is socially destructive, and this is true just because most power seeking is only partially effective. Presumably he agrees that in the limit of perfect power acquisition most power seeking would indeed be socially destructive.
I claim that my scenario is not just possible, it’s the default outcome (conditional on “there are multiple misaligned AIs which for some reason don’t just foom”).
Presumably he agrees that in the limit of perfect power acquisition most power seeking would indeed be socially destructive.
I agree with this claim in some limits, depending on the details. In particular, if the cost of trade is non-negligible, and the cost of taking over the world is negligible, then I expect an agent to attempt world takeover. However, this scenario doesn’t seem very realistic to me for most agents who are remotely near human-level intelligence, and potentially even for superintelligent agents.
The claim that takeover is instrumentally beneficial is more plausible for superintelligent agents, who might have the ability to take over the world from humans. But I expect that by the time superintelligent agents exist, they will be in competition with other agents (including humans, human-level AIs, slightly-sub-superintelligent AIs, and other superintelligent AIs, etc.). This raises the bar for what’s needed to perform a world takeover, since “the world” is not identical to “humanity”.
The important point here is just that a predatory world takeover isn’t necessarily preferred to trade, as long as the costs of trade are smaller than the costs of theft. You can just have a situation in which the most powerful agents in the world accumulate 99.999% of the wealth through trade. There’s really no theorem that says that you need to steal the last 0.001%, if the costs of stealing it would outweigh the benefits of obtaining it. Since both the costs of theft and the benefits of theft in this case are small, world takeover is not at all guaranteed to be rational (although it is possibly rational in some situations).
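To make this cost-benefit claim concrete, here is a deliberately toy expected-value comparison; every number in it is a made-up placeholder for illustration, not an estimate:

```python
# Toy model: an agent holding 99.999% of wealth decides whether to seize
# the remaining 0.001% by force or keep trading. All numbers are
# illustrative assumptions.
human_share = 0.00001     # the last 0.001% of total wealth (normalized to 1)

p_win = 0.999             # probability the takeover succeeds
conflict_cost = 0.00005   # resources burned on the conflict, win or lose
trade_surplus = 0.00002   # value of continued trade with the weaker party

ev_takeover = p_win * human_share - conflict_cost
ev_trade = trade_surplus

print(f"EV(takeover) = {ev_takeover:+.6f}")
print(f"EV(trade)    = {ev_trade:+.6f}")
```

With these (arbitrary) numbers the takeover has negative expected value even at a 99.9% win probability, because the prize is smaller than the cost of seizing it. The point is not the specific figures but that nothing about owning 99.999% of wealth forces the last step to be profitable.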
Leaving an unaligned force (humans, here) in control of 0.001% of resources seems risky. There is a chance that you’ve underestimated how large the share of resources controlled by the unaligned force is, and probably more importantly, there is a chance that the unaligned force could use its tiny share of resources in some super-effective way that captures a much higher fraction of resources in the future. The actual effect on the economy of the unaligned force, other than the possibility of its being larger than thought or being used as a springboard to gain more control, seems negligible, so one should still expect full extermination unless there’s some positive reason for the strong force to leave the weak force intact.
Humans do have such reasons in some cases (we like seeing animals, at least in zoos, and being able to study them, etc.; same thing for the Amish; plus we also at least sometimes place real value on the independence and self-determination of such beings and cultures), but there would need to be an argument that AIs will have such positive reasons (and a further argument for why the AIs wouldn’t just put whatever humans they wanted to preserve in “zoos”, if one thinks that being in a zoo isn’t a great future). Otherwise, exterminating humans would be trivially easy with that large a power gap. Even if there are multiple ASIs that aren’t fully aligned with one another, offense is probably easier than defense; if one AI perceives weak benefits to keeping humans around, but another AI perceives weak benefits to exterminating us, I’d assume we get exterminated and then the second AI pays some trivial amount to the first for the inconvenience. Getting AI to strongly care about keeping humans around is, of course, one way to frame the alignment problem. I haven’t seen an argument that this will happen by default or that we have any idea how to do it; this seems more like an attempt to say it isn’t necessary.
The share of income going to humans could simply tend towards zero if humans have no real wealth to offer in the economy. If humans own 0.001% of all wealth, for takeover to be rational, it needs to be the case that the benefit of taking that last 0.001% outweighs the costs. However, since both the costs and benefits are small, takeover is not necessarily rationally justified.
In the human world, we already see analogous situations in which groups could “take over” and yet choose not to because the (small) benefits of doing so do not outweigh the (similarly small) costs of doing so. Consider a small sub-unit of the economy, such as an individual person, a small town, or a small country. Given that these small sub-units are small, the rest of the world could—if they wanted to—coordinate to steal all the property from the sub-unit, i.e., they could “take over the world” from that person/town/country. This would be a takeover event because the rest of the world would go from owning <100% of the world prior to the theft, to owning 100% of the world, after the theft.
In the real world, various legal, social, and moral constraints generally prevent people from predating on small sub-units in the way I’ve described. But it’s not just morality: even if we assume agents are perfectly rational and self-interested, theft is not always worth it. Probably the biggest cost is simply coordinating to perform the theft. Even if the cost of coordination is small, to steal someone’s stuff, you might have to fight them. And if they don’t own lots of stuff, the cost of fighting them could easily outweigh the benefits you’d get from taking their stuff, even if you won the fight.
You are conflating “what humans own” with “what you can get by a process whose side effect is killing humans”. Humans are not going to own any significant chunk of Earth in the end; they are just going to live on its surface and die when that surface evaporates during disassembly into a Dyson swarm, and all of those 6×10^24 kg of silicon, hydrogen, oxygen, and carbon are quite valuable. What, exactly, prevents this scenario?
The environment in which digital minds thrive seems very different from the environment in which humans thrive. I don’t see a way to convert the mass of the Earth into computronium without killing all the humans, short of doing a lot more economic work than the humans are likely capable of producing.
All it takes is for humans to have enough wealth in absolute (not relative) terms to afford their own habitable shelter and environment, which doesn’t seem implausible?
Anyway, my main objection here is that I expect we’re far away (in economic time) from anything like the Earth being disassembled. As a result, this seems like a long-run consideration, from the perspective of how different the world will be by the time it starts becoming relevant. My guess is that this risk could become significant if humans haven’t already migrated onto computers by that time, have lost all their capital ownership, lack any social support networks willing to bear these costs (including from potential ems living on computers at that time), and NIMBY political forces have become irrelevant. But in most scenarios that I think are realistic, there are simply a lot of ways for the costs of killing humans to disassemble the Earth to be far greater than the benefits.
I’d love to see a scenario by you btw! Your own equivalent of What 2026 Looks Like, or failing that the shorter scenarios here. You’ve clearly thought about this in a decent amount of detail.
Okay, we have wildly different models of the tech tree. In my understanding, to make mind uploads you need Awesome Nanotech, and if you have misaligned AIs, even not-so-awesome nanotech is sufficient to kill all humans and start disassembling the Earth. The only coherent scenario I can imagine in which misaligned AIs actually participate in the human economy in meaningful amounts is one where you can’t design nanotech without continent-sized supercomputers.
And as far as we can tell, there don’t appear to be any sharp discontinuities here, such that above a certain skill level it’s beneficial to take things by force rather than through negotiation and trade. It’s plausible that very smart power-seeking AIs would just become extremely rich, rather than trying to kill everyone.
I think this would depend quite a bit on the agent’s utility function. Humans tend more toward satisficing than optimizing, especially as they grow older—someone who has established a nice business empire and feels like they’re getting all their wealth-related needs met likely doesn’t want to rock the boat and risk losing everything for what they perceive as limited gain.
As a result, even if discontinuities do exist (and it seems pretty clear to me that being able to permanently rid yourself of all your competitors should be a discontinuity), the kinds of humans who could potentially make use of them are unlikely to.
In contrast, an agent that was an optimizer and had an unbounded utility function might be ready to gamble all of its gains for just a 0.1% chance of success if the reward was big enough.
In contrast, an agent that was an optimizer and had an unbounded utility function might be ready to gamble all of its gains for just a 0.1% chance of success if the reward was big enough.
Risk-neutral agents also have a tendency to go bankrupt quickly, as they keep taking the equivalent of double-or-nothing gambles with 50% + epsilon probability of success until eventually landing on “nothing”. This makes such agents less important in the median world, since their chance of becoming extremely powerful is very small.
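This dynamic is easy to simulate. In the sketch below (parameter choices are arbitrary), each risk-neutral agent repeatedly stakes its entire wealth on a double-or-nothing bet with 50% + epsilon odds; every bet has positive expected value, yet essentially every trajectory ends at zero:

```python
import random

# Each agent starts with wealth 1.0 and repeatedly bets everything on
# double-or-nothing at probability 50% + epsilon. Expected value per bet
# is positive (2 * 0.51 = 1.02x), but one loss means permanent ruin.
random.seed(0)
p_win = 0.51        # 50% + epsilon
n_agents = 10_000
n_bets = 100

survivors = 0
for _ in range(n_agents):
    wealth = 1.0
    for _ in range(n_bets):
        if random.random() < p_win:
            wealth *= 2
        else:
            wealth = 0.0   # one loss wipes out the whole stake
            break
    if wealth > 0:
        survivors += 1

print(f"{survivors} of {n_agents} agents survive {n_bets} bets")
```

The per-agent survival probability is p_win**n_bets ≈ 0.51**100 ≈ 10⁻²⁹, so with overwhelming probability every simulated agent goes bankrupt, even though the theoretical mean wealth, (2 · 0.51)**100 ≈ 7.2, keeps growing; the mean is carried entirely by astronomically unlikely winning streaks.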
Completely as an aside: regarding the suggestion that one AI would simply compensate another for the inconvenience of keeping humans around, coordination problems among ASIs don’t go away, so this is a highly non-trivial claim.