Thanks for engaging. I did read your linked post. I think you’re actually in the majority in your opinion on AI leading to a continuation and expansion of business as usual. I’ve long been curious about this line of thinking; while it makes a good bit of sense to me for the near future, I become confused at the “indefinite” part of your prediction.
When you say that AI continues from the first step indefinitely, it seems to me that you must believe one or more of the following:
- No one would ever tell their arbitrarily powerful AI to take over the world
  - Even if it might succeed
- No arbitrarily powerful AI could succeed at taking over the world
  - Even if it were willing to do terrible damage in the process
- We’ll have a limited number of humans controlling arbitrarily powerful AI
  - And an indefinitely stable balance-of-power agreement among them
- By “indefinitely” you mean only until we create and proliferate really powerful AI
If I believed in any of those, I’d agree with you.
Or perhaps I’m missing some other belief we don’t share that leads to your conclusions.
Care to share?
Separately, in response to that post: the post you linked was titled “AI values will be shaped by a variety of forces, not just the values of AI developers.” In my prediction here, AI and AGI will not have values in any important sense; they will merely carry out the values of their principals (their creators, or the government that shows up to take control). This might just be a terminological distinction, except for the following bit of implied logic: I don’t think AI needs to share clients’ values to be of immense economic and practical advantage to them. When (if) someone creates a highly capable AI system, they will instruct it to serve customers’ needs in certain ways, including following their requests within certain limits; they will not need to change the A(G)I’s core values (if it has any) in order to make enormous profits licensing it to clients. To the extent this is correct, we should go on assuming that AI will share, or at least follow, its creators’ values (or, IMO more likely, take orders/values from the government that takes control, citing security concerns).
No arbitrarily powerful AI could succeed at taking over the world
This is closest to what I am saying. The current world appears to be in a state of inter-agent competition. Even as technology has gotten more advanced, and as agents have gotten more powerful over time, no single unified agent has been able to obtain control over everything and win the entire pie, defeating all the other agents. I think we should expect this state of affairs to continue even as AGI gets invented and technology continues to get more powerful.
(One plausible exception to the idea that “no single agent has ever won the competition over the world” is the human species itself, which dominates over other animal species. But I don’t think the human species is well-described as a unified agent, and I think our power comes mostly from accumulated technological abilities, rather than raw intelligence by itself. This distinction is important because the effects of technological innovation generally diffuse across society rather than giving highly concentrated powers to the people who invent stuff. This generally makes the situation with humans vs. animals disanalogous to a hypothetical AGI foom in several important ways.)
Separately, I also think that even if an AGI agent could violently take over the world, it would likely not be rational for it to try, because compromising with the rest of the world would be a less risky and more efficient way of achieving its goals. I’ve written about these ideas in a shortform thread here.
I read your linked shortform thread. I agreed with most of your arguments against some common AGI takeover arguments. I agree that AGIs won’t coordinate against us and won’t have “collective grudges” against us.
But I don’t think the arguments for continued stability are very thorough, either. I think we just don’t know how it will play out. And I think there’s reason to be concerned that takeover will be rational for AGIs in ways it isn’t for humans.
The central difference in logic is the capacity for self-improvement. In your post, you addressed self-improvement by linking a Christiano piece on slow takeoff. But he noted at the start that he wasn’t arguing against self-improvement, only that its pace would be more modest. Either way, the potential implications for the balance of power in the world remain.
Humans are all locked to a similar level of cognitive and physical capability. That has implications for game theory when all of the competitors are humans: cooperation often makes more sense. But the same isn’t necessarily true of AGIs, whose cognitive and physical capacities can potentially be expanded. So it’s (very loosely) like the difference between game theory in chess and game theory in a variant of chess where one of the moves is to add new capabilities to your pieces. We can’t learn much about the new game from the theory of the old, particularly if we don’t even know all of the capabilities a player might add to their pieces.
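To make that shift concrete, here is a toy expected-value sketch (my own illustration, with made-up payoffs and probabilities, not a model from either post): when capabilities are roughly symmetric, a takeover attempt has a low chance of success and cooperation dominates; if aggressive self-improvement could push the assumed success probability high enough, the same calculation flips.

```python
# Toy intuition pump with made-up numbers: how the expected value of a
# takeover attempt changes as the (assumed) probability of success rises
# with self-improvement. Not a claim about real payoffs or probabilities.

def expected_value(p_success: float, win: float, lose: float) -> float:
    """Expected payoff of attempting takeover, given a success probability."""
    return p_success * win + (1.0 - p_success) * lose

COOPERATE = 10.0   # steady payoff from trade/compromise
WIN = 100.0        # payoff if a takeover attempt succeeds
LOSE = -50.0       # payoff if it fails (retaliation, destruction)

# Roughly equal capabilities: takeover rarely succeeds, so cooperation wins.
symmetric = expected_value(p_success=0.05, win=WIN, lose=LOSE)    # -42.5

# After aggressive self-improvement with a first-mover advantage, the assumed
# success probability is much higher, and the calculus flips.
asymmetric = expected_value(p_success=0.80, win=WIN, lose=LOSE)   # 70.0

print(f"cooperate:             {COOPERATE:6.1f}")
print(f"takeover (symmetric):  {symmetric:6.1f}")   # below 10 -> cooperate
print(f"takeover (asymmetric): {asymmetric:6.1f}")  # above 10 -> attempt takeover
```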
More concretely: it may be quite rational for a human controlling an AGI to tell it to self-improve and develop new capacities, strategies, and technologies with which to take over the world. With a first-mover advantage, such a takeover might be entirely possible. The aggressor’s capacities might remain ahead of the rest of the world’s AIs/AGIs if those hadn’t started to aggressively self-improve and develop the capacities to win conflicts. This would be particularly true if the aggressor AGI were willing to cause global catastrophe (e.g., EMPs, bringing down power grids).
The assumption of a stable balance of power in the face of competitors that can improve their capacities in dramatic ways seems unlikely to be true by default, and at the least, worthy of close inspection. Yet I’m afraid it’s the default assumption for many.
Your shortform post is more on-topic for this part of the discussion, so I’m copying this comment there and will continue there if you want. It’s worth more posts; I hope to write one myself if time allows.
Edit: It looks like there’s already an extensive discussion there, including my points here, so I won’t bother copying this over. It looked like the point about self-improvement destabilizing the situation had been raised but not really addressed. So I continue to think it needs more thought before we accept a future that includes proliferation of AGI capable of recursive self-improvement (RSI).