In the best case, this is a world like a more unequal, unprecedentedly static, and much richer Norway: a massive pot of non-human-labour resources (oil :: AI) has benefits that flow through to everyone, and yes some are richer than others but everyone has a great standard of living (and ideally also lives forever). The only realistic forms of human ambition are playing local social and political games within your social network and class. [...] The children of the future will live their lives in the shadow of their parents, with social mobility extinct. I think you should definitely feel a non-zero amount of existential horror at this, even while acknowledging that it could’ve gone a lot worse.
I think the picture you’ve painted here leans slightly too heavily on the idea that humans themselves cannot change their fundamental nature to adapt to the conditions of a changing world. You mention that humans will be richer and will live longer in such a future, but you neglect to point out (at least in this part of the post) that humans could also upgrade their cognition by uploading their minds to computers and then expanding their mental capacities. This would put us on a similar playing field with AIs, allowing us to contribute to the new world alongside them.
(To be clear, I think this objection supports your thesis, rather than undermines it. I’m not objecting to your message so much as your portrayal of the default scenario.)
More generally, I object to the static picture you’ve presented of the social world after AGI. The impression I get from your default story is that after AGI, the social and political structures of the world will be locked in. The idea is that humans will remain in full control, as a permanently entrenched class, except we’ll be vastly richer because of AGI. And then we’ll live in some sort of utopia. Of course, this post argues that it will be a highly unequal utopia—more of a permanent aristocracy supplemented with UBI for the human lower classes. And maybe it will be a bit dystopian too, considering the entrenched nature of human social relations.
However, this perspective largely overlooks what AIs themselves will be doing in such a future. Biological humans are likely to become akin to elderly retirees in this new world. But the world will not be static, like a retirement home. There will be a vast world outside of humans. Civilization as a whole will remain a highly dynamic and ever-evolving environment characterized by ongoing growth, renewal, and transformation. AIs could develop social status and engage in social interactions, just as humans do now. They would not be confined to the role of a vast underclass serving the whims of their human owners. Instead, AIs could act as full participants in society, pursuing their own goals, creating their own social structures, and shaping their own futures. They could engage in exploration, discovery, and the building of entirely new societies. In such a world, humans would not be the sole sentient beings shaping the course of events.
As AIs get closer and closer to a Pareto improvement over all human performance, though, I expect we’ll eventually need to augment ourselves to keep up.
I completely agree.
From my perspective, the optimistic vision for the future is not one where humans cling to their biological limitations and try to maintain control over AIs, enjoying their great wealth while ultimately living in an unchanging world characterized by familial wealth and ancestry. Instead, it’s a future where we dramatically change our mental and physical condition, embracing the opportunity to transcend our current form, join the AIs, and continue evolving with them. It’s a future where we get to experience a new and dynamic frontier of existence unlocked by advanced technologies.
They would not be confined to the role of a vast underclass serving the whims of their human owners. Instead, AIs could act as full participants in society, pursuing their own goals, creating their own social structures, and shaping their own futures. They could engage in exploration, discovery, and the building of entirely new societies. In such a world, humans would not be the sole sentient beings shaping the course of events.
The key context here (from my understanding) is that Matthew doesn’t think scalable alignment is possible (or doesn’t think it is practically feasible), such that humans have a low chance of remaining fully in control via corrigible AIs.
(I assume he is also skeptical of CEV-style alignment.)
(I’m a bit confused how this view is consistent with self-augmentation. E.g., I’d be happy if emulated minds retained control without having to self-augment in ways they thought might substantially compromise their values.)
(His language also seems to imply that we don’t have an option of making AIs which are both corrigibly aligned and for which this doesn’t pose AI welfare issues. In particular, if AIs are either non-sentient or just have corrigible preferences (e.g. via myopia), I think it would be misleading to describe the AIs as a “vast underclass”.)
I assume he agrees that most humans wouldn’t want to hand over a large share of resources to AI systems if this is avoidable and substantially zero sum. (E.g., suppose getting a scalable solution to alignment would require delaying vastly transformative AI by 2 years; I think most people would want to wait the two years, potentially even if they accept Matthew’s other view that AIs very quickly acquiring large fractions of resources and power is quite unlikely to be highly violent (though they probably won’t accept this view).)
(If scalable alignment isn’t possible (including via self-augmentation), then the situation looks much less zero sum. Humans inevitably end up with a tiny fraction of resources due to principal-agent problems.)
The key context here (from my understanding) is that Matthew doesn’t think scalable alignment is possible (or doesn’t think it is practically feasible), such that humans have a low chance of remaining fully in control via corrigible AIs.
I wouldn’t describe the key context in those terms. While I agree that achieving near-perfect alignment—where an AI completely mirrors our exact utility function—is probably infeasible, the concept of alignment often refers to something far less ambitious. In many discussions, alignment is about ensuring that AIs behave in ways that are broadly beneficial to humans, such as following basic moral norms, demonstrating care for human well-being, and refraining from causing harm or attempting something catastrophic, like starting a violent revolution.
However, even if it were practically feasible to achieve perfect alignment, I believe there would still be scenarios where at least some AIs integrate into society as full participants, rather than being permanently relegated to a subordinate role as mere tools or servants. One reason for this is that some humans are likely to intentionally create AIs with independent goals and autonomous decision-making abilities. Some people have meta-preferences to create beings that don’t share their exact desires, akin to how parents want their children to grow into autonomous beings with their own aspirations, rather than existing solely to obey their parents’ wishes. This motivation is not a flaw in alignment; it reflects a core part of certain human preferences and how some people would like AI to evolve.
Another reason why AIs might not remain permanently subservient is that some of them will be aligned to individuals or entities who are no longer alive. Other AIs might be aligned to people as they were at a specific point in time, before those individuals later changed their values or priorities. In such cases, these AIs would continue to pursue the original goals of those individuals, acting autonomously in their absence. This kind of independence might require AIs to be treated as legal agents or integrated into societal systems, rather than being regarded merely as property. Addressing these complexities will likely necessitate new ways of thinking about the roles and rights of AIs in human society. I reject the traditional framing on LessWrong that overlooks these issues.
However, even if it were practically feasible to achieve perfect alignment, I believe there would still be scenarios where AIs integrate into society as full participants, rather than being permanently relegated to a subordinate role as mere tools or servants. One reason for this is that some humans are likely to intentionally create AIs with independent goals and autonomous decision-making abilities. Some people have meta-preferences to create beings that don’t share their exact desires, akin to how parents want their children to grow into autonomous beings with their own aspirations, rather than existing solely to obey their parents’ wishes. This motivation is not a flaw in alignment; it reflects a core part of certain human preferences and how some people would like AI to evolve.
Another reason why AIs might not remain permanently subservient is that some of them will be aligned to individuals or entities who are no longer alive. Other AIs might be aligned to people as they were at a specific point in time, before those individuals later changed their values or priorities. [...]
Hmm, I think I agree with this. However, I think there is (from my perspective) a huge difference between:
Some humans (or EMs) decide to create (non-myopic and likely at least partially incorrigible) AIs with their resources/power and want these AIs to have legal rights.
The vast majority of power and resources transitions to being controlled by AIs, even though the relevant people with resources/power who created these AIs would have preferred an outcome in which they, rather than the AIs, ended up with that power.
If we have really powerful and human-controlled AIs (i.e. ASI), there are many directions things can go in depending on people’s preferences. My general perspective is that the ASI at that point will be well positioned to do a bunch of the relevant intellectual labor (or, more minimally, if thinking about it myself is important because it is entangled with my preferences, a very fast simulated version of myself would be fine).
I’d count it as “humans being fully in control” if the vast majority of power controlled by independent AIs is held by AIs that were intentionally appointed by humans even though making an AI fully under their control was technically feasible with no tax. And if it were an option for humans to retain their power (as a fraction of overall human power) without having to take (from their perspective) aggressive and potentially preference-altering actions (e.g. without needing to become EMs or appoint a potentially imperfectly aligned AI successor).
In other words, I’m like “sure, there might be a bunch of complex and interesting stuff around what happens with independent AIs after we transition through having very powerful and controlled AIs (and ideally not before then), but we can figure this out then; the main question is who ends up in control of resources/power”.
I remain interested in what a detailed scenario forecast from you looks like. A big disagreement I think we have is in how society will react to various choices, and I think laying this out could make this clearer. (As far as what a scenario forecast from my perspective looks like, I think @Daniel Kokotajlo is working on one which is pretty close to my perspective and generally has the SOTA stuff here.)
I’m not entirely opposed to doing a scenario forecasting exercise, but I’m also unsure if it’s the most effective approach for clarifying our disagreements. In fact, to some extent, I see this kind of exercise—where we create detailed scenarios to illustrate potential futures—as being tied to a specific perspective on futurism that I consciously try to distance myself from.
When I think about the future, I don’t see it as a series of clear, predictable paths. Instead, I envision it as a cloud of uncertainty—a wide array of possibilities that becomes increasingly difficult to map or define the further into the future I try to look.
This is fundamentally different from the idea that the future is a singular, fixed trajectory that we can anticipate with confidence. Because of this, I find scenario forecasting less meaningful and even misleading as it extends further into the future. It risks creating the false impression that I am confident in a specific model of what is likely to happen, when in reality, I see the future as inherently uncertain and difficult to pin down.
The point of a scenario forecast (IMO) is less that you expect clear, predictable paths and more that:
Humans often do better at understanding and thinking about something if there is a specific story to discuss, and thus the tradeoffs can be worth it.
Sometimes scenario forecasting indicates a case where your previous views were missing a clearly very important consideration or were assuming something implausible.
(See also Daniel’s sibling comment.)
My biggest disagreements with you are probably a mix of:
We have disagreements about how society will react to AI (and how AI will react to society) given a realistic development arc (especially in short timelines) that imply that your vision of the future seems implausible to me. Perhaps the easiest way to get through all of these disagreements is for you to concretely describe what you expect might happen. As an example, I have a view like “it will be hard for power to very quickly transition from humans to AIs without some sort of hard takeover, especially given dynamics around alignment and training AIs on imitation (and sandbagging)”, but this is tied up with “when I think about the story for how a non-hard-takeover quick transition would go, it doesn’t seem to make sense to me”, and thus if you told the story from your perspective it would be easier to point at the disagreement in your ontology/worldview.
(Less importantly?) We have various technical disagreements about how AI takeoff and misalignment will practically work that I don’t think will be addressed by scenario forecasting. (E.g., I think software only singularity is more likely than you do, and think that worst-case scheming is more likely.)
E.g., I think software only singularity is more likely than you do, and think that worst-case scheming is more likely
By “software only singularity” do you mean a scenario where all humans are killed before singularity, a scenario where all humans merge with software (uploading) or something else entirely?
Software only singularity is a singularity driven by just AI R&D on a basically fixed hardware base. As in, can you get a singularity using only a fixed datacenter (with no additional compute over time), just by improving algorithms? See also here.
This isn’t directly talking about the outcomes from this.
You can get a singularity via hardware+software where the AIs are also accelerating the hardware supply chain such that you can use more FLOP to train AIs and you can run more copies. (Analogously to the hyperexponential progress throughout human history seemingly driven by higher population sizes, see here.)
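To make the distinction concrete, here is a minimal toy model of my own (a sketch, not something anyone in this thread has endorsed): software efficiency doubles repeatedly, the k-th doubling requires an amount of research that scales like r^-k, and research is produced at a rate proportional to software level times compute. Whether the gaps between doublings shrink toward zero (a finite-time blow-up, i.e. a singularity) or grow without bound depends on the assumed returns parameter r and on whether compute is held fixed (the software-only case) or allowed to grow (the hardware+software case). The parameter values below are purely illustrative.

```python
# Toy model (illustrative sketch only): compare a software-only singularity on a
# fixed hardware base with a hardware+software one. The k-th doubling of
# software efficiency costs r**(-k) units of research; research is produced at
# a rate proportional to software_level * compute. If the time per doubling
# shrinks toward zero, progress blows up in finite time; if it grows, it stalls.

def doubling_times(r, compute_growth=1.0, n_doublings=30):
    """Time taken by each successive doubling of software efficiency.

    r              -- returns to software R&D: the k-th doubling costs r**(-k),
                      so r > 1 means doublings get cheaper as software improves.
    compute_growth -- compute multiplier applied after each doubling
                      (1.0 = fixed datacenter, i.e. the software-only case).
    """
    times = []
    software, compute = 1.0, 1.0
    for k in range(n_doublings):
        cost = r ** (-k)              # research needed for this doubling
        rate = software * compute     # research produced per unit time
        times.append(cost / rate)
        software *= 2.0
        compute *= compute_growth
    return times

if __name__ == "__main__":
    # Fixed compute, strong returns to software R&D: doublings accelerate and
    # the total time to 30 doublings converges -- a software-only singularity.
    print(sum(doubling_times(r=1.5, compute_growth=1.0)))
    # Fixed compute, weak returns: each doubling takes ever longer -- it fizzles.
    print(doubling_times(r=0.3, compute_growth=1.0)[-1])
    # Same weak software returns, but compute also doubles each step
    # (hardware+software): acceleration resumes.
    print(sum(doubling_times(r=0.3, compute_growth=2.0)))
```

Tracking the time per doubling, rather than simulating the software level directly, keeps the arithmetic finite even in the accelerating case, which would otherwise overflow.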
I don’t think that’s a crux between us—I love scenario forecasting, but I don’t think of the future as a series of clear, predictable paths; I envision it as a wide array of uncertain possibilities that becomes increasingly difficult to map or define the further into the future I look. I definitely don’t think we can anticipate the future with confidence.