I think I feel a similar mix of love and frustration toward your comment as your comment expresses toward the post.
Let me be a bit theoretical for a moment. It makes sense for me to think of utilities as a sum U = a·U_a + b·U_b, where U_a is the utility of things after the singularity/superintelligence/etc. and U_b the utility of things before then (assuming both are scaled to have similar magnitudes, so the relative importance is given by the scaling factors). There's no arguing about the shape of these or about which scaling factors people choose, because there's no arguing about utility functions (although people can be really bad at actually visualizing U_a).
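To make the scaling assumption concrete, here is a minimal formal sketch in my own notation (the normalization convention is my addition, not something the comment specifies):

```latex
% Total utility split at the singularity. The components are rescaled to
% comparable magnitudes so that the weights a and b alone carry the
% relative importance of post- vs. pre-singularity outcomes.
U = a\,U_a + b\,U_b,
\qquad \text{with } \mathbb{E}\bigl[\lvert U_a \rvert\bigr] \approx \mathbb{E}\bigl[\lvert U_b \rvert\bigr].
```

On that convention, the ratio a : b is what the later talk of a ≫ b, a ≈ b, and a < b is pointing at.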
Separately from this, we have actions that look like optimizing for U_a (e.g. AI Safety research and raising awareness) and actions that look like optimizing for U_b (e.g. having kids and investing in their education). The post argues that some things that look like optimizing for U_b are actually very useful for optimizing U_a (as I understand it, mostly because AI timelines are long enough, and the optimization space muddled enough, that most people currently contribute more in expectation by maintaining and improving their general capabilities in a sustainable way).
Your comment (the pedantic-response part) talks about how optimizing for U_a is actually very useful for optimizing U_b. I'm much more sceptical of this claim, because of the expected impact per unit of effort. Consider sending your kids to college. Top US colleges seem to cost around $50k more per year than state schools, adding up to $200k for a four-year programme. The top school is maybe not several times better, as the price tag would suggest, but if your child is interested and able to get into such a school, it's probably at least 10% better (to be quite conservative). A lot of people would be extremely excited for an opportunity to lower existential risk from AI by 10% for $200k. Sure, sending your kids to college isn't everything there is to U_b, but it looks like the sign of the comparison stays the same across a couple of orders of magnitude.
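To spell out the back-of-the-envelope comparison, here is a quick sketch in Python (purely illustrative: the dollar figures and the 10% are the rough numbers from the paragraph above, not real estimates):

```python
# Rough comparison of "spend on U_b directly" vs. "spend on U_a to help U_b",
# using only the illustrative numbers from the paragraph above.

extra_cost_per_year = 50_000   # top US college vs. state school, per year (rough figure)
years = 4
extra_cost = extra_cost_per_year * years
print(f"Extra cost of the top college: ${extra_cost:,}")  # $200,000

college_gain = 0.10  # assumed: at least ~10% better outcomes for this child (conservative)
cost_per_point = extra_cost / (college_gain * 100)
print(f"Direct U_b spending: ~${cost_per_point:,.0f} per percentage point of improvement")

# For U_a-style spending to compete on this same U_b component, $200k of AI-safety
# work would have to buy a comparable reduction in existential risk (on the order of
# 10 percentage points), which is far cheaper than anyone believes is achievable.
```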
Your talk of a pendulum makes it sound like you want to create a social environment that incentivizes things that look like optimizing for U_a regardless of whether they're actually in anyone's best interest. I'm sceptical of trying to get anyone to act against their interests. Rather than make everyone signal that a ≫ b, it makes more sense to have space for people with a ≈ b, or even a < b, to optimize for their values and extract gains from trade. A successful AI Safety project probably looks a lot more like a network of very different people figuring out how to collaborate for mutual benefit than a cadre of self-sacrificing idealists.
I chose the college example because it’s especially jarring / especially disrespectful of trying to separate the world into two “pre-AGI versus post-AGI” magisteria.
A more obvious way to see that x-risk matters for ordinary day-to-day goals is that parents want their kids to have long, happy lives (and nearly all of the variance in length and happiness is, in real life, dependent on whether the AGI transition goes well or poorly). It’s not a separate goal; it’s the same goal, optimized without treating ‘AGI kills my kids’ as though it’s somehow better than ‘my kids die in a car accident’.
> Your talk of a pendulum makes it sound like you want to create a social environment that incentivizes things that look like optimizing for U_a regardless of whether they're actually in anyone's best interest.
I and my kids not being killed by AGI is in my best interest!
> A successful AI Safety project probably looks a lot more like a network of very different people figuring out how to collaborate for mutual benefit than a cadre of self-sacrificing idealists.
Not letting AGI kill me and everyone I love isn’t the “self-sacrificing” option! Allowing AGI to kill me is the “self-sacrificing” option — it is literally allowing myself to be sacrificed, albeit for ~zero gain. (Which is even worse than sacrificing yourself for a benefit!)
I’m not advocating for people to pretend they’re more altruistic than they are, and I don’t see myself as advocating against any of the concrete advice in the OP. I’m advocating for people to stop talking/thinking as though post-AGI life is a different magisterium from pre-AGI life, or as though AGI has no effect on their ability to realize the totally ordinary goals of their current life.
I think this would help with shrugging-at-xrisk psychological factors that aren’t ‘people aren’t altruistic enough’, but rather ‘people are more myopic than they wish they were’, ‘people don’t properly emotionally appreciate risks and opportunities that are novel and weird’, etc.
> I’m advocating for people to stop talking/thinking as though post-AGI life is a different magisterium from pre-AGI life
Seems undignified to pretend that it isn’t? The balance of forces that make up our world isn’t stable. One way or the other, it’s not going to last. It would certainly be nice, if someone knew how, to arrange for there to be something of human value on the other side. But it’s not a coincidence that the college example is about delaying the phase transition to the other magisterium, rather than expecting as a matter of course that people in technologically mature civilizations will be going to college, even conditional on the somewhat dubious premise that technologically mature civilizations have “people” in them.
The physical world has phase transitions, but it doesn’t have magisteria. ‘Non-overlapping magisteria’, as I’m using the term, is a question about literary genres; about which inferences are allowed to propagate or transfer; about whether a thing feels near-mode or far-mode; etc.
The idea of “going to college” post-AGI sounds silly for two distinct reasons:
1. The post-singularity world will genuinely be very different from today’s world, and institutions like college are likely to be erased or wildly transformed on relatively short timescales.
2. The post-singularity world feels like an inherently “far-mode world” where everything that happens is fantastic and large-scale; none of the humdrum minutiae of a single person’s life, ambitions, day-to-day routine, etc. This includes ‘personal goals are near, altruistic goals are far’.
1 is reasonable, but 2 is not.
The original example was about “romantic and reproductive goals”. If the AGI transition goes well, it’s true that romance and reproduction may work radically differently post-AGI, or may be replaced with something wild and weird and new.
But it doesn’t follow from this that we should think of post-AGI-ish goals as a separate magisterium from romantic and reproductive goals. Making the transition to AGI go well is still a good way to ensure romantic and reproductive success (especially qua “long-term goals/flourishing”, as described in the OP), or success on goals that end up mattering even more to you than those things, if circumstances change in such a way that there’s now some crazy, even better posthuman opportunity that you prefer even more.
(I’m assuming here that we shouldn’t optimize goals like “kids get to go to college if they want” in totally qualitatively different ways than we optimize “kids get to go to college if they want, modulo the fact that circumstances might change in ways that bring other values to the fore instead”. I’m deliberately choosing an adorably circa-2022 goal that seems especially unlikely to carry over to a crazy post-AGI world, “college”, because I think the best way to reason about a goal like that is similar to the best way to reason about other goals where it’s more uncertain whether the goal will transfer over to the new phase.)