Building a paperclipper is low-value (from the point of view of total utilitarianism, or any other moral view that wants a big flourishing future) because paperclips are not sentient / are not conscious / are not moral patients / are not capable of flourishing. So filling the lightcone with paperclips is low-value. It maybe has some value for the sake of the paperclipper (if the paperclipper is a moral patient, or whatever the relevant category is) but way less than the future could have.
Your counter is that maybe building an aligned AI is also low-value (from the point of view of total utilitarianism, or any other moral view that wants a big flourishing future) because humans might not much care about having a big flourishing future, or might even actively prefer things like preserving nature.
If a total utilitarian (or someone who wants a big flourishing future in our lightcone) buys your counter, it seems like the appropriate response is: Oh no! It looks like we’re heading towards a future that is many orders of magnitude worse than I hoped, whether or not we solve the alignment problem. Is there some way to get a big flourishing future? Maybe there’s something else that we need to build into our AI designs, besides “alignment”. (Perhaps mixed with some amount of: Hmmm, maybe I’m confused about morality. If AI-assisted humanity won’t want to steer towards a big flourishing future then maybe I’ve been misguided in having that aim.)
Whereas this post seems to suggest the response of: Oh well, I guess it’s a dice roll regardless of what sort of AI we build. Which is giving up awfully quickly, as if we had exhausted the design space for possible AIs and seen that there was no way to move forward with a large chance at a big flourishing future. This response also doesn’t seem very quantitative—it goes very quickly from the idea that an aligned AI might not get a big flourishing future, to the view that alignment is “neutral” as if the chances of getting a big flourishing future were identically small under both options. But the obvious question for a total utilitarian who does wind up with just 2 options, each of which is a dice roll, is Which set of dice has better odds?
Whereas this post seems to suggest the response of: Oh well, I guess it’s a dice roll regardless of what sort of AI we build. Which is giving up awfully quickly, as if we had exhausted the design space for possible AIs and seen that there was no way to move forward with a large chance at a big flourishing future.
I dispute that I’m “giving up” in any meaningful sense here. I’m happy to consider alternative proposals for how we could make the future large and flourishing from a total utilitarian perspective, rather than merely trying to solve technical alignment problems. The post itself was intended to discuss the moral implications of AI alignment (itself a massive topic), not to be an exhaustive survey of everything we could do to make the future go better. I agree we should aim high, in any case.
This response also doesn’t seem very quantitative—it goes very quickly from the idea that an aligned AI might not get a big flourishing future, to the view that alignment is “neutral” as if the chances of getting a big flourishing future were identically small under both options. But the obvious question for a total utilitarian who does wind up with just 2 options, each of which is a dice roll, is Which set of dice has better odds?
I don’t think this choice is literally a coin flip in expected value, and I agree that one might lean in one direction over the other. However, I think it’s quite hard to quantify this question meaningfully. My personal conclusion is simply that I am not swayed in any particular direction on this question; I am currently suspending judgement. I think one could still reasonably think that it’s more like a 60-40 thing than a 40-60 thing or a 50-50 coin flip. But in this case, I wanted to let my readers decide for themselves which of these numbers to take away from what I wrote, rather than trying to pin down a specific number for them.