I don’t understand the motivation for defining “okay” as 20% max value. The cosmic endowment, and the space of things that could be done with it, is very large compared to anything we can imagine. If we’re going to be talking about a subjective “okay” standard, what makes 20% okay, but 0.00002% not-okay?
I would expect 0.00002% (e.g., in scenarios where AI “‘pension[s] us off,’ giv[ing] us [a percentage] in exchange for being parents and tak[ing] the rest of the galaxy for verself”, as mentioned in “Creating Friendly AI” (2001)) to subjectively feel great. (To be clear, I understand that there are reasons to not expect to get a pension.)
If we’re going to be talking about a subjective “okay” standard, what makes 20% okay, but 0.00002% not-okay?
Scale sensitivity.
From our perspective today, 20% max value and 0.00002% max value both emotionally mean “infinity”, so they feel like the same thing. Once we actually get to the 0.00002% max value, the difference between “all that we can ever have” and “we could have had a million times more” will feel different.
(Intuition: How would you feel if you found out that your life could have been literally a million times better, but someone decided for you that both options are good enough, so it makes no sense to fret about the difference?)
Counter-intuition: if I’m playing Russian roulette while holding a lottery ticket in my other hand, then staying alive but not winning the lottery is an “okay” outcome.
Believing that ‘a perfected human civilization spanning hundreds of galaxies’ is a loss condition of AI, rather than a win condition, is not entirely obviously wrong, but certainly doesn’t seem obviously right.
And if you argue ‘AI is extraordinarily likely to lead to a bad outcome for humans’ while including ‘hundreds of galaxies of humans’ as a ‘bad outcome’, that seems fairly disingenuous.
In economics, “we can model utility as logarithmic in wealth”, even after adding human capital to wealth, feels like a silly asymptotic approximation that obviously breaks down in the other direction as wealth goes to zero and modeled utility to negative infinity.
In cosmology, though, the difference between “humanity only gets a millionth of its light cone” and “humanity goes extinct” actually does feel bigger than the difference between “humanity only gets a millionth of its light cone” and “humanity gets a fifth of its light cone”; not infinitely bigger, but a lot more than you’d expect by modeling marginal utility as a constant as wealth goes to zero.
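The contrast between the two utility models above can be made concrete with a little arithmetic. A minimal sketch, assuming (purely for illustration; neither commenter endorsed these exact functions) linear utility and logarithmic utility over the fraction of the light cone humanity keeps:

```python
import math

# Fractions of the light cone from the discussion above:
# extinction (0%), the "pension" outcome (0.00002%), the "okay" outcome (20%).
extinction, pension, okay = 0.0, 2e-7, 0.2

def linear_u(f):
    """Illustrative linear utility: marginal utility constant as wealth goes to zero."""
    return f

def log_u(f):
    """Illustrative log utility: diverges to -inf as wealth goes to zero."""
    return math.log(f) if f > 0 else float("-inf")

# Linear model: the pension-vs-okay gap dwarfs the extinction-vs-pension gap.
lin_gap_low  = linear_u(pension) - linear_u(extinction)  # 2e-7
lin_gap_high = linear_u(okay)    - linear_u(pension)     # ~0.2, a million times larger

# Log model: the extinction-vs-pension gap is literally infinite,
# while pension-vs-okay is only ln(0.2 / 2e-7) = ln(1e6) ≈ 13.8.
log_gap_low  = log_u(pension) - log_u(extinction)        # +inf
log_gap_high = log_u(okay)    - log_u(pension)           # ≈ 13.8
```

The intuition voiced above sits between these two extremes: extinction-vs-pension feels much bigger than the linear model says, but not infinitely bigger as the log model insists.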
This is all subjective; others’ feelings may differ.
(I’m also open in theory to valuing an appropriately-complete successor to humanity equally to humanity 1.0, whether the successor is carbon or silicon or whatever, but I don’t see how “appropriately-complete” is likely so I’m ignoring the possibility above.)
Arbitrary and personal. Given how bad things presently look, over 20% is about the level where I’m like “Yeah, okay, I will grab for that,” and much under 20% is where I’m like “Not okay, keep looking.”
I think this depends on whether one takes an egoistic or even person-affecting perspective (“how will current humans feel about this when it happens?”) or a welfare-maximising consequentialist perspective (“how does this look from the view from nowhere?”): if one assumes aggregate welfare to be linear or near-linear in the number of galaxies controlled, the 0.00002% outcome is far, far worse than the 20% outcome, even though I personally would still be happy with the former.