No one wants to be turned into paperclips. Or uploaded and copied into millions of deathless ems, to do rote computations at the wrapper-mind’s behest forever, or to act out roles in some strange hell.
This is a tangent (I have nothing to add to the “are wrapper-minds inevitable” discussion), but this quote raises a separate question that I have been concerned about and have not really seen discussed directly, and I would greatly appreciate other opinions on it.
My reaction to the above quote is essentially that, yes, obviously I don’t want to be turned into paperclips. That is zero utility forever.
But I much more strongly do not want to be tortured forever.
What frustrates me greatly is that when people talk about the worst possible future in the event of an unfriendly AI, they usually talk about (something along the lines of) the universe being converted into paperclips.
And that is so, so, so far away from being the worst possible future. That is a future with exactly zero utility: there are no positive experiences, but there is no suffering either.
The worst possible future is a future where an enormous number of minds are tortured for eternity. And there is an infinite amount of space for futures that are not that bad, but are still vastly worse than a universe of zero utility. And an unfriendly AI would have the power to create any of those universes that it wants (whether it would want to is, of course, a separate question).
So—I am not an expert, but it is unclear to me why no one appears to take [the risk that the future universe is far worse than just a paperclip universe] seriously. The most logical explanation is that experts are very, very confident that an unfriendly AI would not create such a universe. But if so, I am not sure why that is the case, and I cannot remember ever seeing anyone explicitly make that claim or any argument for it. If someone wants to make this argument or has a relevant link, I would love to see it.
The keywords for this concern are s-risk / astronomical suffering. I think this is unlikely, since a wrapper-mind that deliberately pursued suffering would thereby have to care about human-specific concerns, which requires alignment-structure. So more likely this either isn’t systematically pursued (other than as incidental mindcrime, which allows suffering but doesn’t optimize for it), or we get full alignment (for whatever reason).
One line of reasoning is as follows:

1. We don’t know what goal(s) the AGI will ultimately have. (We can’t reliably ensure what those goals are.)
2. There is no particular reason to believe it will have any particular goal.
3. Looking at all the possible goals that it might have, goals of explicitly benefiting or harming human beings are not particularly likely.
4. On the other hand, because human beings use resources which the AGI might want to use for its own goals and/or might pose a threat to the AGI (by, e.g., creating other AGIs), there are reasons why an AGI not dedicated to harming or benefiting humanity might destroy humanity anyway. (This is an example or corollary of “instrumental convergence”.)
5. Because of 3, minds being tortured for eternity is highly unlikely.
6. Because of 4, humanity being ended in the service of some alien goal which has zero utility from the perspective of humanity is far more likely.