[Question] What would the creation of aligned AGI look like for us?
I’m having a failure of imagination here. I assume an aligned AGI would upload human minds, preventing them from dying, but beyond that it’s not clear to me what would happen. Would we just get wireheaded? Is there a CEV from human values that results in a choice, or a life, that has an analogue to a good life right now? Does anyone understand what would happen if you gave an AGI human values and told it to extrapolate from there?
I think we’re much closer to AGI than we are to being able to upload human minds.
So, although an aligned AGI would probably accelerate uploading research, I don’t think it’s guaranteed it would succeed quickly, or even at all.
This is a really good question!
Some of this is covered in the Fun Theory sequence, I believe.
In short: do you want to be wireheaded by FAI?
And if not, why would you assume it would do that?
Do you want to have no choice in the kind of life you live in utopia?
And if not, why do you assume FAI would leave you no choice?
I suppose you might think the extrapolated version of your values or of humanity’s values would want these things. But I strongly disagree. I don’t think it is the case at all.
If the utopia you are imagining sounds horrible to you, let’s not do that. Let’s figure out how to build a better utopia.
There are many possible ways to extrapolate from human values. How do we figure out which one we prefer? I’m not sure this problem has been completely solved yet. (Have you read Eliezer’s 2004 paper on CEV? It makes an attempt at solving it.)
But I do think it’s kind of obvious that most humans would prefer not to be wireheaded? That we have desires other than to be reduced to a thoughtless entity feeling nothing but pleasure? That we would, in fact, find the idea of it repulsive and horrible?
As to whether an extrapolation of human values could yield something analogous to a good life: I believe, almost certainly, that the answer is yes.
Thanks for the answer! As you suspected, I don’t think wireheading is a good thing, but after reading about infinite ethics and the repugnant conclusion I’m not entirely sure that there exists a stable, mathematically expressible form of ethics we could give to an AGI. Obviously I think it’s possible if you specify exactly what you want and tell the AGI not to extrapolate. Realistically, though, I feel it’s going to take our ethics to their logical end, and there is no ethical theory that really expresses how utility should be valued without producing paradoxes or problems we can’t solve. Unless we manage to build AGI using an evolutionary method that mimics human evolution, I believe any training or theory we give it would subtly fail.
It depends on what is possible in principle. My opinion is that it will say something like this to its owner: “In fact, the current technology level is very close to fundamental limits, so no nanorobots, no immortality, no space colonies, etc. Maybe make some money on the stock market for you?”