Why would you want to control a superintelligence aligned with our values? What would be the point of that?
Why would we want to allow individual humans, who are less than perfectly aligned with our values, to control a superintelligence that is perfectly aligned with our values?
A Superintelligence would be so far superior to any human or group of humans, and able to manipulate humans so well, that any “choice” humanity faces would be predetermined.
I guess the positive way to phrase this is, “FAI would create an environment where the natural results of our choices would typically be good outcomes” (typically, but not always, because a world optimized too hard for our success would not be fun).
Talking about manipulation seems to imply that FAI would trick humans into making choices against their own best interest. I don’t think that is what would typically happen.
I also see a scenario where FAI deliberately limits its own ability to predict people’s actions, out of respect for people who would be upset by the feeling that their choices are “predetermined”.
But only a faint, illusory glimmer of human choice would remain, while open-ended, agentic power over the Universe would have passed from humanity with the creation of the first Superintelligence.
Meh. I’d rather have the FAI make the big-picture decisions than some corrupt/flawed group of human officials falling prey to the usual biases in human thinking. Either way, I am not the one making the decisions, so what does it matter to me? At least the FAI would actually make good decisions.
I didn’t mean to make scenario 1 sound bad. I’m only trying to put my finger on a crux. My impression is that most prosaic alignment work has scenario 2 in mind, even though MIRI/Bostrom/LW seem to believe that scenario 1 is actually what we should be aiming for. Do prosaic alignment people think that work on human ‘control’ now will lead to scenario 1 in the long run, or do they just reject scenario 1?
I’m not sure I understand the “prosaic alignment” position well enough to answer this.
I guess, personally, I can see the appeal of scenario 2: keeping a super-optimizer under control and using it in limited ways to solve specific problems. I also find that scenario incredibly terrifying, because super-optimizers that don’t optimize for the full set of human values are dangerous.