EY says it best in The Sheer Folly of Callow Youth, but essentially EY once thought, “If there is truly such a thing as moral value, then a superintelligence will likely discover what the correct moral values are. If there is no such thing as moral value, then the current reality is no more valuable than the reality where I make an AI that kills everyone. Therefore, I should strive to make an AI regardless of ethical problems.”
Then in the early 2000s he had an epiphany. His objection worked by disproving the first premise, that a superintelligence would automatically do the ‘right’ thing in a universe with objective ethics. You could build an AI ‘foo’ which was a superintelligence, and an AI ‘bar’ which was identical to ‘foo’ except for a little gnome sitting at the very beginning of the decision algorithm who flipped every goal from “maximize value” to “minimize value”. Since ‘bar’ is every bit as superintelligent as ‘foo’, two superintelligences can pursue completely different ends; intelligence alone does not determine goals, so an AI must be deliberately built as a Friendly AI in order to do the ‘right’ thing. This is when he realized how close he had come to perhaps causing an extinction event, and how important the FAI project was. (It was also when he coined the term FAI in the first place.)