First question: how on Earth would we go about conducting a search through possible future universes, anyway? This thought experiment still feels too abstract to make my intuitions go click, in much the same way that Christiano’s original write-up of Indirect Normativity did.
Two main reasons for this: first, there is Christiano’s original write-up, which has this problem. Second, we may be in a situation where we ask an AI to simulate the consequences of its choice, glance at the result, and then approve or disapprove. That’s less a search problem and more the original siren world problem, and we should be aware of it.
This sounds extremely counterintuitive. If I have an Oracle AI that I can trust to answer more-or-less verbal requests (defined as: any request or “program specification” too vague for me to actually formalize), why have I not simply asked it to learn, from a large corpus of cultural artifacts, the Idea of the Good, and then explain to me what it has learned (again, verbally)? If I cannot trust the Oracle AI, dear God, why am I having it explore potential eutopian future worlds for me?
Because I haven’t read Less Wrong? ^_^
This is another argument against using constrained but non-friendly AI to do stuff for us...