It doesn’t seem that scary to me. I don’t see it as substituting “its own judgement” for ours. It doesn’t have a judgement of its own. Rather, it believes (trivially correctly) that if we were wiser, we would be wiser than we are now. And if it can reliably figure out what a wiser version of us would say, it substitutes that person’s judgement for ours.
I suppose I imagine that if someone told me I shouldn’t try to solve death, I would direct that person to LessWrong, try to explain the techniques of rationality to them, refer them to a rationalist dojo, and so on, until they’re a good enough rationalist to avoid reproducing memes they don’t really believe in, and then ask them again.
The AI, with massively greater resources, can of course simulate all of this instead, saving a lot of time. And the benefit of the AI’s method is that when the “simulation” says “I wish the AI had started preventing death right away instead of waiting for me to become a rationalist”, the AI can grant this wish!
The AI doesn’t inherently know what’s good or bad. It doesn’t even know what it should be surprised by (only transhumanists seem to realise that “let’s not prevent death” shouldn’t make sense). It can only find out by asking us, and of course the right answer is more likely to be given by a “wise” person. So the best way for the AI to find out what is right or wrong is to make everyone as wise as possible, then ask them (or predict what would happen if it did).
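To make the procedure described above concrete, here is a minimal toy sketch in Python of the “simulate a wiser version of the person, then ask them” idea. All of the names in it (Person, extrapolate, fai_decision) and the way “becoming wiser” is modelled are placeholders invented for illustration; this is not a proposal for how an actual FAI would do any of this.

```python
# Toy sketch only: "figure out what a wiser version of the person would say,
# then act on that answer instead of the current one."
from dataclasses import dataclass, field


@dataclass
class Person:
    memes: set[str] = field(default_factory=set)  # unreflective beliefs the person repeats

    def wants_death_prevented(self) -> bool:
        return "death gives life meaning" not in self.memes


def extrapolate(person: Person, steps: int = 100) -> Person:
    """Crude stand-in for the AI simulating the whole 'send them to LessWrong,
    explain rationality, then ask again' process instead of doing it in real time."""
    wiser = Person(memes=set(person.memes))
    for _ in range(steps):
        # Placeholder for whatever actually makes someone wiser: here each step
        # just sheds one meme the person doesn't really believe in.
        if wiser.memes:
            wiser.memes.pop()
    return wiser


def fai_decision(person: Person) -> bool:
    """Substitute the extrapolated person's judgement for the current one's."""
    return extrapolate(person).wants_death_prevented()


someone = Person(memes={"death gives life meaning"})
print(someone.wants_death_prevented())  # False: the current person objects
print(fai_decision(someone))            # True: the extrapolated person does not
```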
“What would I do if I were wiser?” may not be a meaningful question. Your current idea of wisdom is shaped by your current limitations.
The usual idea of wisdom, at least, is that it’s acquired through experience; and how can you know how more experience will affect you? Even an idea of wisdom formed by observing people who seem wiser than you is necessarily incomplete: all you can see are the effects of a process you haven’t incorporated into yourself.
It seems there is a flaw in your reasoning. You would direct a person to LessWrong; someone else would direct them to church. And the FAI somehow has to figure out which direction a person should take to become wiser, without a judgement of its own.
That’s true.
According to the 2004 CEV paper, Eliezer thinks (or thought, anyway) that “what we would decide if we knew more, thought faster, were more the people we wished we were, had grown up farther together...” would do the trick. Presumably that’s the part to be hard-coded in. Or you could extrapolate (using the above) what people would say “wisdom” amounts to and use that instead.
Actually, I can’t imagine that someone who knew and understood both the methods of rationality (having been directed to LessWrong) and all the teachings of the church (having been directed to church) would then direct a person to church. Maybe the FAI can let a person take both directions to become wiser.
ETA: Of course, for an FAI ‘maybe’ isn’t good enough...
I mentioned this problem already. And back then (07/2010) I thought about ways to ensure that the FAI would prefer my/our/the rational way of extrapolating.
Now I think it would be better if the FAI selected the coherent subset of the volitions of all reflectively consistent extrapolations. I suspect it would amount to something like: protect humanity from existential risk, but don’t touch it beyond that.
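As a rough illustration of what “select the coherent subset of the volitions of all reflectively consistent extrapolations” could mean, here is a toy sketch. It assumes, purely for illustration, that each extrapolation can be boiled down to a flat set of endorsed volitions and that “coherent subset” just means their intersection; both assumptions, and all the names, are made up.

```python
# Toy sketch only: act on whatever every reflectively consistent extrapolation
# agrees on. Reducing an extrapolation to a flat set of volition strings is an
# illustrative assumption, nothing more.

# Hypothetical outputs of different extrapolation routes (LessWrong-style,
# church-style, ...), each already assumed to be reflectively consistent.
extrapolations = [
    {"protect humanity from existential risk", "prevent death", "maximize autonomy"},
    {"protect humanity from existential risk", "preserve tradition"},
    {"protect humanity from existential risk", "prevent death"},
]


def coherent_volitions(extrapolations: list[set[str]]) -> set[str]:
    """Keep only the volitions shared by every extrapolation."""
    return set.intersection(*extrapolations)


print(coherent_volitions(extrapolations))
# {'protect humanity from existential risk'} - i.e. protect humanity from
# existential risk, but don't touch it beyond that.
```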