The above is a caricature of ‘coherence’ as presented in the May 2004 document. If someone else can provide a better interpretation, that would be welcome.
That doesn’t sound like how I interpreted ‘coherent’. I assumed it meant a volition the vast majority of humanity agrees with, or a measure of how much humanity’s volitions agree. If humanity really didn’t care about death, then that would be a coherent volition. So something like ‘collective’ indeed.
As for extrapolation, it’s not intended to literally look into the future. I thought the example of the diamond in the box was fairly enlightening. The human says ‘I want box 1’, thinking box 1 contains a diamond. The AI knows the diamond is in box 2, and can extrapolate (as humans do) that the human actually wants the diamond and would ask for box 2 if they knew where the diamond was. The smart AI therefore opens box 2, and the human is happy because they have a diamond. A dumb AI would just give the human box 1 “because they asked for it”, even though that’s not what they really wanted.
When a lot of humans then say “the conquest of death is not a high priority”, the AI extrapolates that if we knew more or had basic rationality training we would say the conquest of death is a high priority, and therefore goes about solving death. At least that’s how I understood it.
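To make the contrast concrete, here is a toy sketch of the two behaviours. The world model, the function names, and the “inferred goal” input are all invented for illustration; the 2004 document doesn’t specify anything like an implementation.

```python
# Toy contrast between a 'dumb' literal AI and an extrapolating AI,
# using the diamond-in-the-box example. Everything here is invented.

world_model = {"box 1": "empty", "box 2": "diamond"}

def dumb_ai(stated_request):
    # Takes the request at face value and opens exactly the box asked for.
    return stated_request

def extrapolating_ai(stated_request, inferred_goal):
    # If it knows which action actually yields what the request was aimed at,
    # it acts on that rather than on the literal words.
    for action, outcome in world_model.items():
        if outcome == inferred_goal:
            return action
    return stated_request  # fall back to the literal request when unsure

print(dumb_ai("box 1"))                      # -> box 1 (empty)
print(extrapolating_ai("box 1", "diamond"))  # -> box 2 (what was really wanted)
```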
That is pretty much how I understood it too. It scares me. I would strongly prefer that it ask “Why not conquer death? I don’t understand”, rather than just going ahead and ignoring my stated preference. I dislike that it would substitute its judgment for mine simply because it believes it is wiser. You don’t discover the volition of mankind by ignoring what mankind tells you.
It doesn’t seem that scary to me. I don’t see it as substituting “its own judgement” for ours. It doesn’t have a judgement of its own. Rather, it believes (trivially correctly) that if we were wiser, we would be wiser than we are now. And if it can reliably figure out what a wiser version of us would say, it substitutes that person’s judgement for ours.
I suppose I imagine that if someone told me I shouldn’t try to solve death, I would direct that person to LessWrong, try to explain to them the techniques of rationality, refer them to a rationalist dojo, etc., until they’re a good enough rationalist to avoid reproducing memes they don’t really believe in, and then ask them again.
The AI, with massively greater resources, can of course simulate all this instead, saving a lot of time. And the benefit of the AI’s method is that when the “simulation” says “I wish the AI had started preventing death right away instead of waiting for me to become a rationalist”, the AI can grant this wish!
The AI doesn’t inherently know what’s good or bad. It doesn’t even know what it should be surprised by (only transhumanists seem to realise that “let’s not prevent death” shouldn’t make sense). It can only find out by asking us, and of course the right answer is more likely to be given by a “wise” person. So the best way for the AI to find out what is right or wrong is to make everyone as wise as possible, then ask them (or predict what would happen if it did).
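Here is a minimal, self-contained sketch of that “make them wiser, then ask again” loop. The numeric idealization rule is pure invention, standing in for the actual (unsolved) extrapolation step.

```python
# Toy sketch: keep making the model of the person 'wiser' until their answers
# stop changing, then ask the extrapolated person rather than the original.
# The idealization rule below is invented; the real extrapolation operator
# is exactly the hard, unsolved part of CEV.

def make_wiser(person):
    """Hypothetical idealization step: a bit more knowledge and rationality training."""
    wiser = dict(person)
    wiser["rationality"] = min(1.0, person["rationality"] + 0.1)
    # Invented toy rule: past some level of reflection, "let's not prevent
    # death" stops being endorsed.
    wiser["wants_death_solved"] = wiser["rationality"] > 0.5
    return wiser

def extrapolate_then_ask(person, question):
    model = dict(person)
    while True:
        wiser = make_wiser(model)
        if wiser == model:  # fixed point: further training changes nothing
            return model[question]
        model = wiser

alice = {"rationality": 0.2, "wants_death_solved": False}
print(extrapolate_then_ask(alice, "wants_death_solved"))  # -> True under this toy rule
```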
“What would I do if I were wiser?” may not be a meaningful question. Your current idea of wisdom is shaped by your current limitations.
At least the usual idea of wisdom is that it’s acquired through experience, and how can you know how more experience will affect you? Even your idea of wisdom formed by observing people who seem wiser than yourself is necessarily incomplete. All you can see are the effects of a process you haven’t incorporated into yourself.
And if it [FAI] can reliably figure out what a wiser version of us would say, it substitutes that person’s judgement for ours.
[...]
I would direct the person to LessWrong, [...] until they’re a good enough rationalist [...] -- then ask them again.
It seems there is a flaw in your reasoning. You would direct a person to LessWrong; someone else would direct them to church. And the FAI somehow has to figure out which direction a person should take to become wiser, without a judgment of its own.
That’s true. According to the 2004 paper, Eliezer thinks (or thought, anyway) that “what we would decide if we knew more, thought faster, were more the people we wished we were, had grown up farther together...” would do the trick. Presumably that’s the part to be hard-coded in. Or you could extrapolate (using the above) what people would say “wisdom” amounts to and use that instead.
Actually, I can’t imagine that someone who knew and understood both the methods of rationality (having been directed to LessWrong) and all the teachings of the church (having been directed to church) would then direct a person to church. Maybe the FAI could let a person take both directions to become wiser.
ETA: Of course, in FAI ‘maybe’ isn’t good enough...
I mentioned this problem already, and back in 07/2010 I thought about ways to ensure that the FAI would prefer my/our/the rational way of extrapolating.
Now I think it would be better if the FAI selected the coherent subset of the volitions of all reflectively consistent extrapolations. I suspect that subset will be something like: protect humanity from existential risk, but don’t touch it beyond that.
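A minimal sketch of what I mean by that selection, with the example volition sets entirely made up; the point is only that the intersection across very different extrapolation procedures is likely to be small and conservative.

```python
# Toy sketch: keep only the volitions that every reflectively consistent
# extrapolation agrees on. The example sets are entirely made up.

extrapolations = {
    "rationalist_training": {"prevent extinction", "solve death", "expand knowledge"},
    "religious_reflection": {"prevent extinction", "preserve tradition"},
    "status_quo":           {"prevent extinction", "leave people alone"},
}

coherent_subset = set.intersection(*extrapolations.values())
print(coherent_subset)  # -> {'prevent extinction'}: roughly 'guard against x-risk, touch nothing else'
```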
Yes. The problem is, if you look at the biggest disagreements humans have had—slavery, abortion, regional independence, whom to tax, how much the state should help people who can’t help themselves, how much clothing women should wear, whether women should work outside the home—none of them can be resolved by this method. Religion, possibly; but only to the extent that a religion’s followers care about the end goal of getting into heaven, and not to the extent that they have internalized its values.