Type error. … Unless you mean to refer to whether or not our values are optimal in this environment with respect to evolutionary fitness, in which case obviously they are not, but that’s not very relevant to CEV.
You are right about the assumptions I made, and I tend to agree they were erroneous.
Your post helps me refine my concern about CEV. It must be that I am expecting the CEV will NOT reflect MY values. In particular, I am suggesting that the CEV will be too conservative, in the sense of over-valuing humanity as it currently is and therefore under-valuing humanity as it eventually would be with further evolution and further self-modification.
Probably what drives my fear of CEV not reflecting MY values is dopey and low-probability. In my case it is an aspect of “Everything that comes from organized religion is automatically stupid.” To me, CEV and FAI are the modern dogma: man discovering his natural god does not exist, but deciding he can build his own. An all-loving (Friendly), all-powerful (self-modifying AI after FOOM) father-figure to take care of us (totally bound by our CEV).
Of course there could be real reasons that CEV will not work. Is there any kind of existence proof for a non-trivial CEV? For the most part, values such as “lying is wrong,” “stealing is wrong,” and “help your neighbors” all seem like simplifying abstractions that are abandoned by the more intelligent because they are simply not flexible enough. The essence of nation-to-nation conflict is covert, illegal competition between powerful government organizations that takes place in the virtual absence of any value other than “we must prevail.” I would presume a nation which refused to fight dirty at any level would be less likely to prevail, so such high-mindedness would have no place in the future, and therefore no place in the CEV. That is, if I with normal-ish intelligence can see that most values are a simple map for how humanity should interoperate in order to survive, and that the map is not the territory, then an extrapolation to versions of us who were MUCH smarter would likely remove all the simple landmarks on the maps suited to our current distribution of IQ.
Then consider the value much of humanity places on accomplishment, and the understanding that coddling, keeping as pets, keeping safe, and protecting are at odds with accomplishment. Get really, really smart about that, and a CEV is likely to not have much in it about protecting us, even from ourselves.
So perhaps the CEV is a very sparse thing indeed, requiring only that humanity, its successors or assigns, survive. Perhaps FAI sits there not doing a whole hell of a lot that seems useful to us at our level of understanding, with its designers kicking it wondering where they went wrong.
I guess what I’m really getting at is that perhaps, when you use as much intelligence as you can to extrapolate where our values go in the long, long run, you get to the same place the blind idiot was going all along: survival. I understand many here will say no, you are missing the distinction between the bad and good things in our current life, how we can cheat death but keep our taste for chocolate. Their hypothesis is that CEV has them still cheating death and keeping their taste for chocolate. I am hypothesizing that CEV might well have the juggernaut of the evolution of intelligence, and not any of the individuals or even species that are parts of that evolution, as its central value. I am not saying I know it will; what I am saying is that I don’t know why everybody else has already decided they can safely predict that even a human 100X or 1000X as smart as they are doesn’t crush them the way we crush a bullfrog when his stream is in the way of our road project or shopping mall.
Evolution may be run by a blind idiot, but it has gotten us this far. It is rare that something as obviously expensive as death would be kept in place for trivial reasons. Certainly the good news for those who hate death is that the evidence suggests lifespans are more valuable in smart species; I think we live about twice as long as trends across other species would suggest we should, so maybe the optimum continues to move in that direction. But considering how increased intelligence and understanding are usually the enemy of hatred, it seems at least a possibility that needs to be considered that CEV doesn’t even stop us from dying.
It must be that I am expecting the CEV will NOT reflect MY values. In particular, I am suggesting that the CEV will be too conservative, in the sense of over-valuing humanity as it currently is and therefore under-valuing humanity as it eventually would be with further evolution and further self-modification.
CEV is supposed to value the same thing that humanity values, not value humanity itself. Since you and other humans value future slightly-nonhuman entities living worthwhile lives, CEV would assign value to them by extension.
Is there any kind of existence proof for a non-trivial CEV?
That’s kind of a tricky question. Humans don’t actually have utility functions, which is why the “coherent extrapolated” part is important. We don’t really know of a way to extract an underlying utility function from non-utility-maximizing agents, so I guess you could say that the answer is no. On the other hand, humans are often capable of noticing when it is pointed out to them that their choices contradict each other, and, even if they don’t actually change their behavior, can at least endorse some more consistent strategy, so it seems reasonable that a human, given enough intelligence, working memory, time to think, and something to point out inconsistencies, could come up with a consistent utility function that fits human preferences about as well as a utility function can. As far as I understand, that’s basically what CEV is.
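As a purely illustrative toy sketch (not any official CEV procedure), here is what “noticing when choices contradict each other” can look like mechanically: given an agent’s stated pairwise preferences, a checker can surface intransitive cycles that a more reflective version of that person could then be asked to resolve. The option names and preferences below are invented for the example.

```python
# Toy sketch only: detect preference cycles (intransitivity) in stated pairwise
# preferences. All option names and preferences are hypothetical.
from itertools import permutations

# Hypothetical stated preferences: (preferred, dispreferred) pairs.
stated_preferences = [
    ("security", "adventure"),
    ("adventure", "leisure"),
    ("leisure", "security"),   # closes a cycle: the choices contradict each other
]

def find_preference_cycles(prefs):
    """Return three-option cycles implied by the stated pairwise preferences."""
    better_than = set(prefs)
    options = {x for pair in prefs for x in pair}
    cycles = []
    seen = set()
    for a, b, c in permutations(options, 3):
        if (a, b) in better_than and (b, c) in better_than and (c, a) in better_than:
            key = frozenset((a, b, c))
            if key not in seen:  # avoid reporting the same cycle once per rotation
                seen.add(key)
                cycles.append((a, b, c))
    return cycles

if __name__ == "__main__":
    for cycle in find_preference_cycles(stated_preferences):
        # A person shown this can notice the contradiction and endorse a more
        # consistent ranking, which is roughly the reflective step described above.
        print("Inconsistent preference cycle:", " > ".join(cycle), ">", cycle[0])
```

This only illustrates the “notice inconsistencies” step; actually extracting a full utility function from a non-utility-maximizing agent remains the open problem described above.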
CEV is likely to not have much in it about protecting us, even from ourselves.
Do you want to die? No? Then humanity’s CEV would assign negative utility to you dying, so an AI maximizing it would protect you from dying.
I am not saying I know it will; what I am saying is that I don’t know why everybody else has already decided they can safely predict that even a human 100X or 1000X as smart as they are doesn’t crush them the way we crush a bullfrog when his stream is in the way of our road project or shopping mall.
If some attempt to extract a CEV has a result that is horrible for us, that means that our method for computing the CEV was incorrect, not that CEV would be horrible to us. In the “what would a smarter version of me decide?” formulation, that smarter version of you is supposed to have the same values you do. That might be poorly defined since humans don’t have coherent values, but CEV is defined as that which it would be awesome from our perspective for a strong AI to maximize, and using the utility function that a smarter version of ourselves would come up with is a proposed method for determining it.
Criticisms of the form “an AI maximizing our CEV would do bad thing X” involve a misunderstanding of the CEV concept. Criticisms of the form “no one has unambiguously specified a method of computing our CEV that would be sure to work, or even gotten close to doing so” I agree with.
My thought on CEV not actually including much individual protection followed something like this: I don’t want to die. I don’t want to live in a walled garden, taken care of as though I were a favored pet. Apply intelligence to that, and my FAI does what for me? Mostly it lets me be, since it is smart enough to realize that a policy of protecting my life winds up turning me into a favored pet. This is sort of the distinction: ask someone what they want and you might get stories of candy and leisure; look at them when they are happiest and you might see that it is when they are doing meaningful and difficult work and living in a healthy manner. Apply high intelligence and you are unlikely to promote candy and leisure. Ultimately, I think humanity careening along on its very own planet as the peak species, creating intelligence in the universe where previously there was none, is very possibly as good as it can get for humanity, and I think it plausible an FAI would be smart enough to realize that, and we might be surprised how little it seemed to interfere. I also think it is pretty hard, working part-time, to predict what something 1000X smarter than I am will conclude about human values, so I hardly imagine what I am saying is powerfully convincing to anybody who doesn’t already lean that way; I’m just explaining why or how an FAI could wind up doing almost nothing, i.e. how CEV could wind up being trivially empty in a way.
The other aspect of CEV being empty: I was not thinking of our own internal contradictions, although that is a good point. I was thinking of disagreement across humanity. Certainly we have seen broad ranges of valuations on human life and equality, and broadly different ideas about what respect should look like and what punishment should look like. These indicate to me that a human CEV, as opposed to a French CEV or even a Paris CEV, might well be quite sparse when designed to keep only what is reasonably common to all humanity and all potential humanity. If morality turns out to be more culturally determined than genetic, we could still have a CEV, but we would have to stop claiming it was human and admit it was just us, and that when we said FAI we meant friendly to us but unfriendly to you. The baby-eaters might turn out to be the Indonesians or the Inuit in this case.
I know how hard it is to reach consensus in a group of humans exceeding about 20; I’m just wondering how much a more rigorous process applied across billions is going to come up with.
You can just average across each individual.
Yes, “humanity” should be interpreted as referring to the current population.
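To make “just average across each individual” concrete as a toy sketch: assume, purely for illustration, that each person in the current population already has extrapolated preferences reduced to normalized utility scores over a few candidate outcomes. Aggregation could then be a straight average per outcome. The outcome names and numbers below are invented; nothing here is an actual CEV construction.

```python
# Toy sketch only: average normalized individual utilities across a (tiny,
# hypothetical) population. Outcomes and scores are invented for illustration.
from statistics import mean

# Hypothetical normalized utilities (0..1) over candidate outcomes, one dict
# per person in the current population being aggregated.
individual_utilities = [
    {"protect_individuals": 0.9, "hands_off": 0.2},
    {"protect_individuals": 0.4, "hands_off": 0.8},
    {"protect_individuals": 0.7, "hands_off": 0.5},
]

def averaged_utility(population, outcome):
    """Average a single outcome's utility over everyone in the population."""
    return mean(person[outcome] for person in population)

if __name__ == "__main__":
    # The aggregate a hypothetical maximizer would use under this toy proposal.
    for outcome in individual_utilities[0]:
        print(outcome, round(averaged_utility(individual_utilities, outcome), 3))
```

Of course, the hard part this glosses over is how the per-person extrapolated utilities and the common normalization would be obtained in the first place, which is exactly the unresolved step discussed above.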