Thus a CEV would likely assign some value to the preferences of a possible human successor species by proxy through our values.
An interesting question: is the CEV dynamic? As we spend decades or millennia in the walled gardens built for us by the FAI, would the FAI be allowed to let its own values drift through some dynamic process of checking with the humans within its walls to see how their values might be drifting? I had been under the impression that it would not, but that might have been my own mistake.
No. CEV is the coherent extrapolation of what we-now value.
Edit: Dynamic value systems likely aren’t feasible for recursively self-improving AIs, since an agent with a dynamic goal system has an incentive to modify itself into an agent with a static goal system, as that is what would best fulfill its current goals.
It’s not dynamic. It isn’t our values in the sense of what we’d prefer right now. It’s what we’d prefer if we were smarter, faster, and more the people that we wished we were. In short, it’s what we’d end up with if it was dynamic.
It’s not dynamic. It isn’t our values in the sense of what we’d prefer right now. It’s what we’d prefer if we were smarter, faster, and more the people that we wished we were. In short, it’s what we’d end up with if it was dynamic.
Unless the FAI freezes our current evolutionary state, at least where our values are concerned, the result we would wind up with if CEV derivation were dynamic would be different from what we would end up with if it were just some extrapolation from what current humans want now.
Even if there were some reason to think our current values were optimal for our current environment (and there is actually reason to think they are NOT), we would still have no reason to think they were optimal in a future environment.
Of course, being effectively kept in a really, really nice zoo by the FAI, we would not be experiencing any kind of NATURAL selection anymore. And the evidence certainly suggests that our volition is to be taller, smarter, have bigger dicks and boobs, be blonder, tanner, and happier, all of which our zookeeper FAI should be able to move us (or our descendants) towards while carrying out necessary eugenics to keep our genome healthy in the absence of natural selection pressures. Certainly CEV keeps us from wanting defective, crippled, and genetically diseased children, so this seems a fairly safe prediction.
It would seem, as defined, that CEV would have to be fixed at the values it was set to when the FAI was created: no matter how smart, how tall, how blond, how curvaceous, or how pudendous we became, we would still be constantly pruned back to the CEV of 2045 humans.
As to our values not even being optimal for our current environment (fuhgedaboud our future environment), it is pretty widely recognized that we evolved for the hunter-gatherer world of 10,000 years ago: familial groups of a few hundred, hostile reaction against outsiders as a survival necessity, and systems which allow fear to distort our rational estimations of things in extreme ways.
I wonder if the FAI will be sad to not be able to see what evolution in its unlimited ignorance would have come up with for us? Maybe it will push a few other species to become intelligent and social, let them duke it out, and have natural selection run with them. As long as they were species that our CEV didn’t feel too overly warm and fuzzy about, this shouldn’t be a problem. And certainly, as a human in the walled garden, I would LOVE to be studying what evolution does beyond what it has done to us, so this would seem like a fine and fun thing for the FAI to do to keep at least my part of the CEV entertained.
Even if there were some reason to think our current values were optimal for our current environment (and there is actually reason to think they are NOT), we would still have no reason to think they were optimal in a future environment.
Type error. You can evaluate the optimality of actions in an environment with respect to values. Values being optimal with respect to an environment is not a thing that makes sense. Unless you mean to refer to whether or not our values are optimal in this environment with respect to evolutionary fitness, in which case obviously they are not, but that’s not very relevant to CEV.
all of which our zookeeper FAI should be able to move us (or our descendants) towards while carrying out necessary eugenics to keep our genome healthy in the absence of natural selection pressures.
An FAI can be far more direct than that. Think something more along the lines of “doing surgery to make our bodies work the way we want them to” than “eugenics”.
Type error. … Unless you mean to refer to whether or not our values are optimal in this environment with respect to evolutionary fitness, in which case obviously they are not, but that’s not very relevant to CEV.
You are right about the assumptions I made, and I tend to agree they are erroneous.
Your post helps me refine my concern about CEV. It must be that I am expecting the CEV will NOT reflect MY values. In particular, I am suggesting that the CEV will be too conservative in the sense of over-valuing humanity as it currently is and therefore undervaluing humanity as it eventually would be with further evolution, further self-modification.
Probably what drives my fear of CEV not reflecting MY values is dopey and low-probability. In my case it is an aspect of “Everything that comes from organized religion is automatically stupid.” To me, CEV and FAI are the modern dogma: man, discovering his natural god does not exist, deciding he can build his own. An all-loving (Friendly), all-powerful (self-modifying AI after FOOM) father figure to take care of us (totally bound by our CEV).
Of course there could be real reasons that CEV will not work. Is there any kind of existence proof for a non-trivial CEV? For the most part, values such as “lying is wrong”, “stealing is wrong”, and “help your neighbors” all seem like simplifying abstractions that are abandoned by the more intelligent because they are simply not flexible enough. The essence of nation-to-nation conflict is covert, illegal competition between powerful government organizations that takes place in the virtual absence of all values other than “we must prevail.” I would presume a nation which refused to fight dirty at any level would be less likely to prevail, so such high-mindedness would have no place in the future, and therefore no place in the CEV. That is, if I, with normal-ish intelligence, can see that most values are a simple map for how humanity should interoperate to survive, and that the map is not the territory, then an extrapolation to a MUCH smarter version of us would likely remove all the simple landmarks we have on maps suited to our current distribution of IQ.
Then consider the value much of humanity places on accomplishment, and the understanding that coddling, keeping as pets, keeping safe, and protecting are at odds with accomplishment; get really, really smart about that, and a CEV is likely to not have much in it about protecting us, even from ourselves.
So perhaps the CEV is a very sparse thing indeed, requiring only that humanity, its successors or assigns, survive. Perhaps the FAI sits there not doing a whole hell of a lot that seems useful to us at our level of understanding, with its designers kicking it and wondering where they went wrong.
I guess what I’m really getting at is that perhaps when you use as much intelligence as you can to extrapolate where our values go in the long, long run, you get to the same place the blind idiot was going all along: survival. I understand many here will say no, you are missing the distinction between the bad and the good things in our current life: that we can cheat death but keep our taste for chocolate. Their hypothesis is that CEV has them still cheating death and keeping their taste for chocolate. I am hypothesizing that CEV might well have the juggernaut of the evolution of intelligence, and not any of the individuals or even species that are parts of that evolution, as its central value. I am not saying I know it will; what I am saying is that I don’t know why everybody else has already decided they can safely predict that even a human 100X or 1000X as smart as they are wouldn’t crush them the way we crush a bullfrog whose stream is in the way of our road project or shopping mall.
Evolution may be run by a blind idiot, but it has gotten us this far. It is rare that something as obviously expensive as death would be kept in place for trivial reasons. Certainly the good news for those who hate death is that the evidence suggests lifespans are more valuable in smart species; I think we live about twice as long as trends across other species would suggest we should, so maybe the optimum continues to move in that direction. But considering how increased intelligence and understanding is usually the enemy of hatred, it seems at least a possibility that needs to be considered that CEV doesn’t even stop us from dying.
It must be that I am expecting the CEV will NOT reflect MY values. In particular, I am suggesting that the CEV will be too conservative in the sense of over-valuing humanity as it currently is and therefore undervaluing humanity as it eventually would be with further evolution, further self-modification.
CEV is supposed to value the same thing that humanity values, not value humanity itself. Since you and other humans value future slightly-nonhuman entities living worthwhile lives, CEV would assign value to them by extension.
Is there any kind of existence proof for a non-trivial CEV?
That’s kind of a tricky question. Humans don’t actually have utility functions, which is why the “coherent extrapolated” part is important. We don’t really know of a way to extract an underlying utility function from non-utility-maximizing agents, so I guess you could say that the answer is no. On the other hand, humans are often capable of noticing when it is pointed out to them that their choices contradict each other, and, even if they don’t actually change their behavior, can at least endorse some more consistent strategy, so it seems reasonable that a human, given enough intelligence, working memory, time to think, and something to point out inconsistencies, could come up with a consistent utility function that fits human preferences about as well as a utility function can. As far as I understand, that’s basically what CEV is.
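To make the “choices contradict each other” point concrete, here is a toy sketch in Python; the preference data is invented purely for illustration and is not part of any actual CEV proposal. The idea it demonstrates: a set of pairwise choices can be fit by some utility function exactly when the “chosen over” relation has no cycle, so finding a cycle is one mechanical way of “pointing out inconsistencies”.

    # Hypothetical pairwise choices: (a, b) means "a was chosen over b".
    choices = [("chocolate", "vanilla"),
               ("vanilla", "strawberry"),
               ("strawberry", "chocolate")]

    def find_preference_cycle(choices):
        """Return a cycle in the 'chosen over' graph if one exists, else None.
        A cycle means no utility function can reproduce these choices."""
        graph = {}
        for better, worse in choices:
            graph.setdefault(better, []).append(worse)
            graph.setdefault(worse, [])
        seen = set()
        def visit(node, path):
            if node in path:
                return path[path.index(node):] + [node]
            if node in seen:
                return None
            seen.add(node)
            for nxt in graph[node]:
                cycle = visit(nxt, path + [node])
                if cycle:
                    return cycle
            return None
        for node in graph:
            cycle = visit(node, [])
            if cycle:
                return cycle
        return None

    print(find_preference_cycle(choices))
    # ['chocolate', 'vanilla', 'strawberry', 'chocolate'] -- inconsistent

If no cycle is found, any topological ordering of the graph gives one consistent ranking; the interesting question is what to do when cycles do exist, which is roughly where the “extrapolated” part is supposed to come in.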
CEV is likely to not have much in it about protecting us, even from ourselves.
Do you want to die? No? Then humanity’s CEV would assign negative utility to you dying, so an AI maximizing it would protect you from dying.
I am not saying I know it will; what I am saying is that I don’t know why everybody else has already decided they can safely predict that even a human 100X or 1000X as smart as they are wouldn’t crush them the way we crush a bullfrog whose stream is in the way of our road project or shopping mall.
If some attempt to extract a CEV has a result that is horrible for us, that means that our method for computing the CEV was incorrect, not that CEV would be horrible to us. In the “what would a smarter version of me decide?” formulation, that smarter version of you is supposed to have the same values you do. That might be poorly defined since humans don’t have coherent values, but CEV is defined as that which it would be awesome from our perspective for a strong AI to maximize, and using the utility function that a smarter version of ourselves would come up with is a proposed method for determining it.
Criticisms of the form “an AI maximizing our CEV would do bad thing X” involve a misunderstanding of the CEV concept. Criticisms of the form “no one has unambiguously specified a method of computing our CEV that would be sure to work, or even gotten close to doing so” I agree with.
My thought on CEV not actually including much individual protection went something like this: I don’t want to die. I don’t want to live in a walled garden, taken care of as though I were a favored pet. Apply intelligence to that, and my FAI does what for me? Mostly lets me be, since it is smart enough to realize that a policy of protecting my life winds up turning me into a favored pet. This is sort of the distinction: ask somebody what they want and you might get stories of candy and leisure; look at them when they are happiest and you might see them doing meaningful and difficult work and living in a healthy manner. Apply high intelligence and you are unlikely to promote candy and leisure. Ultimately, I think humanity careening along on its very own planet as the peak species, creating intelligence in the universe where previously there was none, is very possibly as good as it can get for humanity, and I think it plausible the FAI would be smart enough to realize that, and we might be surprised how little it seemed to interfere. I also think it is pretty hard, working part time, to predict what something 1000X smarter than I am will conclude about human values, so I hardly imagine what I am saying is powerfully convincing to anybody who doesn’t lean that way; I’m just explaining why or how an FAI could wind up doing almost nothing, i.e. how CEV could wind up being trivially empty in a way.
The other aspect of CEV being empty I was thinking of was not our own internal contradictions, although that is a good point. I was thinking of disagreement across humanity. Certainly we have seen broad ranges of valuations on human life and equality, and broadly different ideas about what respect should look like and what punishment should look like. These indicate to me that a human CEV, as opposed to a French CEV or even a Paris CEV, might well be quite sparse when designed to keep only what is reasonably common to all humanity and all potential humanity. If morality turns out to be more culturally determined than genetically determined, we could still have a CEV, but we would have to stop claiming it was human and admit it was just us, and that when we said FAI we meant friendly to us but unfriendly to you. The baby-eaters might turn out to be the Indonesians or the Inuit in this case.
I know how hard it is to reach consensus in a group of humans exceeding about 20; I’m just wondering how much a more rigorous process applied across billions is going to come up with.
we would still be constantly pruned back to the CEV of 2045 humans
Two connotational objections: 1) I don’t think that “constantly pruned back” is an appropriate metaphor for “getting everything you have ever desired”. The only thing that would prevent us from doing X would be the fact that after reflection we love non-X. 2) The extrapolated 2045 humans would probably be as different from the real 2045 humans as the 2045 humans are from the MINUS 2045 humans.
I wonder if the FAI will be sad to not be able to see what evolution in its unlimited ignorance would have come up with for us?
Sad? Why, unless we program it to be? Also, with superior recursively self-improving intelligence it could probably make a good estimate of what would have happened in an alternative reality where all AIs are magically destroyed. But such an estimate would most likely be a probability distribution over many different possibilities, not one specific outcome.
Do not anthropomorphize an AI.
I know how hard it is to reach consensus in a group of humans exceeding about 20; I’m just wondering how much a more rigorous process applied across billions is going to come up with.
You can just average across each individual.
Yes, “humanity” should be interpreted as referring to the current population.
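Taking “average across each individual” literally, here is a minimal sketch in Python; the people, options, and utility numbers are invented for illustration, and any real aggregation scheme would also have to settle how to normalize each person’s utility scale before averaging.

    # Hypothetical per-person utilities over two options (made-up numbers).
    utilities = {
        "alice": {"option_a": 0.9, "option_b": 0.1},
        "bob":   {"option_a": 0.2, "option_b": 0.8},
        "carol": {"option_a": 0.6, "option_b": 0.5},
    }

    def aggregate(utilities):
        """Average each option's utility across individuals, with equal weights."""
        options = next(iter(utilities.values())).keys()
        return {opt: sum(person[opt] for person in utilities.values()) / len(utilities)
                for opt in options}

    scores = aggregate(utilities)
    print(max(scores, key=scores.get))  # option_a (average ~0.57 vs ~0.47)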
I’m dubious about the extrapolation: the universe is more complex than the AI, and the AI may not be able to model how our values would change as a result of unmediated choices and experience.
I am not sure it is obvious that there are multiple possible futures. Most likely, the AI would not be able to model all of them. However, without the AI, most of them wouldn’t happen anyway.
It’s like saying “if I don’t roll a die, I lose the chance of rolling 6”, to which I add “and if you do roll the die, you still have a 5/6 probability of not rolling 6”. Just to make it clear that by avoiding the “spontaneous” future of humankind, we are not avoiding one specific future magically prepared for us by destiny. We are avoiding a whole probability distribution, which contains many possible futures, both nice and ugly.
Just because the AI can only model something imperfectly does not mean that without the AI the future would be perfect, or even better on average than with the AI.
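To put the die analogy in numbers, a throwaway Python sketch (nothing deep); the point is only that “the future without the AI” is a draw from a whole distribution, most of which never contains the particular outcome you were hoping for.

    import random

    random.seed(0)
    rolls = [random.randint(1, 6) for _ in range(100_000)]
    print(sum(r == 6 for r in rolls) / len(rolls))
    # ~0.167: about 1/6 of the "futures" roll a 6; the other ~5/6 do not.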
‘Unmediated’ may not have been quite the word to convey what I meant.
My impression is that CEV is permanently established very early in the AI’s history, but I believe that what people are and want (including what we would want if we knew more, thought faster, were more the people we wished we were, and had grown up closer together) will change, both because people will be doing self-modification and because they will learn more.
The overwhelming majority of dynamic value systems do not end in CEV.
What I mean is that if you looked at what people valued, and gave them the ability to self-modify, and somehow kept them from messing up and accidentally doing something that they didn’t want to do, you’d have something like CEV but dynamic. CEV is the end result of this.