Quoting the CEV doc:

Had grown up farther together: A model of humankind’s coherent extrapolated volition should not extrapolate the person you’d become if you made your decisions alone in a padded cell. Part of our predictable existence is that we predictably interact with other people. A dynamic for CEV must take a shot at extrapolating human interactions, not just so that the extrapolation is closer to reality, but so that the extrapolation can encapsulate memetic and social forces contributing to niceness.
Our CEV may judge some memetic dynamics as not worth extrapolating—not search out the most appealing trash-talk TV show.
Social interaction is probably intractable for real-world prediction, but no more so than individual volition. That is why I speak of predictable extrapolations, and of calculating the spread.
I don’t mean to contradict that. So consider my interpretation to be something like: build (“extrapolate”) each person’s CEV, which includes that person’s interactions with other people, but doesn’t directly value them except insofar as that person values them; then somehow merge the individual CEVs to get the group CEV.
After all (I reason) you want the following nice property for CEV. Suppose that one CEV meets another CEV, e.g. separate AIs implementing those CEVs meet. If they don’t embody inimical values, they will try to negotiate and compromise. We would like the result of those negotiations to look very much like the CEV of the combined group. One easy way to get this property is to say CEV is built on “merging” all the way from the bottom up.
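As a minimal sketch of that property, under deliberately toy assumptions (a “volition” here is just a dictionary of outcome weights, the merge is a weighted average, and none of the names or numbers come from the CEV doc): if the merge operator is aggregative in this way, merging sub-group CEVs and then merging the results agrees with merging all the individuals directly, which is the bottom-up composability being asked for.

```python
# Toy model only: a "volition" is a weight over named outcomes, and the
# merge is a weighted average. Real CEV extrapolation and merging are
# unspecified; this sketch only shows the bottom-up composition property.
from collections import Counter

def merge(volitions):
    """Merge (weight, volition) pairs into a single volition."""
    total = sum(w for w, _ in volitions)
    out = Counter()
    for w, v in volitions:
        for outcome, strength in v.items():
            out[outcome] += (w / total) * strength
    return dict(out)

alice = {"outcome A": 0.9, "outcome B": 0.1}
bob   = {"outcome A": 0.2, "outcome B": 0.8}
carol = {"outcome A": 0.5, "outcome B": 0.5}

# Merge everyone at once...
direct = merge([(1, alice), (1, bob), (1, carol)])

# ...or merge a sub-group first, then merge the results, weighting each
# sub-CEV by the number of people behind it.
pair = merge([(1, alice), (1, bob)])
grouped = merge([(2, pair), (1, carol)])

assert all(abs(direct[k] - grouped[k]) < 1e-9 for k in direct)
```

Any real merge dynamic would be far richer than an average; the point is only that whatever it is should compose this way, so that two CEV-implementing AIs negotiating ends up looking like the CEV of the union.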
More generally, all the hard work is being done here by whatever assumptions are built into the “extrapolation”.
Certainly. All discussion of CEV starts with “assume there can exist a process that produces an outcome matching the following description, and assume we can and do build it, and assume that all the under-specification of this description is improved in the way we would wish it improved if we were better at wishing”.
I basically agree with all of this, except that I think you’re saying “CEV is built on ‘merging’ all the way from the bottom up” but you aren’t really arguing for doing that.
Perhaps one important underlying question here is whether people’s values ever change contingent on their experiences.
If not—if my values are exactly the same as what they were when I first began to exist (whenever that was)—then perhaps something like what you describe makes sense. A process for working out what those values are and extrapolating my volition based on them would be difficult to build, but is coherent in principle. In fact, many such processes could exist, and they would converge on a single output specification for my individual CEV. And then, and only then, could we begin the process of “merging.”
This strikes me as pretty unlikely, but I suppose it’s possible.
OTOH, if my values are contingent on experience—that is, if human brains experience value drift—then it’s not clear that those various processes’ outputs would converge. Volition-extrapolation process 1, which includes one model of my interaction with my environment, gets Dave-CEV-1. VEP2, which includes a different model, gets Dave-CEV-2. And so forth. And there simply is no fact of the matter as to which is the “correct” Dave-CEV; they are all ways that I might turn out; to the extent that any of them reflects “what I really want”, they all reflect “what I really want”, and I “really want” various distinct and potentially inconsistent things.
In the latter case, in order to obtain something we call CEV(Dave), we need a process of “merging” the outputs of these various computations. How we do this is of course unclear, but my point is that saying “we work out individual CEVs and merge them” as though the merge step came second is importantly wrong. Merging is required to get an individual CEV in the first place.
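A sketch of that point, reusing the toy merge() above and three invented “volition-extrapolation processes” (nothing here is from the CEV doc): if each process embeds a different model of Dave’s interaction with his environment, their outputs differ, and an individual CEV is already a merge over them.

```python
# Hypothetical volition-extrapolation processes for one person. Each embeds
# a different model of that person's interaction with their environment, so
# the outputs need not converge and none is privileged as "the" correct one.
def vep_1(person):  # toy: ignores its argument
    return {"stay embodied": 0.7, "upload": 0.3}

def vep_2(person):
    return {"stay embodied": 0.4, "upload": 0.6}

def vep_3(person):
    return {"stay embodied": 0.5, "upload": 0.5}

def individual_cev(person, processes, merge):
    """An 'individual' CEV is already a merge over divergent extrapolations."""
    return merge([(1, p(person)) for p in processes])

dave_cev = individual_cev("Dave", [vep_1, vep_2, vep_3], merge)
# The first merge step happens here, before any person-to-person merging.
```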
So, yes, I agree, it’s a fine idea to have CEV built on merging all the way from the bottom up. But understanding what the “bottom” really is means giving up on the idea that my unique individual identity is the “bottom.” Whatever it is that CEV is extrapolating and merging, it isn’t people, it’s subsets of people. “Dave’s values” are no more preserved by the process than “New Jersey’s values” or “America’s values” are.
That’s a very good point. People not only change over long periods of time; during small intervals of time we can also model a person’s values as belonging to competing and sometimes negotiating agents. So you’re right, merging isn’t secondary or dispensable (not that I suggested doing away with it entirely), although we might want different merging dynamics sometimes for sub-person fragments vs. for whole-person EVs.
Sure, the specifics of the aggregation process will depend on the nature of the monads to be aggregated.
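For instance (again reusing the toy merge() above; the weighting schemes are invented purely for illustration), sub-person fragments might be weighted by how much of the person’s behavior they drive, while whole-person extrapolated volitions are weighted equally:

```python
# Two hypothetical aggregation policies for two kinds of "monad": sub-person
# fragments weighted by behavioral share, whole-person volitions weighted
# equally. Only the weights differ; the underlying merge is the same toy one.
def merge_fragments(fragments):
    """fragments: list of (behavioral_share, volition) within one person."""
    return merge(fragments)

def merge_persons(person_cevs):
    """person_cevs: one extrapolated volition per person, equal weights."""
    return merge([(1, v) for v in person_cevs])

dave  = merge_fragments([(0.6, {"upload": 0.2}), (0.4, {"upload": 0.9})])
group = merge_persons([dave, {"upload": 0.5}])
```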
And, yes, while we frequently model people (including ourselves) as unique, coherent, consistent agents, and it’s useful to do so for planning and for social purposes, there’s no clear reason to believe we’re any such thing, and I’m inclined to doubt it. This also informs the preserving-identity-across-substrates conversation we’re having elsethread.
Where relevant—or at least when I’m reminded of it—I do model myself as a collection of smaller agents. But I still call that collection “I”, even though it’s not unique, coherent, or consistent. That my identity may be a group-identity doesn’t seem to modify any of my conclusions about identity, given that to date the group has always resided together in a single brain.
For my own part, I find that attending to the fact that I am a non-unique, incoherent, and inconsistent collection of disparate agents significantly reduces how seriously I take concerns that some process might fail to properly capture the mysterious essence of “I”, leading to my putative duplicate going off and having fun in a virtual Utopia while “I” remains in a cancer-ridden body.
I would gladly be uploaded rather than die if there were no alternative. I would still pay extra for a process that slowly replaced my brain cells etc. one by one, leaving me conscious and single-instanced the whole while.
That sounds superficially like a cruel and unusual torture.
The whole point is to invent an uploading process I wouldn’t even notice happening.