What if we used a two-tiered CEV? A CEV applied to a small, hand-selected group of moral philosophers could be used to determine weighting rules and ad hoc exceptions to the CEV that runs on all of humanity to determine the utility function of the FAI.
Then when the CEV encounters the trillion Dr. Evil uploads, it will consult how the group of moral philosophers would have wanted it handled if “they knew more, thought faster, were more the people we wished we were, had grown up farther together”, which would be something like weighting them together as one person.
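To make “weight them together as one person” concrete, here is a minimal Python sketch of one possible aggregation rule, assuming each upload can be tagged with the original person it derives from; the origin tags, the toy utility functions, and the candidate outcomes are illustrative assumptions, not part of any actual CEV specification.

```python
from collections import defaultdict

def aggregate_utilities(subjects, outcomes):
    """subjects: list of (origin_id, utility_fn) pairs, where utility_fn maps
    an outcome to a float. Every upload or copy of the same original person
    shares an origin_id, so a trillion Dr. Evil uploads together contribute
    exactly one person's worth of weight."""
    copies_per_origin = defaultdict(int)
    for origin_id, _ in subjects:
        copies_per_origin[origin_id] += 1

    totals = {outcome: 0.0 for outcome in outcomes}
    for origin_id, utility_fn in subjects:
        weight = 1.0 / copies_per_origin[origin_id]  # copies split one vote
        for outcome in outcomes:
            totals[outcome] += weight * utility_fn(outcome)
    return totals

# Three stand-in Evil uploads cannot outvote two distinct ordinary people.
subjects = (
    [("dr_evil", lambda o: 1.0 if o == "evil_rules" else 0.0)] * 3
    + [("alice", lambda o: 1.0 if o == "status_quo" else 0.0),
       ("bob", lambda o: 1.0 if o == "status_quo" else 0.0)]
)
print(aggregate_utilities(subjects, ["evil_rules", "status_quo"]))
# {'evil_rules': 1.0, 'status_quo': 2.0}
```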
What if we used a two-tiered CEV? A CEV applied to a small, hand-selected group of moral philosophers
And who would select the initial group? Oh, I know! We can make it a 3-tiered system, and have CEV applied to an even smaller group choose the group of moral philosophers!
Wait… my spidey-sense is tingling… I think it’s trying to tell me that maybe there’s a problem with this plan.
SIAI, whose researchers came up with the idea of CEV because they have the goal of representing all of humanity with the FAI they want to create.
Ultimately, you have to trust someone to make these decisions. A small, select group will be designing and implementing the FAI anyways.
Yes, but a small, select group determining its moral values is a different thing entirely, and seems to defeat the whole purpose of CEV. At that point you might as well just have the small group of moral philosophers explicitly write out a “10 Commandments” style moral code and abandon CEV altogether.
Yes, but a small, select group determining its moral values is a different thing entirely
That is not what I am proposing. You are attacking a straw man. The CEV of the small group of moral philosophers does not determine the utility function directly. It only determines the rules used to run the larger CEV on all of humanity, based on what the moral philosophers consider a fair way of combining utility functions, not what they want the answer to be.
The CEV of the small group of moral philosophers does not determine the utility function directly
That may not be the intention, but if they’re empowered to create ad hoc exceptions to CEV, that could end up being the effect.
Basically, my problem is that you’re proposing to fix a (possible) problem with CEV, by using CEV. If there really is a problem here with CEV (and I’m not convinced there is), then that problem should be fixed—just running a meta-CEV doesn’t solve it. All you’re really doing is substituting one problem for another: the problem of who would be the “right” people to choose for the initial bootstrap. That’s a Really Hard Problem, and if we knew how to solve it, then we wouldn’t really even need CEV in the first place—we could just let the Right People choose the FAI’s utility function directly.
That may not be the intention, but if they’re empowered to create ad hoc exceptions to CEV, that could end up being the effect.
You seem to be imagining the subjects of the CEV acting as agents within some negotiating process, making decisions to steer the result to their preferred outcome. Consider instead that the CEV is able to ask the subjects questions, which could be about the fairness (not the impact on the final result) of treating a subject of the larger CEV in a certain way, and get honest answers. If your thinking process has a form like “This would be best for me, but that wouldn’t really be fair to this other person”, the CEV can focus in on the “but that wouldn’t really be fair to this other person”. Even better, it can ask the question “Is it fair to that other person”, and figure out what your honest answer would be.
Basically, my problem is that you’re proposing to fix a (possible) problem with CEV, by using CEV.
No, I am trying to solve a problem with a CEV applied to an unknown set of subjects by using a CEV applied to a known set of subjects.
All you’re really doing is substituting one problem for another: the problem of who would be the “right” people to choose for the initial bootstrap. That’s a Really Hard Problem, and if we knew how to solve it, then we wouldn’t really even need CEV in the first place—we could just let the Right People choose the FAI’s utility function directly.
The problem of selecting a small group of subjects for the first CEV is orders of magnitude easier than specifying a Friendly utility function. These subjects do not have to write out the utility function, or even directly care about all the things that humanity as a whole cares about. They just have to care about the problem of fairly weighting everyone in the final CEV.
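To make the fairness-query idea above concrete (subjects are asked for their honest judgment about the fairness of a weighting rule, not allowed to bargain for their preferred outcome), here is a minimal Python sketch; the honest_fairness_judgment oracle, the candidate rules, and the subject names are stand-ins assumed purely for illustration.

```python
def choose_weighting_rule(subjects, candidate_rules, honest_fairness_judgment):
    """Pick the candidate rule that the most subjects would honestly call fair.

    honest_fairness_judgment(subject, rule) -> bool stands in for the CEV's
    ability to extrapolate what the subject would honestly answer to
    "is this rule fair to everyone?", ignoring what outcome the subject
    would prefer the rule to produce."""
    best_rule, best_support = None, -1
    for rule in candidate_rules:
        support = sum(1 for s in subjects if honest_fairness_judgment(s, rule))
        if support > best_support:
            best_rule, best_support = rule, support
    return best_rule

# Stand-in example: every honestly-extrapolated philosopher judges
# one-vote-per-original-person fair and one-vote-per-copy exploitable.
philosophers = ["philosopher_a", "philosopher_b", "philosopher_c"]
rules = ["one_vote_per_copy", "one_vote_per_original_person"]

def judge(subject, rule):
    return rule == "one_vote_per_original_person"

print(choose_weighting_rule(philosophers, rules, judge))
# one_vote_per_original_person
```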
If your thinking process has a form like “This would be best for me, but that wouldn’t really be fair to this other person”, the CEV can focus in on the “but that wouldn’t really be fair to this other person”. Even better, it can ask the question “Is it fair to that other person”, and figure out what your honest answer would be.
I think this is an even better point than you make it out to be. It obviates the need to consult the small group of subjects in the first place. It can be asked of everyone. When this question is asked of the Dr. Evil clones, the honest answer would be “I don’t give a care what’s fair,” and the rules for the larger CEV will then be selected without any “votes” from Evil clones.
torekp!CEV: “Is it fair to other people that Dr. Evil becomes the supreme ruler of the universe?”
Dr. Evil clone #574,837,904,521: “Yes, it is. As an actually evil person, I honestly believe it.”
And right there is the reason why the plan would not work...!
The wishes of the evil clones would not converge on any particular Dr. Evil. You’d get a trillion separate little volitions, which would be outweighed by the COHERENT volition of the remaining 1%.
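A toy illustration of that convergence point, under the purely illustrative assumption that each clone’s extrapolated wish names itself specifically as ruler, so no two clones back the same concrete outcome:

```python
from collections import Counter

def most_supported_outcome(wishes):
    """wishes: one concrete wished-for outcome per subject, counted naively
    with no re-weighting of copies at all."""
    return Counter(wishes).most_common(1)[0]

num_clones = 1_000        # stand-in for a trillion uploads
num_ordinary_people = 10  # stand-in for the coherent remaining fraction

wishes = [f"clone_{i}_rules_the_universe" for i in range(num_clones)]
wishes += ["no_single_ruler"] * num_ordinary_people

print(most_supported_outcome(wishes))
# ('no_single_ruler', 10) -- a thousand mutually incompatible wishes each
# get one vote, so the coherent minority still comes out on top.
```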
That might be true if Dr. Evil’s goal is to rule the world. But if Dr. Evil’s goals are either a) for the world to be ruled by a Dr. Evil or b) to destroy the world, then this is still a problem. Both of those seem like much less likely failure modes, more like something out of a comic book (the fact that we are calling this fellow Dr. Evil doesn’t help matters), but it does suggest that there are serious general failure modes in the CEV protocol.
Both of those seem like much less likely failure modes, more like something out of a comic book (the fact that we are calling this fellow Dr. Evil doesn’t help matters)
It could be worse: the reason why there are only two Sith, a master and an apprentice, is that the Force can be used to visualize the CEV of a particular group, and the Sith have mastered this and determined that two is the largest reliably stable population.
An excellent idea! Of course, the CEV of a small group would probably be less precise, but I expect it’s good enough for determining the actual CEV procedure.
What if it turns out we’re ultimately preference utilitarians?