But… well, consider the set G of goals in my CEV that aren’t in humanity’s CEV. It’s clear that the goals in G aren’t shared by all human minds… but why is that a good reason to prevent an AGI from implementing them? What if some subset of G is right?
You need to distinguish between goals you have that the rest of humanity doesn't like and goals you have that the rest of humanity doesn't care about. Since you are part of humanity, the only way one of your goals could be excluded from the CEV is if someone else (or humanity in general) has a goal that's incompatible and more highly weighted. If one of your goals is to have a candy bar, no one else really cares whether you have one, so the CEV will bring you one; but if one of your goals is to kill someone, that goal would be excluded because it's incompatible with other people's goal of not dying.
The most common way for goals to be incompatible is for them to require the same resources. In that case, the CEV would do some balancing: if a human has the goal "maximize paperclips", the CEV will allocate a limited amount of resources to making paperclips, but not so many that it can't also make nice houses for all the humans who want them and fulfill various other goals.
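Just to make that concrete, here's a toy sketch of the two moves I'm describing (exclude the loser of each incompatible pair, then split a finite budget among what's left). It's purely illustrative; the goal/weight/cost representation and the function are my own invention, not anything the actual CEV proposal specifies.

```python
# Toy illustration only -- not part of the actual CEV proposal.
# A "goal" here is just (name, weight, resource cost); "conflicts" lists
# pairs of goals that cannot both be satisfied (e.g. "kill B" vs.
# "B stays alive").

from typing import NamedTuple

class Goal(NamedTuple):
    name: str
    weight: float   # hypothetical aggregate weighting across people
    cost: float     # resources the goal would consume if fully satisfied

def filter_and_balance(goals, conflicts, budget):
    # Exclude the lower-weighted goal from each incompatible pair.
    excluded = set()
    for a, b in conflicts:
        loser = min((a, b), key=lambda g: g.weight)
        excluded.add(loser.name)
    survivors = [g for g in goals if g.name not in excluded]

    # Split the finite budget in proportion to weight, capped at each
    # goal's full cost, so no single goal (e.g. "maximize paperclips")
    # crowds out everything else.
    total_weight = sum(g.weight for g in survivors)
    return {
        g.name: min(g.cost, budget * g.weight / total_weight)
        for g in survivors
    }
```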
Balancing resources among otherwise-compatible goals makes sense, sort of. It becomes tricky if resources are relevantly finite, but I can see where this would work.
Balancing resources among incompatible goals (e.g., A wants to kill B, B wants to live forever) is, of course, a bigger problem. Excluding incompatible goals seems a fine response. (Especially if we’re talking about actual volition.)
I had not yet come across the weighting aspect of CEV; I’d thought the idea was the CEV-implementing algorithm eliminates all goals that are incompatible with one another, not that it chooses one of them based on goal-weights and eliminates the others.
I haven’t a clue how that weighting happens. A naive answer is some function of the number of people whose CEV includes that goal… that is, some form of majority rule. Presumably there are better answers out there. Anyway, yes, I can see how that could work, sort of.
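To spell out what I mean by that naive answer (my own toy construction, not anything from the CEV writeup): just count how many people's extrapolated volitions include each goal.

```python
# Toy "majority rule" weighting: a goal's weight is simply the number of
# people whose extrapolated volition endorses it. Entirely illustrative.

from collections import Counter

def naive_goal_weights(volitions):
    """volitions: one set of goal names per person."""
    counts = Counter()
    for person_goals in volitions:
        counts.update(person_goals)
    return dict(counts)

# naive_goal_weights([{"candy bar", "no one dies"},
#                     {"no one dies"},
#                     {"kill B"}])
# -> {"candy bar": 1, "no one dies": 2, "kill B": 1}
```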
All of which is cool, and thank you, but it leaves me with the same question, relative to Eliezer’s post, that I had in the first place. Restated: if a goal G1 is right (1) but is incompatible with a higher-weighted goal that isn’t right, do we want to eliminate G1? Or does the weighting algorithm somehow prevent this?
==
(1) I’m using “right” here the same way Eliezer does, even though I think it’s a problematic usage, because the concept seems really important to this sequence… it comes up again and again. My own inclination is to throw the term away.
Maybe CEV is intended to get some Right stuff done.
It would be kind of impossible, given that we are right due to a gift of nature rather than a tendency of nature, to design an algorithm that could actually sort everything into the Right, Not-right, and Borderline categories.
I suppose Eliezer is assuming that the moral gift we have will be a bigger part of CEV than it would be of some other division of current moralities.
Thus rendering CEV a local optimum within a given set of gifted minds.