You’re not dealing with the actual values the people I described have; you’re saying they should have different values.
Your definition of value is not sufficient to encompass how human beings actually process values. We can have both positive and negative responses to the same “value”—and they are largely independent. People who compulsively seek nonconsensual domination of others are not (if we exclude sociopaths and clinical sadists) acting out of a desire to gain pleasure, but rather to avoid pain.
Specifically, a pain that will never actually happen, because it’s based on an incorrect belief. And correcting that belief is not the same thing as changing what’s actually valued.
In other words, what I’m saying is: you’re mistaken if you think an emotionally healthy, non-sociopathic/psychopathic human actually positively values bossing people around just to watch them jump. IMO, such a person is actually doing it to avoid losing something else—and that problem can be fixed without actually changing what the person values (positively or negatively).
In other words, what I’m saying is: you’re mistaken if you think an emotionally healthy, non-sociopathic/psychopathic human actually positively values bossing people around just to watch them jump.
Look at how value-laden that sentence is: “healthy”, “non-sociopathic/psychopathic”. You’re just asserting your values, and insisting that people with other values are wrong (“unhealthy”).
Well, hang on. pjeby probably would benefit from finding a less judgmental vocab, but he has a valid point: not every human action should be counted as evidence of a human value for the usual result of that action, because some humans have systematically erroneous beliefs about the way actions lead to results.
You may be hitting pjeby with a second-order ad hominem attack! Just because he uses vocab that’s often used to delegitimize other people doesn’t mean his arguments should be delegitimized.
But people “who compulsively seek nonconsensual domination of others” and who positively value “bossing people around just to watch them jump” do exist, and they are quite successful at other things which most humans value (sex, status, wealth), arguably as a result of these very values. Pjeby is describing traits that are common in politicians, managers and high school teachers.
Pjeby is describing traits that are common in politicians, managers and high school teachers.
And I’m asserting that the subset of those individuals who are doing it for a direct feeling-reward (as opposed to strategic reasons) are what we would call sociopaths or psychopaths. The remainder (other than Machiavellian strategists and ’opaths) are actually doing it to avoid negative feeling-hits, rather than to obtain positive ones.
Look at how value-laden that sentence is: “healthy”, “non-sociopathic/psychopathic”. You’re just asserting your values, and insisting that people with other values are wrong (“unhealthy”).
I assume we would want CEV to exclude the preferences of sociopaths and psychopaths, as well as those of people who are actually mistaken about the beliefs underlying their preferences.
Healthy human beings experience guilt when they act like jerks, counterbalancing the pleasure, unless there are extenuating circumstances (like the other guy being a jerk, too).
And when a person doesn’t experience guilt, we call that sociopathy.
I assume we would want CEV to exclude the preferences of sociopaths and psychopaths, as well as those of people who are actually mistaken about the beliefs underlying their preferences.
“Knew more, thought faster” should make the second group irrelevant (and make the “theocracy” part of “xenophobic theocracy” implausible even given majority voting). The original CEV document actually opposed excluding anyone’s preferences or loading the outcome in any way, regarding that as an abuse of the programmers’ power, but IIRC, Eliezer has made comments here indicating that he’s since backed away from that somewhat.
I assume we would want CEV to exclude the preferences of sociopaths and psychopaths, as well as those of people who are actually mistaken about the beliefs underlying their preferences.
I thought the idea was that, under CEV, sociopaths would just get outvoted. People with mutant moralities wouldn’t be excluded, but, just by virtue of being mutants, their votes would be almost entirely drowned out by those with more usual moralities.
[ETA: Eliezer would object to calling these mutant moralities “moralities”, because he reserves the word “morality” for the action-preferring algorithm (or whatever the general term ought to be) that he himself would find compelling in the limit of knowledge and reflection. As I understand him, he believes that he shares this algorithm with nearly all humans.]
I thought the idea was that, under CEV, sociopaths would just get outvoted. People with mutant moralities wouldn’t be excluded, but, just by virtue of being mutants, their votes would be almost entirely drowned out by those with more usual moralities.
If it were (just) a matter of voting, then I imagine we’d end up with the AI creating a xenophobic theocracy.
If it were (just) a matter of voting, then I imagine we’d end up with the AI creating a xenophobic theocracy.
As I understand it, it’s not just a matter of voting. It’s more as though a simulated* version of each of us had the opportunity to know everything relevant that the FAI knows, and to reflect fully on all that information and on his or her values to reach a fully coherent preference. In that case, it’s plausible that nearly all of us would be convinced that xenophobic theocracy was not the way to go. However, if many of us were convinced that xenophobic theocracy was right, and there were enough such people to outweigh the rest, then that would mean that we ought to go with xenophobic theocracy, and you and I would just be mistaken to think otherwise.
* I think that Eliezer in fact would not want the FAI to simulate us to determine our CEV. The concern is that, if the AI is simulating us before it’s learned morality by extrapolating our volition, then the simulations would very likely lead to very many tortured minds.
I wonder if, under the current plan, CEV would take into account people’s volition about how CEV should work — i.e. if the extrapolated human race would want CEV to exclude the preferences of sociopaths/psychopaths/other moral mutants, would it do so, or does it only take into account people’s first-order volition about the properties of the FAI it will build?
In the original document, “extrapolated as we wish that extrapolated, interpreted as we wish that interpreted” sounds like it covers this, in combination with the collective nature of the extrapolation.
Ultimately, preference is about properties of the world, and the AI, with all its properties, is part of the world.