Anyway, your analysis here (as with many others on LW) conflates feelings of status with some sort of actual position in some kind of dominance hierarchy. But this is a classification error. There are people who feel quite respectable, important, and proud, without needing to outwardly be “superior” in some fashion.
Those aren’t the people I’m talking about.
Truth is, if you’re worried about your place in the dominance hierarchy (by which I mean you have feelings about it, not that you’re merely curious or concerned with it for tactical or strategic reasons), that’s prima facie evidence of something that needs fixing immediately, without waiting for an AI to modify your brain or convince you of something. Identify and eliminate the irrational perceived threat from your belief system.
You’re not dealing with the actual values the people I described have; you’re saying they should have different values. Which is unFriendly!
Phil, you’re right that there’s a difference between giving people their mutually unsatisfiable values and giving them the feeling that they’ve been satisfied. But there’s a mechanism missing from this picture:
Even though I wouldn’t want to run an AI that holds conversations with humans worldwide to convert them to more mutually satisfiable value systems, and even though I don’t want a machine to wire-head everybody into a state of illusory high status, I certainly trust humans to convince other humans to convert to mutually satisfiable values. In fact, I do it all the time. I consider it one of the most proselytism-worthy ideas ever.
So I see your post as describing a very important initiative we should all be taking, as people: convince others to find happiness in positive-sum games :)
(If I were an AI, or even just an I, perhaps you would hence define me as “unFriendly”. If so, okay then. I’m still going to go around convincing people to be better at happiness, rational-human-style.)
So I see your post as describing a very important initiative we should all be taking, as people: convince others to find happiness in positive-sum games
It’s an error to assume that human brains are actually wired for zero- or negative-sum games in the first place, as opposed to merely having adaptations that tend toward such situations. Humans aren’t true maximizers; they’re maximizer-satisficers. E.g., people don’t seek the best possible mate: they seek the best mate they think they can get.
(Ironically, the greater mobility and choices in our current era often lead to decreased happiness, as our perceptions of what we ought to be able to “get” have increased.)
Anyway, ISTM that any sort of monomaniacal maximizing behavior (e.g. OCD, paranoia, etc.) is indicative of an unhealthy brain. Simple game theory suggests that putting one value so much higher than others is unlikely to be an evolutionarily stable strategy.
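To make the maximizer-vs-maximizer-satisficer distinction concrete, here is a minimal sketch. The scoring and the “perceived reach” threshold are invented purely for illustration, not a claim about how mate choice actually works:

```python
import random

random.seed(0)
options = [random.random() for _ in range(100)]  # quality of each available option, 0..1

def true_maximizer(options):
    """Insists on the global best, regardless of search cost."""
    return max(options)

def maximizer_satisficer(options, perceived_reach):
    """Takes the best option it believes it can actually get:
    it only considers options within its perceived reach."""
    attainable = [x for x in options if x <= perceived_reach]
    return max(attainable) if attainable else min(options)

print(true_maximizer(options))             # the best possible "mate"
print(maximizer_satisficer(options, 0.6))  # the best one it thinks it can get
# Raising perceived_reach raises the benchmark outcomes are judged against,
# which is the mechanism behind the parenthetical about mobility and choice above.
```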
You’re not dealing with the actual values the people I described have; you’re saying they should have different values.
Your definition of value is not sufficient to encompass how human beings actually process values. We can have both positive and negative responses to the same “value”—and they are largely independent. People who compulsively seek nonconsensual domination of others are not (if we exclude sociopaths and clinical sadists) acting out of a desire to gain pleasure, but rather to avoid pain.
Specifically, a pain that will never actually happen, because it’s based on an incorrect belief. And correcting that belief is not the same thing as changing what’s actually valued.
In other words, what I’m saying is: you’re mistaken if you think an emotionally healthy, non-sociopathic/psychopathic human actually positively values bossing people around just to watch them jump. IMO, such a person is actually doing it to avoid losing something else—and that problem can be fixed without actually changing what the person values (positively or negatively).
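A minimal sketch of that claim, with invented weights and probabilities (purely illustrative, not a model anyone in this thread has proposed): the chosen behavior flips when the mistaken belief is corrected, even though the positive and negative value weights never change.

```python
def choose_action(believed_status_threat):
    """Pick an action by expected feeling-payoff, given current beliefs."""
    pleasure_from_bossing = 0.0    # no positive value placed on domination itself
    pain_of_losing_status = 10.0   # strong negative value: pain to be avoided

    payoff_if_bossing = pleasure_from_bossing
    payoff_if_not = -pain_of_losing_status * believed_status_threat

    return "boss people around" if payoff_if_bossing > payoff_if_not else "leave people alone"

print(choose_action(believed_status_threat=0.8))  # mistaken belief -> domination
print(choose_action(believed_status_threat=0.0))  # belief corrected -> behavior changes,
                                                  # though the values themselves didn't
```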
In other words, what I’m saying is: you’re mistaken if you think an emotionally healthy, non-sociopathic/psychopathic human actually positively values bossing people around just to watch them jump.
Look at how value-laden that sentence is: “healthy”, “non-sociopathic/psychopathic”. You’re just asserting your values, and insisting that people with other values are wrong (“unhealthy”).
Well, hang on. pjeby probably would benefit from finding a less judgmental vocab, but he has a valid point: not every human action should be counted as evidence of a human value for the usual result of that action, because some humans have systematically erroneous beliefs about the way actions lead to results.
You may be hitting pjeby with a second-order ad hominem attack! Just because he uses vocab that’s often used to delegitimize other people doesn’t mean his arguments should be delegitimized.
But people “who compulsively seek nonconsensual domination of others” and “actually positively value bossing people around just to watch them jump” do exist, and they are quite successful at other things most humans value (sex, status, wealth), arguably as a result of these very values. Pjeby is describing traits that are common in politicians, managers and high school teachers.
Pjeby is describing traits that are common in politicians, managers and high school teachers.
And I’m asserting that the subset of those individuals who are doing it for a direct feeling-reward (as opposed to strategic reasons) are what we would call sociopaths or psychopaths. The remainder (other than Machiavellian strategists and ’opaths) are actually doing it to avoid negative feeling-hits, rather than to obtain positive ones.
Look at how value-laden that sentence is: “healthy”, “non-sociopathic/psychopathic”. You’re just asserting your values, and insisting that people with other values are wrong (“unhealthy”).
I assume we would want CEV to exclude the preferences of sociopaths and psychopaths, as well as those of people who are actually mistaken about the beliefs underlying their preferences.
Healthy human beings experience guilt when they act like jerks, counterbalancing the pleasure, unless there are extenuating circumstances (like the other guy being a jerk, too).
And when a person doesn’t experience guilt, we call that sociopathy.
I assume we would want CEV to exclude the preferences of sociopaths and psychopaths, as well as those of people who are actually mistaken about the beliefs underlying their preferences.
“Knew more, thought faster” should make the second irrelevant (and make the “theocracy” part of “xenophobic theocracy” implausible even given majority voting). The original CEV document actually opposed excluding anyone’s preferences or loading the outcome in any way as an abuse of the programmers’ power, but IIRC, Eliezer has made comments here indicating that he’s backed down from that some.
I assume we would want CEV to exclude the preferences of sociopaths and psychopaths, as well as those of people who are actually mistaken about the beliefs underlying their preferences.
I thought the idea was that, under CEV, sociopaths would just get outvoted. People with mutant moralities wouldn’t be excluded, but, just by virtue of being mutants, their votes would be almost entirely drowned out by those with more usual moralities.
[ETA: Eliezer would object to calling these mutant moralities “moralities”, because he reserves the word “morality” for the action-preferring algorithm (or whatever the general term ought to be) that he himself would find compelling in the limit of knowledge and reflection. As I understand him, he believes that he shares this algorithm with nearly all humans.]
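A toy sketch of the “outvoted, not excluded” idea (population sizes and preference labels are made up, and real extrapolation would of course be nothing this simple): everyone’s extrapolated preference gets added in, and the mutant minority’s weight is simply swamped.

```python
from collections import Counter

# 990 people with a typical extrapolated preference, 10 with a mutant one.
population = (
    [{"be kind to others": 1.0, "dominate others": 0.0}] * 990
  + [{"be kind to others": 0.0, "dominate others": 1.0}] * 10
)

totals = Counter()
for person in population:
    totals.update(person)        # nobody is excluded; every preference adds weight

print(totals.most_common(1))     # the aggregate still favors "be kind to others"
```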
I thought the idea was that, under CEV, sociopaths would just get outvoted. People with mutant moralities wouldn’t be excluded, but, just by virtue of being mutants, their votes would be almost entirely drowned out by those with more usual moralities.
If it were (just) a matter of voting, then I imagine we’d end up with the AI creating a xenophobic theocracy.
If it were (just) a matter of voting, then I imagine we’d end up with the AI creating a xenophobic theocracy.
As I understand it, it’s not just a matter of voting. It’s more as though a simulated* version of each of us had the opportunity to know everything relevant that the FAI knows, and to reflect fully on all that information and on his or her values to reach a fully coherent preference. In that case, it’s plausible that nearly all of us would be convinced that xenophobic theocracy was not the way to go. However, if many of us were convinced that xenophobic theocracy was right, and there were enough such people to outweigh the rest, then that would mean that we ought to go with xenophobic theocracy, and you and I are just mistaken to think otherwise.
* I think that Eliezer in fact would not want the FAI to simulate us to determine our CEV. The concern is that, if the AI is simulating us before it’s learned morality by extrapolating our volition, then the simulations would very likely lead to very many tortured minds.
I wonder if, under the current plan, CEV would take into account people’s volition about how CEV should work — i.e. if the extrapolated human race would want CEV to exclude the preferences of sociopaths/psychopaths/other moral mutants, would it do so, or does it only take into account people’s first-order volition about the properties of the FAI it will build?
In the original document, “extrapolated as we wish that extrapolated, interpreted as we wish that interpreted” sounds like it covers this, in combination with the collective nature of the extrapolation.
Those aren’t the people I’m talking about.
You’re not dealing with the actual values the people I described have; you’re saying they should have different values. Which is unFriendly!
Could be saying “I think that, upon further reflection, they would have different values in this way”.
I wonder if, under the current plan, CEV would take into account people’s volition about how CEV should work — i.e. if the extrapolated human race would want CEV to exclude the preferences of sociopaths/psychopaths/other moral mutants, would it do so, or does it only take into account people’s first-order volition about the properties of the FAI it will build?
Ultimately, preference is about properties of the world, and the AI, with all its properties, is part of the world.