Since Dr. Evil is human, it shouldn’t be that bad. Extrapolated volition kicks in, making his current evil intentions irrelevant; his extrapolated volition might even prefer to reverse the voting exploit.
That is not the most inconvenient possible world.
The conservative assumption to make here is that Dr. Evil is, in fact, evil; and that, after sufficient reflection, he is still evil. Perhaps he’s brain-damaged; it doesn’t matter.
Given that, will he be able to hijack CEV? Probably not, now that the scenario has been pointed out, but what other scenarios might be overlooked?
I agree. (Presumably we shouldn’t include volitions of tigers in the mix, and the same should go for the actually evil alien mutants.)
So, how do we decide who’s evil?
A surprisingly good heuristic would be “choose only humans”.
If I were to implement CEV, I’d start with only biological, non-uploaded, non-duplicated, non-mentally ill, naturally born adult humans, and then let their CEV decide whether to include others.
Is there a biological tech level you’re expecting when building an FAI becomes possible?
I don’t know. We don’t actually need any technology other than Python and vi. ;) But it’s possible uploads, cloning, genetic engineering, and so forth will be common then.
What do you mean by “naturally born”? Are artificial wombs a problem?
Yes, just to be safe, we should avoid anyone born through IVF, for instance, or anyone whose conception was carried out or assisted in a lab, or who underwent any genetic modification. I’m not sure exactly where to draw the line: fertility drugs might be ok. By “naturally born” I meant anyone conceived through normal intercourse without any technological intervention. Such people can be added in later if the CEV of the others wants them added.
It’s conceivable that children have important input about what children need, which adults have for the most part forgotten.
Yes, this is a really good point, but CEV adds in what we would add in if we knew more and remembered more.
That’s terrible. You’re letting in people who are mutated in all sorts of ways through stupid, random, ‘natural’ processes, but not those who have the power of human intelligence overriding the choices of the blind idiot god. If the extropians/transhumanists make any headway with germline genetic engineering, I want those people in charge.
Exclude people who aren’t different or problematic in any perceptible way because of your yuck factor?
Minor point, but are turkey basters technology?
Aside from the problem of leaving out what seems to be obviously part of the human range, I think that institutionalizing that distinction for something so crucial would lead to prejudice.
I have no particular yuck factor involving IVF. And you’re right that it’s not obvious where to draw the line with things like turkey basters. To be safe, I’d exclude them.
Keep in mind that this is just for the first round, and the first round group would presumably decide to expand the pool of people. It’s not permanently institutionalized. It’s just a safety precaution, because the future of humanity is at stake.
What risk are you trying to protect against?
Something like the Dr. Evil CEV hack described in the main post. Essentially, we want to block out any way of creating new humans that could be used to override CEV, so it makes sense to start by blocking out all humans created artificially. It might also be a good idea to require the humans to have been born before a certain time, say 2005, so no humans created after 2005 can affect CEV (at least in the first round).
Turkey basters are probably not a threat. However, there’s an advantage to being overly conservative here. The very small number of people created or modified through some sort of artificial means for non-CEV-hacking reasons can be added in after subsequent rounds of CEV. But if the first round includes ten trillion hacked humans by mistake, it will be too late to remove them because they’ll outvote everyone else.
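To make the outvoting arithmetic concrete, here is a purely illustrative Python sketch. It assumes a simple one-vote-per-included-human aggregation, which is not something the CEV document specifies, and all of the numbers are invented:

```python
# Toy arithmetic only: if inclusion in the first round reduced to one equal
# vote per included human on a single binary question, duplicated voters
# would swamp everyone else. All figures below are invented.
originals = 10_000_000_000            # roughly the rest of humanity
hacked_copies = 10_000_000_000_000    # the ten trillion copies from the scenario

votes_for_exploit = hacked_copies     # every copy sides with its creator
votes_against = originals

share = votes_for_exploit / (votes_for_exploit + votes_against)
print(f"The copies control {share:.2%} of the vote")   # ~99.90%, irreversible by later voting
```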
Requiring that people have been incubated in a human womb seems like enough of a bottleneck, though even that’s politically problematic if there are artificial wombs or tech for incubation in non-humans.
However, I’m more concerned that uncaring inhuman forces already have a vote.
You may also be interested in this article:
Can the common brain parasite, Toxoplasma gondii, influence human culture?
The associations between prevalence and cultural dimensions are consistent with the prediction that T. gondii can influence human culture. Just as individuals infected with T. gondii score themselves higher in the neurotic factor guilt-proneness, nations with high T. gondii prevalence had a higher aggregate neuroticism score. In addition, Western nations with high T. gondii prevalence were higher in the ‘neurotic’ cultural dimensions of masculine sex roles and uncertainty avoidance. These results were predicted by a logical scaling-up from individuals to aggregate personalities to cultural dimensions.
Requiring that people have been incubated in a human womb seems like enough of a bottleneck
You’re probably right. It probably is. But we lose nothing by being more conservative, because the first round of CEV will add in all the turkey baster babies.
What constitutes mental illness is a horrible can of worms. Even defining the borders of what constitutes brain damage is terribly hard.
Ha. Okay, that’s a good one.
You might find that deciding who’s mentally ill is a little harder, but the other criteria should be reasonably easy to define, and there are no obvious failure conditions. Let me think this over for a bit. :)
Define human.
Featherless biped.
Ten thousand years later, postkangaroo children learn from their history books about Kutta, the one who has chosen to share the future with his marsupial brothers and sisters :)
If an upload remembers having had legs, and/or is motivated to acquire for itself a body with exactly two legs and no feathers, please explain either how this definition would adequately exclude uploads or why you are opposed to equal rights for very young children (not yet capable of walking upright) and amputees.
Includes sexbots, and excludes uploaded versions of me.
The point is to exclude uploaded versions of you. I’m more concerned about including plucked chickens.
BTW, what is the difference between a sexbot and a catgirl?
A sexbot is a robot for sex—still a human under the featherless biped definition, as long as it has two legs and no feathers.
If the point is to exclude “uploaded versions”, what counts as uploaded? How about if I transfer my mind (brain state) to another human body? If that makes me still a human, what rational basis is there for defining a mind-body system as human or not based on the kind of body it is running in?
Moreover, the CEV of one trillion Evil-clones will likely be vastly different from the CEV of one Dr. Evil. For instance, Dr. Evil may have a strong desire to rule all human-like beings, but for each given copy this desire will be canceled out by the desire of the 1 trillion other copies not to be ruled by that copy.
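To see the cancellation numerically, here is a toy Python sketch. The payoff numbers and the equal-weight summation are invented assumptions for illustration, not how CEV actually aggregates preferences:

```python
# Toy payoff model, invented for illustration: each copy assigns +1 to the
# outcome "I rule everyone" and -c to "some other copy rules me". Summing
# equal-weighted preferences over N copies, the outcome "copy i rules"
# scores 1 - (N - 1) * c, which goes negative for any c > 0 once N is large.
N = 10**12      # one trillion copies
c = 0.01        # even a mild dispreference for being ruled by another copy

score_if_copy_i_rules = 1 - (N - 1) * c
score_if_nobody_rules = 0
print(score_if_copy_i_rules < score_if_nobody_rules)   # True: ruling loses under aggregation
```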
No matter what he currently thinks, it doesn’t follow that it’s his extrapolated volition to rule.
The hack troubled me and I read part of the CEV document again.
The phrases “were more the people we wished we were”, “extrapolated as we wish that extrapolated”, and “interpreted as we wish that interpreted” appear in the CEV document’s explanation of extrapolation. They pretty much guarantee that a hack like the one Wei Dai mentioned would be an extremely potent one.
However, the conservatism in the rest of the document, with phrases like the ones below, seems to take care of it fairly well.
“It should be easier to counter coherence than to create coherence.”
“The narrower the slice of the future that our CEV wants to actively steer humanity into, the more consensus required.”
“The initial dynamic for CEV should be conservative about saying “yes”, and listen carefully for “no”.”
I just hope the actual numbers, when entered, match that. If they do, then I think the CEV might just come back to the programmers saying, “I see something weird. Kindly explain.”
The narrower the slice of the future that our CEV wants to actively steer humanity into, the more consensus required.
This sounded really good when I read it in the CEV paper. But now I realize that I have no idea what it means. What is the area being measured for “narrowness”?
My understanding of a narrower future is one with more choices taken away, weighted by the number of people they are taken away from, compared to the matrix of choices present at the time of activation of CEV.
There are many problems with this definition:
(1) It does not specify how to weight the choices of people not yet alive at the time of activation.
(2) It does not specify how to determine which choices count. For example, is Baskin Robbins to be preferred to Alinea, because Baskin Robbins offers 31 choices while Alinea offers just one (12 courses or 24)? Or Baskin Robbins^^^3 for most vs. 4 free years of schooling in a subject of choice for all? Does it improve the future to give everyone additional unpalatable choices, even if few will choose them? I understand that CEV is supposed to be roughly the sum over what people would want, so some of the more absurd meanings would be screened off. But I don’t understand how this criterion is specific enough that, if I were a Friendly superpower, I could use it to help me make decisions.
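As a sketch of how underspecified this is, here is a toy Python version of the “choices taken away, weighted by people who lose them” reading. The metric, the option sets, and the head-counts are all invented for illustration and are not from the CEV paper:

```python
# A guessed formalization of "narrower = more choices taken away, weighted by
# how many people lose them", relative to the options available at activation.
def narrowness(options_before, options_after, people_losing_option):
    """Sum, over every removed option, of the number of people who lose it."""
    removed = options_before - options_after
    return sum(people_losing_option.get(opt, 0) for opt in removed)

before = {"31 ice cream flavors", "12-course tasting menu", "free schooling"}
after = {"free schooling"}
losers = {"31 ice cream flavors": 7_000_000_000, "12-course tasting menu": 5_000}

print(narrowness(before, after, losers))
# Both problems above bite here: nothing says which options "count",
# and people not yet alive at activation appear nowhere in the weights.
```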
But he should still give sizable credence to that desire persisting.
Why? And, more importantly, why should he care? It’s in his interest to have the FAI follow his extrapolated volition, not his revealed preference, be it in the form of his own belief about his extrapolated volition or not.
Because the power of moral philosophy to actually change things like the desire for status is limited, even in very intelligent individuals interested in moral philosophy. The hypothesis that thinking much faster, knowing much more, etc, will radically change that has little empirical support, and no strong non-empirical arguments to produce an extreme credence.
When we are speaking about what to do with the world, which is what formal preference (extrapolated volition) is ultimately about, this is different in character (domain of application) from any heuristics that a human person has for what he personally should be doing. Any human consequentialist is a hopeless dogmatic deontologist in comparison with their personal FAI. Even if we take both views as representations of the same formal object, syntactically they have little in common. We are not comparing what a human will do with what advice that human will give to himself if he knew more. Extrapolated volition is a very different kind of wish, a kind of wish that can’t be comprehended by a human, and so no heuristics already in mind will resemble heuristics about that wish.
But you seem to have the heuristic that the extrapolated volition of even the most evil human “won’t be that bad”. Where does that come from?
That’s not a heuristic in the sense I use the word in the comment above; it’s (rather weakly) descriptive of a goal, not rules for achieving it.
The main argument (and I changed my mind on this recently) is the same as for why another normal human’s preference isn’t that bad: sympathy. If human preference has a component of sympathy, of caring about other human-like persons’ preferences, then there is always a sizable slice of the control-of-the-universe pie going to everyone’s preference, even if it is orders of magnitude smaller than the slice going to the preference in control. I don’t expect that even the most twisted human can have a whole aspect of preference completely absent, even if it is manifested to a smaller degree than usual.
This apparently changes my position on the danger of value drift, and modifying minds of uploads in particular. Even though we will lose preference to the value drift, we won’t lose it completely, so long as people holding the original preference persist.
Humans also have other preferences that are in conflict with sympathy, for example the desire to see one’s enemies suffer. If sympathy is manifested to a sufficiently small degree, then it won’t be enough to override those other preferences.
Are you aware of what has been happening in Congo, for example?
It seems to me there’s a pretty strong correlation between philosophical competence and endorsement of utilitarian (vs egoist) values, and also that most who endorse egoist values do so because they’re confused about e.g. various issues around personal identity and the difference between pursuing one’s self-interest and following one’s own goals.
Can we taboo “utilitarian”, since nobody ever seems to be able to agree on what it means? Also, do you have any references to strong arguments for whatever you mean by utilitarianism? I’ve yet to encounter any good arguments in favour of it, but given how many apparently intelligent people seem to consider themselves utilitarians, they presumably exist somewhere.
Utility is just a basic way to describe “happiness” (or, if you prefer, “preferences”) in an economic context. The unit of measurement for utility is sometimes called a utilon. To say you are a utilitarian just means that you’d prefer an outcome that results in the largest total number of utilons over the human population. (Or in the universe, if you allow for Babyeaters, Clippies, Utility Monsters, Super Happies, and so on.)
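For concreteness, here is a minimal Python sketch of the “largest total number of utilons” aggregation. The numbers and weights are invented, and the average and weighted variants are included only to flag the distinctions discussed below:

```python
# Minimal sketch of the aggregation being described: one "utilon" score per
# person, summed over the population. Numbers are made up.
utilons = [3.0, 7.5, -2.0, 4.0]        # per-person utility, whatever its source
weights = [1.0, 1.0, 0.5, 1.0]         # e.g. if some entities are counted less

total_util = sum(utilons)                                  # total utilitarianism
average_util = total_util / len(utilons)                   # average utilitarianism
weighted_util = sum(u * w for u, w in zip(utilons, weights))

print(total_util, average_util, weighted_util)             # 12.5 3.125 13.5
```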
Alicorn, who I think is more of an expert on this topic than most, had this to say:
I’m taking an entire course called “Weird Forms of Consequentialism”, so please clarify—when you say “utilitarianism”, do you speak here of direct, actual-consequence, evaluative, hedonic, maximizing, aggregative, total, universal, equal, agent-neutral consequentialism?
Just the other day I debated with PhilGoetz whether utilitarianism is supposed to imply agent-neutrality or not. I still don’t know what most people mean on that issue.
Even assuming agent neutrality there is a major difference between average and total utilitarianism. Then there are questions about whether you weight agents equally or differently based on some criteria. The question of whether/how to weight animals or other non-human entities is a subset of that question.
Given all these questions it tells me very little about what ethical system is being discussed when someone uses the word ‘utilitarian’.
It does substantially reduce the decision space. For example, it is generally a safe bet that the individual is not going to subscribe to deontological claims that say “killing humans is always bad.” I’d thus be very surprised to ever meet a pacifist utilitarian.
It probably is fair to say that given the space of ethical systems generally discussed on LW, talking about utilitarianism doesn’t narrow the field down much from that space.
I haven’t seen any stats on that issue. Is there any evidence relating to the topic?
Depending on how you define ‘philosophical competence’, the results of the PhilPapers survey may be relevant.
The PhilPapers Survey was a survey of professional philosophers and others on their philosophical views, carried out in November 2009. The Survey was taken by 3226 respondents, including 1803 philosophy faculty members and/or PhDs and 829 philosophy graduate students.
Here are the stats for Philosophy Faculty or PhD, All Respondents:
Normative ethics: deontology, consequentialism, or virtue ethics?
Other: 558 / 1803 (30.9%)
Accept or lean toward consequentialism: 435 / 1803 (24.1%)
Accept or lean toward virtue ethics: 406 / 1803 (22.5%)
Accept or lean toward deontology: 404 / 1803 (22.4%)
And for Philosophy Faculty or PhD, Area of Specialty: Normative Ethics:
Normative ethics: deontology, consequentialism, or virtue ethics?
Other: 80 / 274 (29.1%)
Accept or lean toward deontology: 78 / 274 (28.4%)
Accept or lean toward consequentialism: 66 / 274 (24%)
Accept or lean toward virtue ethics: 50 / 274 (18.2%)
As utilitarianism is a subset of consequentialism, it appears you could conclude that utilitarians are a minority in this sample.
Thanks! For perspective:
http://en.wikipedia.org/wiki/Consequentialism#Varieties_of_consequentialism
Unfortunately the survey doesn’t directly address the main distinction in the original post since utilitarianism and egoism are both forms of consequentialism.