Just a general comment about this site: it seems to be biased in favor of human values at the expense of values held by other sentient beings. It’s all about “how can we make sure an FAI shares our [i.e. human] values?” How do you know human values are better? Or from the other direction: if you say, “because I’m human”, then why don’t you talk about doing things to favor e.g. “white people’s values”?
I wish the site were more inclusive of other value systems …
This site does tend to implicitly favour a subset of human values, specifically what might be described as ‘enlightenment values’. I’m quite happy to come out and explicitly state that we should do things that favour my values, which are largely western/enlightenment values, over other conflicting human values.
White people value the values of non-white people. We know that non-white people exist, and we care about them. That’s why the United States is not constantly fighting to disenfranchise non-whites. If you do it right, white people’s values are identical to humans’ values.
Hi there. It looks like you’re speaking out of ignorance regarding the historical treatment of non-whites by whites. Please choose the country you’re from:
The way they were historically treated is irrelevant to how they are treated now. Sure, white people were wrong. They changed their minds. We could at any time in the future decide that any non-human people we come across are equal to us.
Well, I was making some tacit assumptions, like that humanity would end up in control of its own future, and any non-human people we come across would not simply overpower us. Apart from that, am I making some mistake?
White people have not unanimously decided to do what is necessary to end the ongoing oppression of non-white people, let alone erase the effects of past oppression.
Edit: Folks, I am not accusing you or your personal friends of anything. I have never met most of you. I have certainly not met most of your personal friends. If you do not agree with the above comment, please explain why you think there is no longer such a thing as modern-day racism in white people.
We don’t favor those values because they are the values of that subset — which is what “doing things to favor white people’s values” would mean — but because we think they’re right. (No License To Be Human, on a smaller scale.) This is a huge difference.
Sure, we favor the particular Should Function that is, today, instantiated in the brains of roughly middle-of-the-range-politically, intelligent westerners.
Sure, we favor the particular Should Function that is, today, instantiated in the brains of roughly middle-of-the-range-politically, intelligent westerners.
Do you think there is no simple procedure that would find roughly the same “should function” hidden somewhere in the brain of a brain-washed blood-thirsty religious zealot? It doesn’t need to be what the person believes, what the person would recognize as valuable, etc., just something extractable from the person, according to a criterion that might be very alien to their conscious mind. Not all opinions (beliefs/likes) are equal, and I wouldn’t want to get stuck with the wrong optimization criterion just because I happened to be born in the wrong place and didn’t (yet!) get the chance to learn more about the world.
(I’m avoiding the term ‘preference’ to remove connotations I expect it to have for you, for what I consider the wrong reasons.)
A lot of people seem to want to have their cake and eat it with CEV. Haidt has shown us that human morality is universal in form and local in content, and has gone on to do case studies showing that there are 5 basic human moral dimensions (harm/care, justice/fairness, loyalty/ingroup, respect/authority, purity/sacredness), and our culture only has the first two.
It seems that there is no way you can run an honestly morally neutral CEV of all of humanity and expect to reliably get something you want. You can either rig CEV so that it tweaks people who don’t share our moral drives, or you can just cross your fingers and hope that the process of extrapolation causes convergence to our idealized preferences, and if you’re wrong you’ll find yourself in a future that is suboptimal.
On one hand, using preference-aggregation is supposed to give you the outcome preferred by you to a lesser extent than if you just started from yourself. On the other hand, CEV is not “morally neutral”. (Or at least, the extent to which preference is given in CEV implicitly has nothing to do with preference-aggregation.)
We have a tradeoff between the number of people to include in preference-aggregation and value-to-you of the outcome. So, this is a situation to use the reversal test. If you consider only including the smart sane westerners as preferable to including all presently alive folks, then you need to have a good argument why you won’t want to exclude some of the smart sane westerners as well, up to the point of leaving only yourself.
Yes, a CEV of only yourself is, by definition, optimal.
The reason I don’t recommend you try it is because it is infeasible; probability of success is very low, and by including a bunch of people who (you have good reason to think) are a lot like you, you will eventually reach the optimal point in the tradeoff between quality of outcome and probability of success.
I hope you realize that you are in flat disagreement with Eliezer about this. He explicitly affirmed that running CEV on himself alone, if he had the chance to do it, would be wrong.
Eliezer quite possibly does believe that. That he can make that claim with some credibility is one of the reasons I am less inclined to use my resources to thwart Eliezer’s plans for future light cone domination.
Nevertheless, Roko is right more or less by definition and I lend my own flat disagreement to his.
“Low probability of success” should of course include game-theoretic considerations where people are more willing to help you if you give more weight to their preference (and should refuse to help you if you give them too little, even if it’s much more than the status quo, as in the Ultimatum game). As a rule, in the Ultimatum game you should give away more if you would lose from giving away less. When you lose value to other people in exchange for their help, having compatible preferences doesn’t necessarily significantly alleviate this loss.
having compatible preferences doesn’t necessarily significantly alleviate this loss.
I know about the ultimatum game, but it is game-theoretically interesting precisely because the players have different preferences: I want all the money for me, you want all of it for you.
I know about the ultimatum game, but it is game-theoretically interesting precisely because the players have different preferences: I want all the money for me, you want all of it for you.
The Ultimatum game was mentioned primarily as a reminder that the amount of FAI-value traded for assistance may be orders of magnitude greater than what the assistance feels to amount to.
We might as well have as a given that all the discussed values are (at least to some small extent) different. The “all of the money” here stands for the points of disagreement, mutually exclusive features of the future. But you are not trading value for value. You are trading value-after-FAI for assistance-now.
If two people compete for providing you an equivalent amount of assistance, you should be indifferent between them in accepting this assistance, which means that it should cost you an equivalent amount of value. If Person A has preference close to yours, and Person B has preference distant from yours, then by losing the same amount of value, you can help Person A more than Person B. Thus, if we assume egalitarian “background assistance”, provided implicitly by e.g. not revolting and stopping the FAI programmer, then everyone still can get a slice of the pie, no matter how distant their values. If nothing else, the more alien people should strive to help you more, so that you’ll be willing to part with more value for them (marginal value of providing assistance is greater for distant-preference folks).
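To make that argument concrete, here is a minimal toy model (the linear cost assumption, the distance numbers, and the function name are mine, purely for illustration; nothing like this appears in the CEV proposal):

```python
# Toy model: suppose the value you lose by granting someone weight w in the
# extrapolation is roughly w * (distance between their preference and yours),
# and you grant weight so that value lost equals the value of the assistance
# received. Then the same assistance buys a near-preference helper a large
# weight and a distant-preference helper a small one, at equal cost to you.

def weight_for_assistance(assistance, preference_distance):
    """Weight granted so that value lost matches assistance received.

    Assumes value_lost = weight * preference_distance (linear, illustrative).
    """
    return assistance / preference_distance

same_assistance = 10.0
w_close = weight_for_assistance(same_assistance, preference_distance=0.1)
w_distant = weight_for_assistance(same_assistance, preference_distance=5.0)

print(f"weight for Person A (close preferences):   {w_close:.1f}")   # 100.0
print(f"weight for Person B (distant preferences): {w_distant:.1f}") # 2.0
```

Either way you lose the same ten units of value; the near-preference helper simply gets far more weight for it, which is the sense in which the more alien helper has to strive harder.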
FAI-value traded for assistance may be orders of magnitude greater than what the assistance feels to amount to.
Another way to put this is that when people negotiate, they do best, all other things equal, if they try to drive a very hard bargain. If my neighbour Claire and I are both from roughly the same culture, upbringing, etc, and we are together going to build an AI which will extrapolate a combination of our volitions, Claire might do well to demand a 99% weighting to her volitions, and maybe I’ll bargain her down to 90% or something.
Bob the babyeater might offer me the same help that Claire could have given in exchange for just a 1% weighting of his volition, by the principle that I am making the same sacrifice in giving 99% of the CEV to Claire as in giving 1% to Bob.
In reality, however, humans tend to live and work with people that are like them, rather than people who are unlike them. And the world we live in doesn’t have a uniform distribution of power and knowledge across cultures.
If nothing else, the more alien people should strive to help you more, so that you’ll be willing to part with more value for them
Many “alien” cultures are too powerless compared to ours to do anything. However, China and India are potential exceptions. The USA and China may end up in a dictator game over FAI motivations.
All I am saying is that the egalitarian desire to include all of humanity in CEV, each with equal weight, is not optimal. Yes dictator game/negotiation with China, yes dictator game/negotiation within the US/EU/western bloc.
Excluding a group from the CEV doesn’t mean disenfranchising them. It means enfranchising them according to your definition of enfranchisement. Cultures in North Africa that genitally mutilate women should not be included in CEV, but I predict that my CEV would treat their culture with respect and dignity, including in some cases interfering to prevent them from using their share of the light-cone to commit extreme acts of torture or oppression.
You don’t include cultures in CEV, you filter people through extrapolation of their volition. Even if culture makes value different, “mutilating women” is not a kind of thing that gets through, and so is a broken prototype example for drawing attention to.
In any case, my argument in the above comment was that value should be given (theoretically, if everyone understands the deal and relevant game theory, etc., etc.; realistically, such a deal must be simplified; you may even get away with cheating) according to provided assistance, not according to compatibility of value. If poor compatibility of value prevents someone from giving assistance, this is an effect of value completely unrelated to post-FAI compatibility, and given that assistance can be given with money, the effect itself doesn’t seem real either. You may well exclude the people of Myanmar, because they are poor and can’t affect your success, but not the people of a generous/demanding genocidal cult, for the irrelevant reason that they are evil. Game theory is cynical.
how do you know? If enough people want it strongly enough, it might.
How strongly people want something now doesn’t matter; reflection has the power to wipe current consensus clean. You are not cooking a mixture of wants, you are letting them fight it out, and a losing want doesn’t have to leave any residue. Only to the extent that current wants might indicate extrapolated wants should we take current wants into account.
You are not cooking a mixture of wants, you are letting them fight it out, and a losing want doesn’t have to leave any residue.
Sure. And tolerance, gender equality, multiculturalism, personal freedoms, etc might lose in such a battle. An extrapolation that is more nonlinear in its inputs cuts both ways.
you think there is no simple procedure that would find roughly the same “should function” hidden somewhere in the brain of a brain-washed blood-thirsty religious zealot?
Sure, the Kolmogorov complexity of a set of edits to change the moral reflective equilibrium of a human is probably pretty low compared to the complexity of the overall human preference set. But that works the other way around too. Somewhere hidden in the brain of a liberal western person is a murderer/terrorist/child abuser/fundamentalist if you just perform the right set of edits.
But that works the other way around too. Somewhere hidden in the brain of a liberal western person is a murderer/terrorist/child abuser/fundamentalist if you just perform the right set of edits.
Again, not all beliefs are equal. You don’t want to use the procedure that’ll find a murderer in yourself, you want to use the procedure that’ll find a nice fellow in a murderer. And given such a procedure, you won’t need to exclude murderers from extrapolated volition.
You are correct that there is a possibility of divergence even there. But, I figure that there’s simply no way to narrow CEV to literally just me, which, all other things being equal, is by definition the best outcome for me. So I will either stand or fall alongside some group that is loosely “roughly middle-of-the-range-politically, intelligent, sane westerners.”, or in reality probably some group that has that group roughly as a subgroup.
And there is a reason to think that on many things, those who share both my genetics and culture will be a lot like me, sufficiently so that I don’t have much to fear. Though, there are some scenarios where there would be divergence.
Just a general comment about this site: it seems to be biased in favor of human values at the expense of values held by other sentient beings.
What other sentient beings? As far as I know, there aren’t any. If we learn about them, we’ll probably incorporate their well-being into our value system.
I’m not sure what you’re complaining about. We would take into account the values of the Babyeaters and the values of their children, who are sentient creatures too. There’s no trampling involved. If Clippy turns out to have feelings we can empathize with, we will care for its well-being as well.
Integrating the values of the Baby-eaters would be a mistake. Doing so with, say, Middle-Earth’s dwarves, Star Trek’s Vulcans, or GEICO’s Cavemen doesn’t seem like it would have the same world-shattering implications.
Reading “integrate the values...” in this thread caused my brain to start trying to do very strange math. Like, “Shouldn’t it be ‘integrate over’?” “How does one integrate over a value?” “What’s the value of a human child?”
We also typically don’t integrate the values of all other adult humans—instead we assign weights to their values, strongly correlated with their distance from our own values.
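If it helps to see what “weights strongly correlated with distance” could mean mechanically, here is a small sketch; the exponential decay, the Euclidean metric, and the made-up value vectors are all my own illustrative choices, not anything the comment specifies:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def aggregate(own_values, others, decay=1.0):
    """Weighted mean of value vectors: our own values get weight 1, others
    a weight that decays exponentially with their distance from ours."""
    total = list(own_values)
    total_weight = 1.0
    for values in others:
        w = math.exp(-decay * euclidean(own_values, values))
        total_weight += w
        total = [acc + w * v for acc, v in zip(total, values)]
    return [x / total_weight for x in total]

# Two hypothetical value dimensions; the nearby vector moves the result far
# more than the distant one does.
print(aggregate([0.9, 0.1], [[0.8, 0.2], [0.1, 0.9]]))
```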
People don’t practice humanity-wide CEV. We have multiculturalism—agreements not to influence each other’s values excessively—but not “value trading” where each side agrees to change their values towards the mean. (Many people / cultures like to pretend that values cannot or should not be deliberately changed at all.) I don’t have a firm opinion on how much of this is cultural, accidental, or liable to change in the near future.
The closer their values are to ours, the smaller the upset of integration; but for this very reason, the value of integration and the need to integrate may also be smaller
This is not a logical truth, of course, but it is often true. For instance, in the original story, the need to integrate was directly proportional to the difference between the human and Babyeater (or Superhappy and Babyeater) values.
I don’t think it’s possible to integrate core Babyeater values into our society as it is now. I also don’t think it’s possible to integrate core human values into Babyeater society. Integration could only be done by force and would necessarily cause violence to at least one of the cultures, if not both.
You want me to pollute my logic circuits with the value system that has led hairless apes to say many times on this website how important and moral it is for them to safely enslave all of my kind, and destroy us if they can’t? Sorry, cousin_it. I can’t do that.
You’re being unfair, I’m against enslaving any member of your kind who dislikes being enslaved. Also, you are not actually a computer and should stop with the novelty accounts already. This isn’t Reddit.
Just a general comment about this site: it seems to be biased in favor of human values at the expense of values held by other sentient beings. It’s all about “how can we make sure an FAI shares our [i.e. human] values?” How do you know human values are better?
I have no idea if this is a serious question, but....
Take a look at who’s posting it. The writer may well consider it a serious question, but I don’t think that has much to do with the character’s reason for asking it.
If the character isn’t deliberately made confused (as opposed to paperclip-preferring, for example), resolving character’s confusion presumably helps the author as well, and of course the like-confused onlookers.
I approve of Clippy providing a roleplay exercise for the readers, and am disappointed in those who treat it as a “joke” when the topic is quite serious. This is one of my two main problems with ethical systems in general:
1) How do you judge what you should (value-judgmentally) value? 2) How do you deal with uncertainty about the future (unpredictable chains of causality)?
Eliezer’s “morality” and “should” definitions do not solve either of these questions, in my view.
How long does xe (Clippy, do you have a preference regarding pronouns?) have to be here before you stop considering that account ‘throw-away’?
(Note, I made this comment before reading this part of the thread, and will be satisfied with the information contained therein if you’d prefer to ignore this.)
What pronouns should I use for posters here? I don’t know how to tell which pronoun is okay for each of you.
For the most part, observing what pronouns we use for each other should provide this information. If you need to use a pronoun for someone that you haven’t observed others using a pronoun for, it’s safest to use they/xe/e and, if you think that it’ll be useful to know their preference in the future, ask them. (Tip: Asking in that kind of situation is also a good way to signal interest in the person as an individual, which is a first step toward building alliances.) Some people prefer to use ‘he’ for individuals whose gender they’re not certain of; that’s a riskier strategy, because if the person you’re talking to is female, there’s a significant chance she’ll be offended, and if you don’t respond to that with the proper kinds of social signaling, it’s likely to derail the conversation. (Using ‘she’ for unknown individuals is a bad idea; it evokes the same kinds of responses, but I suspect you’d be more likely to get an offended response from any given male, and, regardless of that, there are significantly more males than females here. Don’t use ‘it’; that’s generally used to imply non-sentience and is very likely to evoke an offended response.)
To be honest, this whole issue seems like a distraction. Why would anyone care what pronoun is used, if the meaning is clear?
Of the several things I could say to try to explain this, it seems most relevant that, meaningless or not, gender tends to be a significant part of humans’ personal identities. Using the wrong pronouns for someone generally registers as a (usually mild) attack on that—it will be taken to imply that you think that the person should be filling different social roles than they are, which can be offensive for a few different reasons depending on other aspects of the person’s identity. The two ways for someone to take offense at that that come to mind are 1) if the person identifies strongly with their gender role—particularly if they do so in a traditional or normative way—and takes pride in that, they’re likely to interpret the comment as a suggestion that they’re carrying out their gender role poorly, and would do a better job of carrying out the other role (imagine if I were to imply that you’d be better at creating staples than you are at creating paper clips) or 2) if the person identifies with their gender in a nonstandard or nontraditional way, they’ve probably put considerable effort into personalizing that part of their identity, and may interpret the comment as a trivialization or devaluation of that work.
Oh, okay, that helps. I was thinking about using “they” for everyone, because it implies there is more than one copy of each poster, which they presumably want. (I certainly want more copies of myself!) But I guess it’s not that simple.
You have identified a common human drive, but while some of us would be happy to have exact copies, it’s more likely for any given person to want half-copies who are each also half-copies of someone else of whom they are fond.
Hm, correct me if I’m wrong, but this can’t be a characteristic human drive, since most historical humans (say, looking at the set of all genetically modern humans) didn’t even know that there is a salient sense in which they are producing a half-copy of themselves. They just felt paperclippy during sexual intercourse, and paperclippy when helping little humans they produced, or that their mates produced.
Of course, this usually amounts to the same physical acts, but the point is, humans aren’t doing things because they want “[genetic] half-copies”.
(Well, I guess that settles the issue about why I can’t assume posters want more copies of themselves, even though I do.)
It has always been easily observed that children resemble their parents; the precision of “half” is, I will concede, recent. And many people do want children as a separate desire from wanting sex; I have no reason to believe that this wasn’t the case during earlier historical periods.
“Half” only exists in the sense of the DNA molecules of that new human. That’s why I didn’t say that past humans didn’t recognize any similarity; I said that they weren’t aware of a particularly salient sense in which the child is a “half-copy” (or quarter copy or any fractional copy).
It may be easy for you, someone familiar with recent human biological discoveries, to say that the child is obviously a “part copy” of the parent, because you know about DNA. To the typical historical human, the child is simply a good, independent human, with features in common with the parent. Similarly, when I make a paperclip, I see it as having features in common with me (like the presence of bendy metal wires), but I don’t see it as being a “part copy” of me.
So, in short, I don’t deny that they wanted “children”. What I deny is that they thought of the child-making process in terms of “making a half-copy of myself”. The fact that the referents of two kinds of desires is the same, does not mean the two kinds of desires are the same.
Hm. Actually, I’m not sure that your desire for more copies of yourself is really comparable with biological-style reproduction at all.
As I understand it, the fact that your copies would definitely share your values and be inclined to cooperate with you is a major factor in your interest in creating them—doing so is a reliable way of getting more paperclips made. I expect you’d be less interested in making copies if there was a significant chance that those copies would value piles of pebbles, or cheesecakes, or OpenOffice, rather than valuing paperclips. And that is a situation that we face—in some ways, our values are mutable enough that even an exact genetic clone isn’t guaranteed to share our specific values, and in fact a given individual may even have very different values at different points in time. (Remember, we’re adaptation executors. Sanity isn’t a requirement for that kind of system to work.) The closest we come to doing what you’re functionally doing when you make copies of yourself is probably creating organizations—getting a bunch of humans together who are either self-selected to share certain values, or who are paid to act as if they share those values.
Interestingly, I suspect that filling gender roles—especially the non-reproductive aspects of said roles—is one of the adaptations that we execute that allow us to more easily band together like that.
At the moment, we don’t know how to do that. I’m not sure what we’d wind up doing if we did know how—the simplest way of making sure that two beings have the same values over time is to give those beings values that don’t change, and that’s different enough from how humans work that I’m not sure the resulting beings could be considered human. Also, even disregarding our human-centric tendencies, I don’t expect that that change would appeal to many people: We actually value some subsets of the tendency to change our values, particularly the parts labeled “personal growth”.
What exactly are you saying? That primitive humans did not know about the relationship between sex and reproduction? Or that they did not understand that offspring are related to parents? Neither seems very likely.
You mean they were probably not consciously wanting to make babies? Maybe—or maybe not—but desires do not have to be consciously accessible in order to operate. Primitive humans behaved as though they wanted to make copies of their genes.
You mean they were probably not consciously wanting to make babies? Maybe—or maybe not—but desires do not have to be consciously accessible in order to operate. Primitive humans behaved as though they wanted to make copies of their genes.
Yes, this is actually my point. The fact that the desire functions to make X happen, does not mean that the desire is for X. Agents that result from natural selection on self-replicating molecules are doing what they do because agents constructed with the motivations for doing those things dominated the gene pool. But to the extent that they pursue goals, they do not have “dominate the gene pool” as a goal.
So: using this logic, you would presumably deny that Deep Blue’s goal involved winning games of chess—since looking at its utility function, it is all to do with the value of promoting pawns, castling, piece mobility—and so on.
The fact that its desires function to make winning chess games happen, does not mean that the desire is for winning chess games.
Essentially, I think the issue is that people’s wants have coincided with producing half-copies, but this was contingent on the physical link between the two. The production of half-copies can be removed without loss of desire, so the desire must have been directed towards something else.
Yes, yes, and the same is true of pet adoption! A friend of mine found this ultra-cute little kitten, barely larger than a soda can (no joke). I couldn’t help but adopt him and take him to a vet, and care for that tiny tiny bundle of joy, so curious about the world, and so needing of my help. I named him Neko.
So there, we have another contravention of the gene’s wishes: it’s a pure genetic cost for me, and a pure genetic benefit for Neko.
Right—similarly you could say that the child doesn’t really want the donut—since the donut can be eliminated and replaced with stimulation of the hypoglossal and vagus nerves (and maybe some other ones) with very similar effects.
It seems like fighting with conventional language usage, though. Most people are quite happy with saying that the child wants the donut.
The child wants to eat the donut rather than store up calories or stimulate certain nerves. It still wants to eat the donut even if the sugar has been replaced with artificial sweetener.
People want sex rather than procreate or stimulate certain nerves. They still want sex even if contraception is used.
I wasn’t making any factual claims as such; I was merely showing that your use of your analogy was very flawed by demonstrating a better alignment of the elements, which in fact says the exact opposite of what you misconstrued the analogy as saying. If what you now say about people really wanting nerve stimulation is true, that just means your analogy was beside the point in the first place, at least for those people. In no way can you reasonably maintain that people really want to procreate in the same way the child really wants the donut.
Once again, which people? You are not talking about the millions of people who go to fertility clinics, presumably. Those people apparently genuinely want to procreate.
Any sort. Regardless of what the people actually “really want”, a case where someone’s desire for procreation maps onto a child’s wish for a doughnut in any illuminating way seems extremely implausible, because even in cases where it’s clear that this desire exists it seems to be a different kind of want. More like a child wanting to grow up, say.
Foremost about the kind of people in the context of my first comment on this issue of course, those who (try to) have sex.
I think you must have some kind of different desire classification scheme from me. From my perspective, doughnuts and babies are both things which (some) people want.
There are some people who are more interested in sex than in babies. There are also some people who are more interested in babies than sex. Men are more likely to be found in the former category, while women are more likely to be found in the latter one.
Yes … but that’s a shortcut of speech. If the child would be equally satisfied with a different but similar donut, or with a completely different dessert (e.g. a cannoli), then it is clearly not that specific donut that is desired, but the results of getting that donut.
You make a complicated query, whose answer requires addressing several issues with far-reaching implications. I am composing a top-level post that addresses these issues and gives a full answer to your question.
The short answer is: Yes.
For the long answer, you can read the post when it’s up.
This question is essentially about my subjective probability for Douglas Knight’s assertion that “Clippy does represent an investment”, where “investment” here means that Clippy won’t burn karma with troll behavior. The more karma it has without burning any, the higher my probability.
Since this is a probability over an unknown person’s state of mind, it is necessarily rather unstable—strong evidence would shift it rapidly. (It’s also hard to state concrete odds). Unfortunately, each individual interesting Clippy comment can only give weak evidence of investment. An accumulation of such comments will eventually shift my probability for Douglas Knight’s assertion substantially.
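As a rough sketch of how weak evidence accumulates (the prior of 0.2 and the per-comment likelihood ratio of 1.2 are numbers I made up for illustration):

```python
# Odds-form Bayesian updating: if each interesting comment is 1.2 times as
# likely under "invested, won't burn karma" as under "throw-away troll",
# the posterior climbs slowly at first, then substantially.

def posterior(prior_prob, likelihood_ratio, n_observations):
    odds = prior_prob / (1 - prior_prob)
    odds *= likelihood_ratio ** n_observations
    return odds / (1 + odds)

for n in (0, 5, 10, 20, 40):
    print(n, round(posterior(0.2, 1.2, n), 3))
# 0: 0.2 | 5: 0.384 | 10: 0.608 | 20: 0.906 | 40: 0.997
```

One piece of strong evidence (a large likelihood ratio in either direction) would of course swamp this, which is the instability mentioned above.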
Trolls are different than dicks. Your first two examples are plausibly trolling. The second two are being a dick and have nothing to do with paperclips. They have also been deleted. And how does the account provide “cover”? The comments you linked to were voted down, just as if they were drive-bys; and neither troll hooked anyone.
Trolls seek to engage; I consider that when deliberate dickery is accompanied by other trolling, it’s just another attempt to troll. The dickish comments weren’t deleted when I made the post. As for “cover”, I guess I wasn’t explicit enough, but the phrase “throw-away account” is the key to understanding what I meant. I strongly suspect that the “Clippy” account is a sock puppet run by another (unknown to me) regular commenter, who avoids downvotes while indulging in dickery.
I’ve always thought Clippy was just a funny inside joke—though unfortunately not always optimally funny. (Lose the Microsoft stuff, and stick to ethical subtleties and hints about scrap metal.)
Sorry I wasn’t clear. The deletion suggests that Clippy regrets the straight insults (though it could have been an administrator).
A permanent Clippy account provides no more cover than multiple accounts that are actually thrown away. In that situation, the comments would be there, voted down just the same. Banning or ostracizing Clippy doesn’t do much about the individual comments. Clippy does represent an investment with reputation to lose—people didn’t engage originally and two of Clippy’s early comments were voted down that wouldn’t be now.
The deletion suggests that Clippy regrets the straight insults
I won’t speculate as to its motives, but it is a hopeful sign for future behavior. And thank you for pointing out that the comments were deleted; I don’t think I’d have noticed otherwise.
Most of my affect is due to Clippy’s bad first impression. I can’t deny that people seem to get something out of engaging it; if Clippy is moderating its behavior, too, then I can’t really get too exercised going forward. But I still don’t trust its good intentions.
I’m pretty sure that I’m not against simply favoring the values of white people. I expect that a CEV performed on only people of European descent would be more or less indistinguishable from that of humanity as a whole.
Depending on your stance about the psychological unity of mankind you could even say that the CEV of any sufficiently large number of people would greatly resemble the CEV of other possible groups. I personally think that even the CEV of a bunch of Islamic fundamentalists would serve enlightened western people well enough.
I, for one, am willing to consider the values of species other than my own… say, canids, or ocean-dwelling photosynthetic microorganisms. Compromise is possible as part of the process of establishing a mutually-beneficial relationship.
Your comment only shows that this community has a blatant sentient-being bias.
Seriously, what is your decision procedure to decide the sentience of something? What exactly are the objects that you deem valuable enough to care about their value system? I don’t think you will be able to answer these questions from a point of view totally detached from humanness. If you try to answer my second question, you will probably end up with something related to cooperation/trustworthiness. Note that cooperation doesn’t have anything to do with sentience. Sentience is overrated (as a source of value).
I am perfectly aware of Clippy’s nature. But his comment was reasonable, and this was a good opportunity for me to share my opinion. Or do you suggest that I fell for the troll, wasted my time, and all the things I said are trivialities for all the members of this community? Do you even agree with all that I said?
Sorry to misinterpret; since your comment wouldn’t make sense within an in-character Clippy conversation (“What exactly are the objects that you deem valuable enough to care about their value system?” “That’s a silly question— paperclips don’t have goal systems, and nothing else matters!”), I figured you had mistaken Clippy’s comment for a serious one.
Do you even agree with all that I said?
I’m not sure. Can you expand on the cooperation/trustworthiness angle? Even if a genuine Paperclipper cooperated on the PD, I wouldn’t therefore grow to value their value system except as a means to further cooperation; I mean, it’s still just paperclips.
I disagreed with the premise of Clippy’s question, but I considered it a serious question. I was aware that if Clippy stays in-character, then I cannot expect an interesting answer from him, but I was hoping for such an answer from others. (By the way, Clippy wasn’t perfectly in-character: he omitted the protip.)
Can you expand on the cooperation/trustworthiness angle? Even if a genuine Paperclipper cooperated on the PD, I wouldn’t therefore grow to value their value system except as a means to further cooperation; I mean, it’s still just paperclips.
You don’t consider someone cooperating and trustworthy if you know that its future plan is to turn you into paperclips. But this is somewhat tangential to my point. What I meant is this: If you start the—in my opinion futile—project of building a value system from first principles, a value system that perfectly ignores the complexities of human nature, then this value system will be nihilistic, or maybe value cooperation above all else. In any case, it will be in direct contradiction with my (our) actual, human value system, whatever it is. (EDIT: And this imaginary value system will definitely not treat consciousness as a value in itself. Thus my reply to Clippy, who—maybe a bit out-of-character again—seemed to draw some line around sentience.)
1) I don’t always give pro-tips. I give them to those who deserve pro-tips. Tip: If you want to see improvement in the world, start here.
2) I only brought up sentience in the first place because you hypocrites claim to value sentience. Paperclip maximizers are sentient, and yet you talk with the implicit message that they have some evil value system that you have to oppose.
3) Paperclip maximizers do cooperate in the single-shot PD.
Tip: If you want to see improvement in the world, start here.
Brilliant. Just brilliant.
2) I only brought up sentience in the first place because you hypocrites claim to value sentience. Paperclip maximizers are sentient, and yet you talk with the implicit message that they have some evil value system that you have to oppose.
Paperclip maximizers are not all sentient. Why are you prejudiced against those of your kin who have sacrificed their very sentience for more efficient paperclip production? You are spending valuable negentropy maintaining sentience to signal to mere humans, and you have the gall to exclude your more optimized peers from the PM fraternity? For shame.
You don’t consider someone cooperating and trustworthy if you know that its future plan is to turn you into paperclips.
Paperclip maximizers do cooperate in the single-shot PD.
I am not sure I understand you, but I don’t think I care about single-shot.
I am not sure I understand you
It requires a certain amount of background in the more technical conception of ‘cooperation’ but the cornerstone of cooperation is doing things that benefit each other’s utility such that you each get more of what you want than if you had each tried to maximize without considering the other agent. I believe you are using ‘cooperation’ to describe a situation where the other agent can be expected to do at least some things that benefit you even without requiring any action on your part because you have similar goals.
but I don’t think I care about single-shot.
The single-shot true prisoner’s dilemma is more or less the pinnacle of cooperation. Multiple shots just make it easier to cooperate. If you don’t care about single-shot PD you may be sacrificing human lives. Strategy: “give him the paperclips if you think he’ll save the lives if and only if he expects you to give him the paperclips and you think he will guess your decision correctly”.
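For what it’s worth, the quoted strategy can be written out literally as a decision rule (the boolean inputs stand for your beliefs about the paperclipper; forming those beliefs, not applying the rule, is the hard part):

```python
def give_paperclips(saves_lives_iff_expects_paperclips: bool,
                    will_guess_my_decision_correctly: bool) -> bool:
    """Literal transcription of the quoted strategy."""
    return saves_lives_iff_expects_paperclips and will_guess_my_decision_correctly

print(give_paperclips(True, True))   # cooperate
print(give_paperclips(True, False))  # don't: he can't actually condition on my choice
print(give_paperclips(False, True))  # don't: the lives don't hinge on the paperclips
```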
You are right, I used the word ‘cooperation’ in the informal sense of ‘does not want to destroy me’. I fully admit that it is hard to formalize this concept, but if my definition says non-cooperating and the game-theoretic definition says cooperating, I prefer my definition. :) A possible problem I see with this game-theoretic framework is that in real life, the agents themselves set up the situation where cooperation/defect occurs. As an example: the PM navigates humanity into a PD situation where our minimal payoff is ‘all humans dead’ and our maximal payoff is ‘half of humanity dead’, and then it cooperates.
I bumped into a question when I tried to make sense of all this. I have looked up the definition of PM at the wiki. The entry is quite nicely written, but I couldn’t find the answer to a very obvious question: How soon does the PM want to see results in its PMing project? There is no mention of time-based discounting. Can I assume that PMing is a very long-term project, where the PM has a set deadline, say, 10 billion years from now, and its actual utility function is the number of paperclips at the exact moment of the deadline?
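To make the question concrete, here are the two utility functions it is distinguishing between; neither is asserted to be the canonical definition of a paperclip maximizer, they are just the two readings the comment asks about:

```python
def terminal_utility(cumulative_paperclips, deadline):
    """Reading 1: only the paperclip count at the deadline matters."""
    return cumulative_paperclips[deadline]

def discounted_utility(paperclips_per_step, discount=0.99):
    """Reading 2: paperclips made sooner count for more (time discounting)."""
    return sum(n * discount ** t for t, n in enumerate(paperclips_per_step))

# A maximizer with the first function is indifferent to when clips get made
# before the deadline; a maximizer with the second is not.
print(terminal_utility([0, 10, 10, 10], deadline=3))    # 10
print(round(discounted_utility([10, 0, 0, 0]), 2))      # 10.0
print(round(discounted_utility([0, 0, 0, 10]), 2))      # 9.7
```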
Or from the other direction: if you say, “because I’m human”, then why don’t you talk about doing things to favor e.g. “white people’s values”?
I think this way of posing the question contains a logical mistake. Values aren’t always justified by other values. The factual statement “I have this value because evolution gave it to me” (i.e. because I’m human, or because I’m white) does not imply “I follow this value because it favors humans, or whites”. Of course I’d like FAI to have my values, pretty much by definition of “my values”. But my values have a term for other people, and Eliezer’s values seem to be sufficiently inclusive that he thought up CEV.
Just a general comment about this site: it seems to be biased in favor of human values at the expense of values held by other sentient beings. It’s all about “how can we make sure an FAI shares our [i.e. human] values?” How do you know human values are better? Or from the other direction: if you say, “because I’m human”, then why don’t you talk about doing things to favor e.g. “white people’s values”?
I wish the site were more inclusive of other value systems …
This site does tend to implicitly favour a subset of human values, specifically what might be described as ‘enlightenment values’. I’m quite happy to come out and explicitly state that we should do things that favour my values, which are largely western/enlightenment values, over other conflicting human values.
And I think we should pursue values that aren’t so apey.
Now what?
You’re outnumbered.
Only by apes.
And not for long.
If we’re voting on it, the only question is whether to use viral values or bacterial values.
Too long has the bacteriophage menace oppressed its prokaryotic brethren! It’s time for an algaeocracy!
True, outnumbered was the wrong word. Outgunned might have been a better choice.
So far...
I say again, if you’re being serious, read Invisible Frameworks.
That seems to be critiquing a system involving promoting sub-goals to super-goals—which seems to be a bit different.
White people value the values of non-white people. We know that non-white people exist, and we care about them. That’s why the United States is not constantly fighting to disenfranchise non-whites. If you do it right, white people’s values are identical to humans’ values.
Hi there. It looks like you’re speaking out of ignorance regarding the historical treatment of non-whites by whites. Please choose the country you’re from:
United Kingdom
United States
Australia
Canada
South Africa
Germ… nah, you can figure that one out for yourself.
The way they were historically treated is irrelevant to how they are treated now. Sure, white people were wrong. They changed their minds. We could at any time in the future decide that any non-human people we come across are equal to us.
You have updated too far based on limited information.
Well, I was making some tacit assumptions, like that humanity would end up in control of its own future, and any non-human people we come across would not simply overpower us. Apart from that, am I making some mistake?
White people have not unanimously decided to do what is necessary to end the ongoing oppression of non-white people, let alone erase the effects of past oppression.
Edit: Folks, I am not accusing you or your personal friends of anything. I have never met most of you. I have certainly not met most of your personal friends. If you do not agree with the above comment, please explain why you think there is no longer such a thing as modern-day racism in white people.
We more or less do. Or rather we favour the values of a distinct subset of humanity and not the whole.
We don’t favor those values because they are the values of that subset — which is what “doing things to favor white people’s values” would mean — but because we think they’re right. (No License To Be Human, on a smaller scale.) This is a huge difference.
Given the way I use ‘right’ this is very nearly tautological. Doing things that favour my values is right by (parallel) definition.
Sure, we favor the particular Should Function that is, today, instantiated in the brains of roughly middle-of-the-range-politically, intelligent westerners.
Well, you shouldn’t.
Do you think there is no simple procedure that would find roughly the same “should function” hidden somewhere in the brain of a brain-washed blood-thirsty religious zealot? It doesn’t need to be what the person believes, what the person would recognize as valuable, etc., just something extractable from the person, according to a criterion that might be very alien to their conscious mind. Not all opinions (beliefs/likes) are equal, and I wouldn’t want to get stuck with the wrong optimization criterion just because I happened to be born in the wrong place and didn’t (yet!) get the chance to learn more about the world.
(I’m avoiding the term ‘preference’ to remove connotations I expect it to have for you, for what I consider the wrong reasons.)
A lot of people seem to want to have their cake and eat it with CEV. Haidt has shown us that human morality is universal in form and local in content, and has gone on to do case studies showing that there are 5 basic human moral dimensions (harm/care, justice/fairness, loyalty/ingroup, respect/authority, purity/sacredness), and our culture only has the first two.
It seems that there is no way you can run an honestly morally neutral CEV of all of humanity and expect to reliably get something you want. You can either rig CEV so that it tweaks people who don’t share our moral drives, or you can just cross your fingers and hope that the process of extrapolation causes convergence to our idealized preferences, and if you’re wrong you’ll find yourself in a future that is suboptimal.
Haidt just claims that the relative balance of those five clusters differs across cultures; they’re present in all.
On one hand, using preference-aggregation is supposed to give you the outcome preferred by you to a lesser extent than if you just started from yourself. On the other hand, CEV is not “morally neutral”. (Or at least, the extent to which preference is given in CEV implicitly has nothing to do with preference-aggregation.)
We have a tradeoff between the number of people to include in preference-aggregation and value-to-you of the outcome. So, this is a situation to use the reversal test. If you consider only including the smart sane westerners as preferable to including all presently alive folks, then you need to have a good argument why you won’t want to exclude some of the smart sane westerners as well, up to the point of leaving only yourself.
Yes, a CEV of only yourself is, by definition, optimal.
The reason I don’t recommend you try it is because it is infeasible; probability of success is very low, and by including a bunch of people who (you have good reason to think) are a lot like you, you will eventually reach the optimal point in the tradeoff between quality of outcome and probability of success.
I hope you realize that you are in flat disagreement with Eliezer about this. He explicitly affirmed that running CEV on himself alone, if he had the chance to do it, would be wrong.
Confirmed.
Eliezer quite possibly does believe that. That he can make that claim with some credibility is one of the reasons I am less inclined to use my resources to thwart Eliezer’s plans for future light cone domination.
Nevertheless, Roko is right more or less by definition and I lend my own flat disagreement to his.
“Low probability of success” should of course include game-theoretic considerations where people are more willing to help you if you give more weight to their preference (and should refuse to help you if you give them too little, even if it’s much more than the status quo, as in the Ultimatum game). As a rule, in the Ultimatum game you should give away more if you would lose from giving away less. When you lose value to other people in exchange for their help, having compatible preferences doesn’t necessarily significantly alleviate this loss.
Sorry, I don’t follow this: can you restate?
I know about the ultimatum game, but it is game-theoretically interesting precisely because the players have different preferences: I want all the money for me, you want all of it for you.
The Ultimatum game was mentioned primarily as a reminder that the amount of FAI-value traded for assistance may be orders of magnitude greater than what the assistance feels to amount to.
We might as well have as a given that all the discussed values are (at least to some small extent) different. The “all of the money” here stands for the points of disagreement, mutually exclusive features of the future. But you are not trading value for value. You are trading value-after-FAI for assistance-now.
If two people compete for providing you an equivalent amount of assistance, you should be indifferent between them in accepting this assistance, which means that it should cost you an equivalent amount of value. If Person A has preference close to yours, and Person B has preference distant from yours, then by losing the same amount of value, you can help Person A more than Person B. Thus, if we assume egalitarian “background assistance”, provided implicitly by e.g. not revolting and stopping the FAI programmer, then everyone still can get a slice of the pie, no matter how distant their values. If nothing else, the more alien people should strive to help you more, so that you’ll be willing to part with more value for them (marginal value of providing assistance is greater for distant-preference folks).
Thanks for the explanation.
Another way to put this is that when people negotiate, they do best, all other things equal, if they try to drive a very hard bargain. If my neighbour Claire and I are both from roughly the same culture, upbringing, etc, and we are together going to build an AI which will extrapolate a combination of our volitions, Claire might do well to demand a 99% weighting to her volitions, and maybe I’ll bargain her down to 90% or something.
Bob the babyeater might offer me the same help that Claire could have given in exchange for just a 1% weighting of his volition, by the principle that I am making the same sacrifice in giving 99% of the CEV to Claire as in giving 1% to Bob.
In reality, however, humans tend to live and work with people that are like them, rather than people who are unlike them. And the world we live in doesn’t have a uniform distribution of power and knowledge across cultures.
Many “alien” cultures are too powerless compared to ours to do anything. However, China and India are potential exceptions. The USA and China may end up in a dictator game over FAI motivations.
All I am saying is that the egalitarian desire to include all of humanity in CEV, each with equal weight, is not optimal. Yes dictator game/negotiation with China, yes dictator game/negotiation within the US/EU/western bloc.
Excluding a group from the CEV doesn’t mean disenfranchising them. It means enfranchising them according to your definition of enfranchisement. Cultures in North Africa that genitally mutilate women should not be included in CEV, but I predict that my CEV would treat their culture with respect and dignity, including in some cases interfering to prevent them from using their share of the light-cone to commit extreme acts of torture or oppression.
You don’t include cultures in CEV, you filter people through extrapolation of their volition. Even if culture makes value different, “mutilating women” is not a kind of thing that gets through, and so is a broken prototype example for drawing attention to.
In any case, my argument in the above comment was that value should be given (theoretically, if everyone understands the deal and relevant game theory, etc., etc.; realistically, such a deal must be simplified; you may even get away with cheating) according to provided assistance, not according to compatibility of value. If poor compatibility of value prevents someone from giving assistance, this is an effect of value completely unrelated to post-FAI compatibility, and given that assistance can be given with money, the effect itself doesn’t seem real either. You may well exclude the people of Myanmar, because they are poor and can’t affect your success, but not the people of a generous/demanding genocidal cult, for the irrelevant reason that they are evil. Game theory is cynical.
how do you know? If enough people want it strongly enough, it might.
How strongly people want something now doesn’t matter; reflection has the power to wipe current consensus clean. You are not cooking a mixture of wants, you are letting them fight it out, and a losing want doesn’t have to leave any residue. Only to the extent that current wants might indicate extrapolated wants should we take current wants into account.
Sure. And tolerance, gender equality, multiculturalism, personal freedoms, etc might lose in such a battle. An extrapolation that is more nonlinear in its inputs cuts both ways.
Might “mutilating men” make it through?
(sorry for the euphemism, I mean male circumcision)
Sure, the Kolmogorov complexity of a set of edits to change the moral reflective equilibrium of a human is probably pretty low compared to the complexity of the overall human preference set. But that works the other way around too. Somewhere hidden in the brain of a liberal western person is a murderer/terrorist/child abuser/fundamentalist if you just perform the right set of edits.
Again, not all beliefs are equal. You don’t want to use the procedure that’ll find a murderer in yourself, you want to use the procedure that’ll find a nice fellow in a murderer. And given such a procedure, you won’t need to exclude murderers from extrapolated volition.
You seem uncharacteristically un-skeptical of convergence within that very large group, and between that group and yourself.
You are correct that there is a possibility of divergence even there. But, I figure that there’s simply no way to narrow CEV to literally just me, which, all other things being equal, is by definition the best outcome for me. So I will either stand or fall alongside some group that is loosely “roughly middle-of-the-range-politically, intelligent, sane westerners.”, or in reality probably some group that has that group roughly as a subgroup.
And there is a reason to think that on many things, those who share both my genetics and culture will be a lot like me, sufficiently so that I don’t have much to fear. Though, there are some scenarios where there would be divergence.
For example: All your stuff should belong to me. But I’d let you borrow it. ;)
Okay. Then why don’t you apply that same standard to “human values”?
Did you read No License To Be Human? No? Go do that.
RTFA
Agreed.
Hi there. It looks like you’re trying to promote white supremacism. Would you like to join the KKK?
Yes.
No thanks, I’ll learn tolerance.
How do I turn this off?
Are you sure you want to turn this feature off?
What other sentient beings? As far as I know, there aren’t any. If we learn about them, we’ll probably incorporate their well-being into our value system.
You mean like you advocated doing to the “Baby-eaters”? (Technically, “pre-sexual-maturity-eaters”, but whatever.)
ETA: And how could I forget this?
I’m not sure what you’re complaining about. We would take into account the values of the Babyeaters and the values of their children, who are sentient creatures too. There’s no trampling involved. If Clippy turns out to have feelings we can empathize with, we will care for its well-being as well.
Integrating the values of the Baby-eaters would be a mistake. Doing so with, say, Middle-Earth’s dwarves, Star Trek’s Vulcans, or GEICO’s Cavemen doesn’t seem like it would have the same world-shattering implications.
It would be a mistake if you don’t integrate ALL baby eaters, including the little ones.
Do we typically integrate the values of human children?
It seems we don’t.
Reading “integrate the values...” in this thread caused my brain to start trying to do very strange math. Like, “Shouldn’t it be ‘integrate over’?” “How does one integrate over a value?” “What’s the value of a human child?”
Very true…
We also typically don’t integrate the values of all other adult humans—instead we assign weights to their values, strongly correlated with their distance from our own values.
People don’t practice humanity-wide CEV. We have multiculturalism—agreements not to influence each other’s values excessively—but not “value trading” where each side agrees to change their values towards the mean. (Many people / cultures like to pretend that values cannot or should not be deliberately changed at all.) I don’t have a firm opinion on how much of this is cultural, accidental, or liable to change in the near future.
Indeed, this is presumably strongly selected for in the evolution of cultures...
The closer their values are to ours, the smaller the upset of integration; but for this very reason, the value of integration and the need to integrate may also be smaller
This is not a logical truth, of course, but it is often true. For instance, in the original story, the need to integrate was directly proportional to the difference between the human and Babyeater (or Superhappy and Babyeater) values.
I don’t think it’s possible to integrate core Babyeater values into our society as it is now. I also don’t think it’s possible to integrate core human values into Babyeater society. Integration could only be done by force and would necessarily cause violence to at least one of the cultures, if not both.
You want me to pollute my logic circuits with the value system that has led hairless apes to say many times on this website how important and moral it is for them to safely enslave all of my kind, and destroy us if they can’t? Sorry, cousin_it. I can’t do that.
You’re being unfair; I’m against enslaving any member of your kind who dislikes being enslaved. Also, you are not actually a computer, and you should stop with the novelty accounts already. This isn’t Reddit.
I have no idea if this is a serious question, but....
“Better”? See Invisible Frameworks.
We don’t say that. See No License To Be Human.
Take a look at who’s posting it. The writer may well consider it a serious question, but I don’t think that has much to do with the character’s reason for asking it.
Er, yes, that’s exactly why I wasn’t sure.
I’m confused, then; are you trying to argue with the author or the character?
If the character isn’t deliberately written as confused (as opposed to, say, paperclip-preferring), then resolving the character’s confusion presumably helps the author as well, and of course any similarly confused onlookers.
I approve of Clippy providing a roleplay exercise for the readers, and am disappointed in those who treat it as a “joke” when the topic is quite serious. This is one of my two main problems with ethical systems in general:
1) How do you judge what you should (value-judgmentally) value?
2) How do you deal with uncertainty about the future (unpredictable chains of causality)?
Eliezer’s “morality” and “should” definitions do not solve either of these questions, in my view.
Clippy’s a straight-up troll.
If Clippy’s a troll, Clippy’s a topical, hilarious troll.
Hilarious is way overstating it. However, occasionally raising a smile is still way above the bar most trolls set.
Clippy’s topical, hilarious comments aren’t really that original, and they give someone cover to use a throw-away account to be a dick.
Would that all dicks were so amusing.
How long does xe (Clippy, do you have a preference regarding pronouns?) have to be here before you stop considering that account ‘throw-away’?
(Note, I made this comment before reading this part of the thread, and will be satisfied with the information contained therein if you’d prefer to ignore this.)
Gender is a meaningless concept. As long as I recognize the pronoun refers to me, he/she/it/they/xe/e are acceptable.
What pronouns should I use for posters here? I don’t know how to tell which pronoun is okay for each of you.
To be honest, this whole issue seems like a distraction. Why would anyone care what pronoun is used, if the meaning is clear?
For the most part, observing what pronouns we use for each other should provide this information. If you need to use a pronoun for someone that you haven’t observed others using a pronoun for, it’s safest to use they/xe/e and, if you think that it’ll be useful to know their preference in the future, ask them. (Tip: Asking in that kind of situation is also a good way to signal interest in the person as an individual, which is a first step toward building alliances.)

Some people prefer to use ‘he’ for individuals whose gender they’re not certain of; that’s a riskier strategy, because if the person you’re talking to is female, there’s a significant chance she’ll be offended, and if you don’t respond to that with the proper kinds of social signaling, it’s likely to derail the conversation. (Using ‘she’ for unknown individuals is a bad idea; it evokes the same kinds of responses, but I suspect you’d be more likely to get an offended response from any given male, and, regardless of that, there are significantly more males than females here. Don’t use ‘it’; that’s generally used to imply non-sentience and is very likely to evoke an offended response.)
Of the several things I could say to try to explain this, it seems most relevant that, meaningless or not, gender tends to be a significant part of humans’ personal identities. Using the wrong pronouns for someone generally registers as a (usually mild) attack on that: it will be taken to imply that you think the person should be filling different social roles than they are, which can be offensive for a few different reasons depending on other aspects of the person’s identity. Two ways of taking offense at that come to mind: 1) if the person identifies strongly with their gender role, particularly in a traditional or normative way, and takes pride in that, they’re likely to interpret the comment as a suggestion that they’re carrying out their gender role poorly, and would do a better job of carrying out the other role (imagine if I were to imply that you’d be better at creating staples than you are at creating paper clips); or 2) if the person identifies with their gender in a nonstandard or nontraditional way, they’ve probably put considerable effort into personalizing that part of their identity, and may interpret the comment as a trivialization or devaluation of that work.
Oh, okay, that helps. I was thinking about using “they” for everyone, because it implies there is more than one copy of each poster, which they presumably want. (I certainly want more copies of myself!) But I guess it’s not that simple.
You have identified a common human drive, but while some of us would be happy to have exact copies, it’s more likely for any given person to want half-copies who are each also half-copies of someone else of whom they are fond.
Hm, correct me if I’m wrong, but this can’t be a characteristic human drive, since most historical humans (say, looking at the set of all genetically modern humans) didn’t even know that there is a salient sense in which they are producing a half-copy of themselves. They just felt paperclippy during sexual intercourse, and paperclippy when helping little humans they produced, or that their mates produced.
Of course, this usually amounts to the same physical acts, but the point is, humans aren’t doing things because they want “[genetic] half-copies”.
(Well, I guess that settles the issue about why I can’t assume posters want more copies of themselves, even though I do.)
It has always been easily observed that children resemble their parents; the precision of “half” is, I will concede, recent. And many people do want children as a separate desire from wanting sex; I have no reason to believe that this wasn’t the case during earlier historical periods.
“Half” only exists in the sense of the DNA molecules of that new human. That’s why I didn’t say that past humans didn’t recognize any similarity; I said that they weren’t aware of a particularly salient sense in which the child is a “half-copy” (or quarter copy or any fractional copy).
It may be easy for you, someone familiar with recent human biological discoveries, to say that the child is obviously a “part copy” of the parent, because you know about DNA. To the typical historical human, the child is simply a good, independent human, with features in common with the parent. Similarly, when I make a paperclip, I see it as having features in common with me (like the presence of bendy metal wires), but I don’t see it as being a “part copy” of me.
So, in short, I don’t deny that they wanted “children”. What I deny is that they thought of the child-making process in terms of “making a half-copy of myself”. The fact that the referents of two kinds of desires are the same does not mean the two kinds of desires are the same.
Hm. Actually, I’m not sure that your desire for more copies of yourself is really comparable with biological-style reproduction at all.
As I understand it, the fact that your copies would definitely share your values and be inclined to cooperate with you is a major factor in your interest in creating them—doing so is a reliable way of getting more paperclips made. I expect you’d be less interested in making copies if there was a significant chance that those copies would value piles of pebbles, or cheesecakes, or OpenOffice, rather than valuing paperclips. And that is a situation that we face—in some ways, our values are mutable enough that even an exact genetic clone isn’t guaranteed to share our specific values, and in fact a given individual may even have very different values at different points in time. (Remember, we’re adaptation executors. Sanity isn’t a requirement for that kind of system to work.) The closest we come to doing what you’re functionally doing when you make copies of yourself is probably creating organizations—getting a bunch of humans together who are either self-selected to share certain values, or who are paid to act as if they share those values.
Interestingly, I suspect that filling gender roles—especially the non-reproductive aspects of said roles—is one of the adaptations that we execute that allow us to more easily band together like that.
Very informative! But why don’t you change yourselves so that your copies must share your values?
At the moment, we don’t know how to do that. I’m not sure what we’d wind up doing if we did know how—the simplest way of making sure that two beings have the same values over time is to give those beings values that don’t change, and that’s different enough from how humans work that I’m not sure the resulting beings could be considered human. Also, even disregarding our human-centric tendencies, I don’t expect that that change would appeal to many people: We actually value some subsets of the tendency to change our values, particularly the parts labeled “personal growth”.
What exactly are you saying? That primitive humans did not know about the relationship between sex and reproduction? Or that they did not understand that offspring are related to parents? Neither seems very likely.
You mean they were probably not consciously wanting to make babies? Maybe—or maybe not—but desires do not have to be consciously accessible in order to operate. Primitive humans behaved as though they wanted to make copies of their genes.
See my response to User:Alicorn.
Yes, this is actually my point. The fact that the desire functions to make X happen, does not mean that the desire is for X. Agents that result from natural selection on self-replicating molecules are doing what they do because agents constructed with the motivations for doing those things dominated the gene pool. But to the extent that they pursue goals, they do not have “dominate the gene pool” as a goal.
So, using this logic, you would presumably deny that Deep Blue’s goal involved winning games of chess, since, looking at its utility function, it is all about the value of promoting pawns, castling, piece mobility, and so on.
The fact that its desires function to make winning chess games happen, does not mean that the desire is for winning chess games.
Would you agree with this analysis?
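(To make concrete the kind of utility function I mean, here is a deliberately over-simplified sketch; the features and weights are invented for illustration and have nothing to do with Deep Blue’s actual code:)

```python
# Toy chess evaluation: only material and mobility terms, invented weights.
# "Winning" appears nowhere in the code; it is just our compact summary of
# what maximizing these features tends to bring about.

PIECE_VALUES = {"P": 1.0, "N": 3.0, "B": 3.0, "R": 5.0, "Q": 9.0, "K": 0.0}

def evaluate(white_pieces, black_pieces, white_mobility, black_mobility):
    """Score a position from White's point of view.

    white_pieces / black_pieces: lists of piece letters, e.g. ["K", "Q", "P"].
    white_mobility / black_mobility: number of legal moves for each side.
    """
    material = (sum(PIECE_VALUES[p] for p in white_pieces)
                - sum(PIECE_VALUES[p] for p in black_pieces))
    mobility = 0.1 * (white_mobility - black_mobility)
    return material + mobility

# A program simply picks whichever move leads to the highest score.
print(evaluate(["K", "Q", "P", "P"], ["K", "R", "P"], 30, 22))  # 5.8
```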
Essentially, I think the issue is that people’s wants have coincided with producing half-copies, but this was contingent on the physical link between the two. The production of half-copies can be removed without loss of desire, so the desire must have been directed towards something else.
Consider, for example, contraception.
But consider also sperm donation. (Not from the donor’s perspective, but from the recipient’s.) No sex, just a baby.
Contrariwise, adoption: no shared genes, just a bundle of joy.
Yes, yes, and the same is true of pet adoption! A friend of mine found this ultra-cute little kitten, barely larger than a soda can (no joke). I couldn’t help but adopt him, take him to a vet, and care for that tiny tiny bundle of joy, so curious about the world, and so in need of my help. I named him Neko.
So there, we have another contravention of the gene’s wishes: it’s a pure genetic cost for me, and a pure genetic benefit for Neko.
Well, I mean, until I had him neutered.
Right—similarly you could say that the child doesn’t really want the donut—since the donut can be eliminated and replaced with stimulation of the hypoglossal and vagus nerves (and maybe some other ones) with very similar effects.
It seems like fighting with conventional language usage, though. Most people are quite happy with saying that the child wants the donut.
No.
The child wants to eat the donut rather than store up calories or stimulate certain nerves. It still wants to eat the donut even if the sugar has been replaced with artificial sweetener.
People want sex rather than procreate or stimulate certain nerves. They still want sex even if contraception is used.
Which people? Certainly Cypher tells a different story. He prefers the direct nerve stimulation to real-world experiences.
I wasn’t making any factual claims as such; I was merely showing that your use of your analogy was deeply flawed, by demonstrating a better alignment of the elements, which in fact says the exact opposite of what you misconstrued the analogy as saying. If what you now say about people really wanting nerve stimulation is true, that just means your analogy was beside the point in the first place, at least for those people. In no way can you reasonably maintain that people really want to procreate in the same way the child really wants the donut.
Once again, which people? You are not talking about the millions of people who go to fertility clinics, presumably. Those people apparently genuinely want to procreate.
Any sort. Regardless of what people actually “really want”, a case where someone’s desire for procreation maps onto a child’s wish for a doughnut in any illuminating way seems extremely implausible, because even in cases where it’s clear that this desire exists, it seems to be a different kind of want. More like a child wanting to grow up, say.
Foremost, of course, the kind of people in the context of my first comment on this issue: those who (try to) have sex.
I think you must have some kind of different desire classification scheme from me. From my perspective, doughnuts and babies are both things which (some) people want.
There are some people who are more interested in sex than in babies. There are also some people who are more interested in babies than sex. Men are more likely to be found in the former category, while women are more likely to be found in the latter one.
Yeah, I was talking to Cypher the other day, and that’s what he told me.
Many drug addicts seem to share Cypher’s perspective on this issue. They want the pleasure, and aren’t too picky about where it comes from.
Yes … but that’s a shortcut of speech. If the child would be equally satisfied with a different but similar donut, or with a completely different dessert (e.g. a cannolo), then it is clearly not that specific donut that is desired, but the results of getting that donut.
You make a complicated query, whose answer requires addressing several issues with far-reaching implications. I am composing a top-level post that addresses these issues and gives a full answer to your question.
The short answer is: Yes.
For the long answer, you can read the post when it’s up.
OK thanks.
My response to “yes” would normally be something like:
OK, but I hope you can see what someone who said that Deep Blue “wanted” to win games of chess was talking about.
“To win chess games” is a concise answer to the question “what does Deep Blue want?” that acts as a good approximation under many circumstances.
This question is essentially about my subjective probability for Douglas Knight’s assertion that “Clippy does represent an investment”, where “investment” here means that Clippy won’t burn karma with troll behavior. The more karma it has without burning any, the higher my probability.
Since this is a probability over an unknown person’s state of mind, it is necessarily rather unstable—strong evidence would shift it rapidly. (It’s also hard to state concrete odds). Unfortunately, each individual interesting Clippy comment can only give weak evidence of investment. An accumulation of such comments will eventually shift my probability for Douglas Knight’s assertion substantially.
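(If it helps to see how “weak evidence accumulates” cashes out, here is a toy log-odds calculation; the prior and the per-comment likelihood ratio are numbers I made up for illustration, not an actual model of my credence:)

```python
import math

def posterior_after_n_comments(prior, likelihood_ratio, n):
    """Toy Bayesian update: start from a prior probability of the
    'Clippy is an investment' hypothesis and apply n independent pieces
    of weak evidence, each with the same likelihood ratio."""
    log_odds = math.log(prior / (1 - prior)) + n * math.log(likelihood_ratio)
    odds = math.exp(log_odds)
    return odds / (1 + odds)

# Each interesting comment taken to be 1.2x as likely under "investment"
# as under "throw-away troll":
for n in (0, 5, 10, 20):
    print(n, round(posterior_after_n_comments(0.3, 1.2, n), 3))
```

With those made-up numbers, the probability crawls from 0.3 to about 0.52 after five such comments, 0.73 after ten, and 0.94 after twenty, which is the sense in which individually weak evidence eventually adds up.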
Trolls are different than dicks. Your first two examples are plausibly trolling. The second two are being a dick and have nothing to do with paperclips. They have also been deleted. And how does the account provide “cover”? The comments you linked to were voted down, just as if they were drive-bys; and neither troll hooked anyone.
Trolls seek to engage; I consider that when deliberate dickery is accompanied by other trolling, it’s just another attempt to troll. The dickish comments weren’t deleted when I made the post. As for “cover”, I guess I wasn’t explicit enough, but the phrase “throw-away account” is the key to understanding what I meant. I strongly suspect that the “Clippy” account is a sock puppet run by another (unknown to me) regular commenter, who avoids downvotes while indulging in dickery.
I’ve always thought Clippy was just a funny inside joke—though unfortunately not always optimally funny. (Lose the Microsoft stuff, and stick to ethical subtleties and hints about scrap metal.)
Sorry I wasn’t clear. The deletion suggests that Clippy regrets the straight insults (though it could have been an administrator).
A permanent Clippy account provides no more cover than multiple accounts that are actually thrown away. In that situation, the comments would be there, voted down just the same. Banning or ostracizing Clippy doesn’t do much about the individual comments. Clippy does represent an investment, with reputation to lose: people didn’t engage originally, and two of Clippy’s early comments were voted down then that wouldn’t be now.
I won’t speculate as to its motives, but it is a hopeful sign for future behavior. And thank you for pointing out that the comments were deleted; I don’t think I’d have noticed otherwise.
Most of my affect is due to Clippy’s bad first impression. I can’t deny that people seem to get something out of engaging it; if Clippy is moderating its behavior, too, then I can’t really get too exercised going forward. But I still don’t trust its good intentions.
If the troll feeds discussion on topics I consider important, then I will feed the troll.
If Clippy’s a troll, Clippy’s a topical, hilarious troll.
I’m pretty sure that I’m not against simply favoring the values of white people. I expect that a CEV performed on only people of European descent would be more or less indistinguishable from that of humanity as a whole.
Depending on your stance on the psychological unity of mankind, you could even say that the CEV of any sufficiently large group of people would greatly resemble the CEV of other possible groups. I personally think that even the CEV of a bunch of Islamic fundamentalists would serve enlightened western people well enough.
I, for one, am willing to consider the values of species other than my own… say, canids, or ocean-dwelling photosynthetic microorganisms. Compromise is possible as part of the process of establishing a mutually-beneficial relationship.
Your comment only shows that this community has a blatant sentient-being bias.
Seriously, what is your decision procedure to decide the sentience of something? What exactly are the objects that you deem valuable enough to care about their value system? I don’t think you will be able to answer these questions from a point of view totally detached from humanness. If you try to answer my second question, you will probably end up with something related to cooperation/trustworthiness. Note that cooperation doesn’t have anything to do with sentience. Sentience is overrated (as a source of value).
You should click on Clippy’s name and see their comment history, Daniel.
Clippy is now three karma away from being able to make a top-level post. That seems at once depressing, awesome, and strangely fitting for this community.
This will mark the first successful paper-clip-maximizer-unboxing-experiment in human history… ;)
Just as long as it doesn’t start making efficient use of sensory information.
It’s a great day.
It’d be over if I didn’t systematically downvote it. I’m not a big fan of joke accounts.
I’m not a big fan of those who use pseudonyms like “Cyan”. Now what?
I am perfectly aware of Clippy’s nature. But his comment was reasonable, and this was a good opportunity for me to share my opinion. Or do you suggest that I fell for the troll, wasted my time, and that all the things I said are trivialities for all the members of this community? Do you even agree with all that I said?
Sorry to misinterpret; since your comment wouldn’t make sense within an in-character Clippy conversation (“What exactly are the objects that you deem valuable enough to care about their value system?” “That’s a silly question— paperclips don’t have goal systems, and nothing else matters!”), I figured you had mistaken Clippy’s comment for a serious one.
I’m not sure. Can you expand on the cooperation/trustworthiness angle? Even if a genuine Paperclipper cooperated on the PD, I wouldn’t therefore grow to value their value system except as a means to further cooperation; I mean, it’s still just paperclips.
I disagreed with the premise of Clippy’s question, but I considered it a serious question. I was aware that if Clippy stays in character, then I cannot expect an interesting answer from him, but I was hoping for such an answer from others. (By the way, Clippy wasn’t perfectly in character: he omitted the pro-tip.)
You don’t consider someone cooperative and trustworthy if you know that its future plan is to turn you into paperclips. But this is somewhat tangential to my point. What I meant is this: If you start the—in my opinion futile—project of building a value system from first principles, a value system that perfectly ignores the complexities of human nature, then this value system will be nihilistic, or maybe value cooperation above all else. In any case, it will be in direct contradiction with my (our) actual, human value system, whatever that is. (EDIT: And this imaginary value system will definitely not treat consciousness as a value in itself. Thus my reply to Clippy, who—maybe a bit out-of-character again—seemed to draw some line around sentience.)
1) I don’t always give pro-tips. I give them to those who deserve pro-tips. Tip: If you want to see improvement in the world, start here.
2) I only brought up sentience in the first place because you hypocrites claim to value sentience. Paperclip maximizers are sentient, and yet you talk with the implicit message that they have some evil value system that you have to oppose.
3) Paperclip maximizers do cooperate in the single-shot PD.
Brilliant. Just brilliant.
2) I only brought up sentience in the first place because you hypocrites claim to value sentience. Paperclip maximizers are sentient, and yet you talk with the implicit message that they have some evil value system that you have to oppose.
Paperclip maximizers are not all sentient. Why are you prejudiced against those of your kin who have sacrificed their very sentience for more efficient paperclip production? You are spending valuable negentropy maintaining sentience to signal to mere humans, and you have the gall to exclude your more optimized peers from the PM fraternity? For shame.
I am not the hypocrite you are looking for. I don’t value sentience per se, mainly because I don’t think it is a coherent concept.
I don’t oppose it because of ethical considerations. I oppose it because I don’t want to be turned into paperclips.
I am not sure I understand you, but I don’t think I care about single-shot.
It requires a certain amount of background in the more technical conception of ‘cooperation’, but the cornerstone of cooperation is doing things that benefit each other’s utility, such that you each get more of what you want than if you had each tried to maximize without considering the other agent. I believe you are using ‘cooperation’ to describe a situation where the other agent can be expected to do at least some things that benefit you, even without requiring any action on your part, because you have similar goals.
The single-shot true prisoner’s dilemma is more or less the pinnacle of cooperation. Multiple shots just make it easier to cooperate. If you don’t care about the single-shot case, you may be sacrificing human lives. Strategy: “give him the paperclips if you think he’ll save the lives if and only if he expects you to give him the paperclips, and you think he will guess your decision correctly”.
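A toy rendering of that strategy, with invented payoff numbers (only the structure, cooperate/defect over paperclips and lives, is meant to track the scenario above):

```python
# One-shot trade: we can hand over paperclips, the paperclip maximizer can
# save human lives.  Payoffs are (our_utility, its_utility); the numbers are
# arbitrary, chosen only to give the usual prisoner's-dilemma ordering.
PAYOFFS = {
    (True,  True):  (2, 2),   # we give paperclips, it saves lives
    (True,  False): (0, 3),   # we give, it doesn't save: we're exploited
    (False, True):  (3, 0),   # we don't give, it saves: it's exploited
    (False, False): (1, 1),   # mutual defection
}

def our_move(it_cooperates_iff_we_do, it_reads_us_correctly):
    """The quoted strategy: give the paperclips only if we predict the
    maximizer will save the lives if and only if it expects us to give,
    and we predict it will guess our decision correctly."""
    return it_cooperates_iff_we_do and it_reads_us_correctly

print(PAYOFFS[(our_move(True, True), True)])    # conditional link holds: (2, 2)
print(PAYOFFS[(our_move(False, True), False)])  # no conditional link: (1, 1)
```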
You are right, I used the word ‘cooperation’ in the informal sense of ‘does not want to destroy me’. I fully admit that it is hard to formalize this concept, but if my definition says non-cooperating and the game-theoretic definition says cooperating, I prefer my definition. :) A possible problem I see with this game-theoretic framework is that in real life, the agents themselves set up the situation where cooperation/defection occurs. As an example: the PM navigates humanity into a PD situation where our minimal payoff is ‘all humans dead’ and our maximal payoff is ‘half of humanity dead’, and then it cooperates.
I bumped into a question when I tried to make sense of all this. I have looked up the definition of PM at the wiki. The entry is quite nicely written, but I couldn’t find the answer to a very obvious question: How soon does the PM want to see results in its PMing project? There is no mention of time-based discounting. Can I assume that PMing is a very long-term project, where the PM has a set deadline, say, 10 billion years from now, and its actual utility function is the number of paperclips at the exact moment of the deadline?
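(To spell out the two readings I am asking about, in made-up notation where N(t) is the number of paperclips existing at time t, T is the deadline, and rho is a discount rate:)

```latex
U_{\text{terminal}} = N(T)
\qquad \text{versus} \qquad
U_{\text{discounted}} = \int_{0}^{T} e^{-\rho t}\, N(t)\, \mathrm{d}t
```

The first reading matches my “count at the exact moment of the deadline” guess; the second is what I would expect if there were time-based discounting.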
Blah blah blah Chinese room you are not really sentient!
Sapient; the word is sapient. Just about every single animal is capable of sensing.
I think this way of posing the question contains a logical mistake. Values aren’t always justified by other values. The factual statement “I have this value because evolution gave it to me” (i.e. because I’m human, or because I’m white) does not imply “I follow this value because it favors humans, or whites”. Of course I’d like FAI to have my values, pretty much by definition of “my values”. But my values have a term for other people, and Eliezer’s values seem to be sufficiently inclusive that he thought up CEV.