It comes down to you saying “if you’re human you should want to be moral, because humans should be moral”
No, that isn’t what I’m saying. I’m saying that if you’re human, you should want to be moral, because wanting to be moral follows from the desires of a human with consistent preferences, due in part to human nature.
if my utility function doesn’t encompass what you think is “moral” and I’m human
Then I dispute that your utility function is what you think it is.
I’m saying that if you’re human, you should want to be moral, because wanting to be moral follows from the desires of a human with consistent preferences, due in part to human nature.
The error as I see it is that “human nature”, whatever you see as such, is a statement about similarities, it isn’t a statement about how things should be.
It’s like saying “a randomly chosen positive natural number is really big, so all numbers should be really big”. How do you see that differently?
We’ve already established that agents can have consistent preferences without adhering to what you think of as “universal human morality”. Child soldiers are human. Their preferences sure can be brutal, but they can be as internally consistent or inconsistent as those of anyone else. I sure would like to change their preferences, because I’d prefer for them to be different, not because some ‘idealised human spirit’ / ‘psychic unity of mankind’ ideal demands so.
Then I dispute that your utility function is what you think it is.
Proof by demonstration? Well, lock yourself in a cellar with only water and send me a key, I’ll send it back FedEx with instructions to set you free, after a week. Would that suffice? I’d enjoy proving that I know my own utility function better than you know my utility function (now that would be quite weird), I wouldn’t enjoy the suffering. Who knows, might even be healthy overall.
It’s like saying “a randomly chosen positive natural number is really big, so all numbers should be really big”.
You can’t randomly choose a positive natural number using an even distribution. If you use an uneven distribution, whether the result is likely to be big depends on how your distribution compares to your definition of “big”.
Choose from those positive numbers that a C++ int variable can contain, or any other* non-infinite subset of positive natural numbers, then. The point is the observation of “most numbers need more than 1 digit to be expressed” not implying in any way some sort of “need” for the 1-digit numbers to “change”, to satisfy the number fairy, or some abstract concept thereof.
* (For LW purposes: Any other? No, not any other. Choose one with a cardinality of at least 10^6. Heh.)
The error as I see it is that “human nature”, whatever you see as such, is a statement about similarities, it isn’t a statement about how things should be.
It is a statement about similarities, but it’s about a similarity that shapes what people should do. I don’t know how I can explain it without repeating myself, but I’ll try.
For an analogy, let’s consider beings that aren’t humans. Paperclip maximizers, for example. Except these paperclip maximizers aren’t AIs, but a species that somehow evolved biologically. They’re not perfect reasoners and can have internally inconsistent preferences. These paperclip maximizers can prefer to do something that isn’t paperclip-maximizing, even though that is contrary to their nature: that is, if they were to maximize paperclips, they would prefer it to whatever they were doing earlier. One day, a paperclip maximizer who is maximizing paperclips tells his fellow clippies, “You should maximize paperclips, because if you did, you would prefer to, as it is your nature”. This clippy’s statement is true: the clippies’ nature is such that if they maximized paperclips, they would prefer it to other goals. So, regardless of what other clippies are actually doing, the utility-maximizing thing for them to do would be to maximize paperclips.
So it is with humans. Upon discovering/realizing/deriving what is moral and consistently acting/being moral, the agent would find that being moral is better than the alternative. This is in part due to human nature.
We’ve already established that agents can have consistent preferences without adhering to what you think of as “universal human morality”.
Agents, yes. Humans, no. Just like the clippies can’t have consistent preferences if they’re not maximizing paperclips.
Proof by demonstration? Well, lock yourself in a cellar with only water and send me a key, I’ll send it back FedEx with instructions to set you free, after a week. Would that suffice?
What would that prove? Also, I don’t claim that I know the entirety of your utility function better than you do—you know much better than I do what kind of ice cream you prefer, what TV shows you like to watch, etc. But those have little to do with human nature in the sense that we’re talking about it here.
A clippy which isn’t maximizing paperclips is not a clippy.
It’s a clippy because it would maximize paperclips if it had consistent preferences and sufficient knowledge.
That my utility function includes something which you’d probably consider immoral.
I don’t dispute that this is possible. What I dispute is that your utility function would contain that if you were internally consistent (and had knowledge of what being moral is like).
The desires of an agent are defined by its preferences. “This is a paperclip maximizer which does not want to maximize paperclips” is a contradiction in terms. And what do you mean by “consistent”? Do you mean “consistent with ‘human nature’”? Who cares? Or consistent within themselves? Highly doubtful, what would internal consistency have to do with being an altruist? If there’s anything which is characteristic of “human nature”, it is the inconsistency of their preferences.
A human which doesn’t share what you think of as “correct” values (may I ask, not disparagingly, are you religious?) is still a human. An unusual one, maybe (probably not), but an agent not in “need” of any change towards more “moral” values. Stalin may have been happy the way he was.
I don’t dispute that this is possible. What I dispute is that your utility function would contain that if you were internally consistent (and had knowledge of what being moral is like).
Because of the warm fuzzies? The social signalling? Is being moral awesome, or deeply fulfilling? Are you internally consistent … ?
“This is a paperclip maximizer which does not want to maximize paperclips” is a contradiction in terms.
Call it a quasi-paperclip maximizer, then. I’m not interested in disputing definitions. Whatever you call it, it’s a being whose preferences are not necessarily internally consistent, but when they are, it prefers to maximize paperclips. When its preferences are internally inconsistent, it may prefer to do things and have goals other than maximizing paperclips.
Highly doubtful, what would internal consistency have to do with being an altruist?
There’s no necessary connection between the two, but I’m not equating morality and altruism. Morality is what one should do and/or how one should be, which need not be altruistic.
Humans can have incorrect values and still be human, but in that case they are internally inconsistent, because of the preferences they have due to human nature. I’m not saying that humans should strive to have human nature, I’m saying that they already have it. I doubt that Stalin was happy; just look at how paranoid he was. And no, I’m not religious, and have never been.
Because of the warm fuzzies? The social signalling? Is being moral awesome, or deeply fulfilling?
Yes to the first and third questions. Being moral is awesome and fulfilling. It makes you feel happier, more fulfilled, more stable, and similar feelings. It doesn’t guarantee happiness, but it contributes to it both directly (being moral feels good) and indirectly (it helps you make good decisions). It makes you stronger and more resilient (once you’ve internalized it fully). It’s hard to describe beyond that, but good feels good (TVTropes warning).
I think I’m internally consistent. I’ve been told that I am. It’s unlikely that I’m perfectly consistent, but whatever inconsistencies I have are probably minor. I’m open to having them addressed, whatever they are.
Claiming that Stalin wasn’t happy sounds like a variation of sour grapes where not only can you not be as successful as him, it would be actively uncomfortable for you to believe that someone who lacks compassion can be happy, so you claim that he’s not.
It’s true he was paranoid but it’s also true that in the real world, there are tradeoffs and you don’t see people becoming happy with no downsides whatsoever—claiming that this disqualifies them from being called happy eviscerates the word of meaning.
I’m also not convinced that Stalin’s “paranoia” was paranoia (it seems rational for someone who doesn’t care about the welfare of others and can increase his safety by instilling fear and treating everyone as enemies to do so). I would also caution against assuming that since Stalin’s paranoia is prominent enough for you to have heard of it, it’s too big a deal for him to have been happy. It’s prominent enough for you to have heard of it because it was a big deal to the people affected by it, which is unrelated to how much it affected his happiness.
Stalin was paranoid even by the standards of major world leaders. Khrushchev wasn’t so paranoid, for example. Stalin saw enemies behind every corner. That is not a happy existence.
Khrushchev was deposed. Stalin stayed dictator until he died of natural causes. That suggests that Khrushchev wasn’t paranoid enough, while Stalin was appropriately paranoid.
Seeing enemies around every corner meant that sometimes he saw enemies that weren’t there, but it was overall adaptive because it resulted in him not getting defeated by any of the enemies that actually existed. (Furthermore, going against nonexistent enemies can be beneficial insofar as the ruthlessness in going after them discourages real enemies.)
Stalin saw enemies behind every corner. That is not a happy existence.
How does the last sentence follow from the previous one? It’s certainly not as happy an existence as it would have been if he had no enemies, but as I pointed out, nobody’s perfectly happy. There are always tradeoffs and we don’t claim that the fact that someone had to do something to gain his happiness automatically makes that happiness fake.
Stalin refused to believe Hitler would attack him, probably since that would be suicidally stupid on the attacker’s part. Was he paranoid, or did he update?
The desires of an agent are defined by its preferences. “This is a paperclip maximizer which does not want to maximize paperclips” is a contradiction in terms.
I’m not sure “preference” is a powerful enough term to capture an agent’s true goals, however defined. Consider any of the standard preference reversals: a heavy cigarette smoker, for example, might prefer to buy and consume their next pack in a Near context, yet prefer to quit in a Far. The apparent contradiction follows quite naturally from time discounting, yet neither interpretation of the person’s preferences is obviously wrong.
Proof by demonstration? Well, lock yourself in a cellar with only water and send me a key, I’ll send it back FedEx with instructions to set you free, after a week.
That would only prove that you think you want to do that. The issue is that what you think you want and what you actually want do not generally coincide, because of imperfect self-knowledge, bounded thinking time, etc.
I don’t know about child soldiers, but it’s fairly common for amateur philosophers to argue themselves into thinking they “should” be perfectly selfish egoists, or hedonistic utilitarians, because logic or rationality demands it. They are factually mistaken, and to the extent that they think they want to be egoists or hedonists, their “preferences” are inconsistent, because if they noticed the logical flaw in their argument they would change their minds.
That would only prove that you think you want to do that.
Isn’t that when I throw up my arms and say “congratulations, your hypothesis is unfalsifiable, the dragon is permeable to flour”? What experimental setup would you suggest? Would you say any statement about one’s preferences is moot? It seems that we’re always under bounded thinking time constraints. Maybe the paperclipper really wants to help humankind and be moral, and mistakenly thinks otherwise. Who would know, it optimized its own actions under resource constraints, and then there’s the ‘Löbstacle’ to consider.
Is saying “I like vanilla ice cream” FAI-complete and must never be uttered or relied upon by anyone?
it’s fairly common for amateur philosophers to argue themselves into thinking they “should” be perfectly selfish egoists, or hedonistic utilitarians, because logic or rationality demands it
Or argue themselves into thinking that there is some subset of preferences such that every other (human?) agent should voluntarily choose to adopt them, against their better judgment (edit: as it contradicts what they (perceive, after thorough introspection) as their own preferences)? You can add “objective moralists” to the list.
What would it be that is present in every single human’s brain architecture throughout human history that would be compatible with some fixed ordering over actions, called “morally good”? (Otherwise you’d have your immediate counterexample.) The notion seems so obviously ill-defined and misguided (hence my first comment asking Cousin_It).
It’s fine (to me) to espouse preferences that aim to change other humans (say, towards being more altruistic, or towards being less altruistic, or whatever), but to appeal to some objective guiding principle based on “human nature” (which constantly evolves in different strands) or some well-sounding ev-psych applause-light is just a new substitute for the good old Abrahamic heavenly father.
Would you say any statement about one’s preferences is moot? It seems that we’re always under bounded thinking time constraints. Maybe the paperclipper really wants to help humankind and be moral, and mistakenly thinks otherwise. Who would know, it optimized its own actions under resource constraints, and then there’s the ‘Löbstacle’ to consider.
Is saying “I like vanilla ice cream” FAI-complete and must never be uttered or relied upon by anyone?
I wouldn’t say any of those things. Obviously paperclippers don’t “really want to help humankind”, because they don’t have any human notion of morality built-in in the first place. Statements like “I like vanilla ice cream” are also more trustworthy on account of being a function of directly observable things like how you feel when you eat it.
The only point I’m trying to make here is that it is possible to be mistaken about your own utility function. It’s entirely consistent for the vast majority of humans to have a large shared portion of their built-in utility function (built-in by their genes), even though many of them seemingly want to do bad things, and that’s because humans are easily confused and not automatically self-aware.
It is possible to be mistaken about your own utility function.
For sure.
It’s entirely consistent for the vast majority of humans to have a large shared portion of their built-in utility function (built-in by their genes), even though many of them seemingly want to do bad things
I’d agree if humans were like dishwashers. There are templates for dishwashers, ways they are supposed to work. If you came across a broken dishwasher, there could be a case for the dishwasher to be repaired, to go back to “what it’s supposed to be”.
However, that is because there is some external authority (exasperated humans who want to fix their damn dishwasher, dirty dishes are piling up) conceiving of and enforcing such a purpose. The fact that genes and the environment shape utility functions in similar ways is a description, not a prescription. It would not be a case for any “broken” human to go back to “what his genes would want him to be doing”. Just like it wouldn’t be a case against brain uploading.
Some of the discussion seems to me like saying that “deep down in every flawed human, there is ‘a figure of light’, in our community ‘a rational agent following uniform human values with slight deviations accounting for ice-cream taste’, we just need to dig it up”. There is only your brain. With its values. There is no external standard to call its values flawed. There are external standards (rationality = winning) to better its epistemic and instrumental rationality, but those can help the serial killer and the GiveWell activist equally. (Also, both of those can be ‘mistaken’ about their values.)
A clippy which isn’t maximizing paperclips is not a clippy.
A human which isn’t adhering to your moral codex is still a human.
Stalin’s paranoia, and the actions he took as a result, also created enemies, thus becoming a partly self-fulfilling attitude.
You do see people becoming happy with fewer downsides than others, though.
I’ve seen it used as shorthand for “utility function”, saving 5 keystrokes. That was the intended use here. Point taken, alternate phrasings welcome.