Even if you’re restricting your assertion to special cases, let’s go with that.
Why should I overcome my “bias” and not save my own child, just because there is some other child with a better chance of being saved, but which I do not care about as much?
What makes that an “evil” bias, as opposed to a ubiquitous aspect of most parents’ utility functions?
Assuming that saving my child would give me X utility and saving the other child would give his parents X utility, it’s just a “shut up and multiply” kind of thing...
This assumption is excluded by Kawoomba’s “but which I do not care about as much”, so isn’t directly relevant at this point (unless you are making a distinction between “caring” and “utility”, which should be more explicit).
I guess I’m just not sure why Kawoomba’s own utility gets special treatment over the utility function of the other child’s parents. Then again, your reply and my own sentence just now have me slightly confused, so I may need to think on this a bit more.
Taboo “utility function”, and “Kawoomba cares about Kawoomba’s utility function” would resolve into the tautologous “Kawoomba is motivated by whatever it is that motivates Kawoomba”. The subtler problem is that it’s not a given that Kawoomba knows what motivates Kawoomba, so claims with certainty about what that is or isn’t (including those made by Kawoomba) may be unfounded. To the extent “utility function” refers to idealized extrapolated volition, rather than present desires, people won’t already have good understanding of even their own “utility function”.
There is no idealized extrapolated volition that is based on my current volition that would prefer someone else’s child over one of my own (CEV_me, not CEV_mankind). There are certainly inconsistencies in my non-idealized utility function, but that does not mean that every statement I make about my own utility function must be suspect, merely that such suspect/contradictory statements exist.
If you prefer vanilla over strawberry ice cream, there may be cases where that preference does not transfer to your extrapolated volition due to some other contradictory preferences. However, for comparisons with a significant delta involved, the initial result that determines your decision should be preserved. (It may however be different when extrapolating to a CEV for all humankind.)
Also, you used my name with a frequency of 7⁄84 in your last comment <3.
In general, unless something is well-understood, there is good reason to suspect an error. Human values are not something that’s understood particularly well.
If you value e.g. your family far more highly than a grain of salt, would you say that there is any chance of that not being reflected in your CEV?
Any “CEV” that doesn’t conserve e.g. that particular relationship would be misnamed.
If you’ve found a way to aggregate utility across persons, I’d like to hear it.
Normally, we talk about trying to satisfy a particular utility function. If the parent values her child more than the neighbor’s child, that is reflected in her utility function. What other standard are you trying to invoke?
Ah, this clears up things a bit for me, thank you.
Why would I need to aim to satisfy overall utility including others, as opposed to just that of my own family?
Is any such preference that chooses my own utility over that of others a bias, and not part of my utility function?
Is it an evil bias if I buy myself some tech toys as opposed to donating that amount to my preferred charity?
What reason do you have for aiming to satisfy your own utility function, or that of your family?
I’m afraid this is a little too much lingo for me. Sorry.
You’d have to taboo “evil” before I can answer this question.
Um, it’s my utility function, that which I aim to maximize and that which already incorporates my e.g. altruistic desires. Postulating “other preferences” that can overrule my utility function would be a contradiction in terms.
The other two questions were more aimed at MugaSofer, who was the one differentiating between preference as a “bias” and as part of your utility function, and who introduced the whole “evil” thing.
The nearest I can come to making sense of your claim is that it’s some sort of imaginary Prisoner’s Dilemma: you can cooperate by saving a random child instead of your own, and in symmetric cases other parents can cooperate by saving your child instead of theirs.
However, even if you are into counterfactual bargaining, I am pretty sure almost no other parent would cooperate here, which makes defecting a no-brainer.
I suppose to be fair I should imagine a world in which every parent is brainwashed into valuing other children’s lives as much as their own (I am pretty sure it would take brainwashing). In this case (assuming you escaped the brainwashing so it’s still a legitimate decision) saving the other child might be the right thing to do. At that point, though, you’re arguably not optimizing for humans anymore.
My assertion is that all humans share utility, which is the standard assumption in ethics and seems obviously true, and that parents are biased towards their children (for simple evopsych reasons), leading them to choose their child when, objectively, their own ethics dictates they choose the other. The example given was that of a triage situation: you can only choose one, and need to decide who has the greater chance of survival.
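For readers who want the triage disagreement in concrete terms, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the survival probabilities, the weights, and the names `expected_value` and `triage_choice` are hypothetical and do not come from anyone in this thread. It only shows that the "multiply" step gives different answers depending on whether the chooser weights both children equally.

```python
def expected_value(p_survival, weight):
    """Expected value of attempting a rescue: chance of success times how much the chooser values it."""
    return p_survival * weight


def triage_choice(candidates):
    """candidates maps a name to (p_survival, weight); return the name with the highest expected value."""
    return max(candidates, key=lambda name: expected_value(*candidates[name]))


# Hypothetical numbers: the other child is more likely to survive a rescue attempt.
impartial = {"own_child": (0.4, 1.0), "other_child": (0.6, 1.0)}  # equal weights
partial = {"own_child": (0.4, 2.0), "other_child": (0.6, 1.0)}    # parent weights own child double

print(triage_choice(impartial))
print(triage_choice(partial))
```

With equal weights the higher survival chance decides the matter; once the parent's weight ratio exceeds the ratio of survival chances, the choice flips. Whether that extra weight counts as a bias or as part of the parent's values is exactly what the thread is arguing about, and the sketch does not settle it.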
Your moral philosophy in so far as it affects your actions is by definition already part of your utility function.
It makes no sense to say “my utility function dictates I want to do X, but because my own ethics says otherwise, I should do otherwise”; it’s a contradictio in terminis.
We should be very careful with ethical assumptions that seem “obviously true”. Especially when they are not (true as in “common”, it wouldn’t make sense otherwise) - parents choosing their own child over other children is an example of following a different ethical compass, one valuing their own children over others. You can neither claim that those parents are confused about their own utility function, nor that they are “wrong”. Your proposed “obviously true” ethical assumption is also based on “evopsych”. You’re trying to elevate an extreme altruist approach above others and calling it obviously true. For you, maybe, for the vast majority of e.g. parents? Not so much.
There is no epistemological truth in terminal values.
No.
Humans regularly act against their own ethics, whether due to misinformation or bias, akrasia, or cached thoughts about morality.
… are you seriously suggesting that, say, racists, are right about what they want? How then do they change when confronted with evidence that other races are, well, people? Perhaps I have misunderstood your point.
It seems obviously true that the moralities people implement are often internally inconsistent. It also seems obviously true that people can talk about imperatives they feel derive from one horn or the other of an inconsistent moral system, without either lying or being wrong as such.
The inconsistency might resolve itself with new information, but it’s going to inform any statements we make about the moral system it exists in until that information arrives.
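I would advise you to read “cached thoughts” and then answer my question:
… are you seriously suggesting that, say, racists, are right about what they want? How then do they change when confronted with evidence that other races are, well, people?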
I am saying that the statement “a racist wants that which he/she wants” is tautologically true. There is no objective “right” or “wrong” when comparing utility functions, there is just “this utility function values X and Y, this other utility function values X and Z, they are compatible in respect to X, they are incompatible in respect to Y”.
Certainly what we value changes all the time. But that’s just change, it’s not becoming “less wrong” or “wronger”. Instead, it may be “more (/less) compatible with commonly shared elements of western utility functions” (which still fluctuate across time and culture, and species).
Except that humans share a utility function, which doesn’t change. You can persuade someone that murder is good, but you do it by persuading them that it leads to outcomes they already considered “good” and that they were mistaken about the downsides of, well, killing people. Cached thoughts can result in actions that, objectively, are wrong. They are not wrong because this is some essential property of these actions (morality is in our minds), but we can still meaningfully say “this is wrong” just as we can say “this is a chair” or “there are five apples”. Eliezer’s latest sequence touches on this kind of meaningfulness. Other standard stuff worth reading in this context is “The Psychological Unity of Humankind” and “Coherent Extrapolated Volition”; and, well, the Metaethics Sequence.
Humans trivially don’t share a utility function, since they have differing preferences over world-states. I’m even pretty sure that individual people don’t have anything that we could call a reliable utility function, since we don’t have the cognitive juice to evaluate world-states in their totality and even tractable subsets of the world end up getting evaluated differently based on all sorts of random crap including, but not limited to, presentation order and how recently you’ve eaten.
CEV attempts to resolve people’s conflicting preferences by doing away with several human cognitive limitations, requiring reflective consistency, and applying resolution steps based on projected social interactions (at least, that’s how I’m reading “grew up farther together”), but these requirements (especially the latter) are underspecified in its present form. Even if they weren’t, CEV in its present form does not, nor does it try to, demonstrate that the entirety of the human moral landscape in fact coheres.
Humans trivially do share a utility function, since they change their beliefs consistently in response to argument. Of course, as with all other knowledge, self-knowledge and moral reasoning are hampered by biases, cached thoughts, and simple stupidity.
CEV, and for that matter The Psychological Unity of Humankind, are relevant without being themselves arguments. Have you, in fact, read the metaethics sequence? I ask for information as to how best to proceed.
...no offense, but I don’t think that word means what you think it means.
Non-pathological human ethics may or may not ultimately run off some consistent set of intrinsic affective associations. (Whether or not it does more or less reduces to the question of whether CEV is complete, which as I’ve said is currently unknown.) Even if true, this doesn’t imply a shared utility function within any useful domain.
Utility (in its simplest form) is nothing more or less than a preference ordering over some set of possible states, and a utility function is one that maps those states to their position in that ordering for a given agent. In between those states and our hypothetical intrinsic associations there are layers upon layers of bias and acculturation, probably enough to be effectively unique to the individual. I’d be very surprised if we could find two people with exactly the same preferences over fully specified future states, though we’d probably find large chunks that looked quite similar.
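To make the "preference ordering" reading concrete, here is a small Python sketch. The three states, the two agents, and the helpers `make_utility` and `agreement` are all hypothetical, chosen only to illustrate the definition above. It builds a utility function from a ranking and measures how often two agents order a pair of states the same way.

```python
from itertools import combinations

# Hypothetical world-states; not taken from the thread.
STATES = ["child_A_saved", "child_B_saved", "neither_saved"]


def make_utility(ranking):
    """Turn a best-to-worst ranking into a function from state to score (higher = more preferred)."""
    scores = {state: len(ranking) - i for i, state in enumerate(ranking)}
    return lambda state: scores[state]


def agreement(u1, u2, states):
    """Fraction of state pairs that both utility functions order the same way."""
    pairs = list(combinations(states, 2))
    same = sum((u1(a) > u1(b)) == (u2(a) > u2(b)) for a, b in pairs)
    return same / len(pairs)


parent_of_a = make_utility(["child_A_saved", "child_B_saved", "neither_saved"])
parent_of_b = make_utility(["child_B_saved", "child_A_saved", "neither_saved"])

print(agreement(parent_of_a, parent_of_b, STATES))
```

The two hypothetical parents agree on two of the three pairs and disagree on one, which matches the "large chunks that looked quite similar" picture: substantial overlap without the orderings being identical.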
Yes.
Good to know.
...huh?
The fact that morality is acted upon in different ways (due to your “layers” or simply mistaken beliefs about the world) doesn’t change the fact that it is there, underneath, and that this is the standard we work by to declare something “good” or “bad”. We aren’t perfect at it, but we can make a reasonable attempt. Just like, say, mathematics, or predicting the movement of planets.
Now we’re getting somewhere.
First, that’s not a utility function; see the edited version of my last comment. We have a tendency around here to use “utility function” as if it describes fundamental moral impulses, but I’d imagine that’s because we like to talk about AIs, for whom such a function can be written explicitly and for whom consistency between agents is no trouble. Neither of those conditions holds true for our messy meat brains.
That being said, I’m afraid the idea that there’s some uniform set of impulses on which all existing moralities are fundamentally based is more an article of faith than a statement of fact given the present state of knowledge. There’s clearly enough unity there for some moral concepts to (e.g.) be describable in language, but that’s a relatively weak criterion. Pathology gives the idea of strong consistency a lot of trouble, but even if you ignore that, there’s simply not enough evidence to declare that it’s consistent enough to define as a single function covering all normal people. Just off the top of my head, for example, it could easily be that parts of it sum as a polynomial, or something similar, for which the coefficients vary somewhat between people or populations.
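The polynomial remark can be sketched in a few lines of Python. This is purely illustrative: the coefficients, the option values, and the name `moral_score` are made-up assumptions, not a model anyone here has proposed. The point is only that two people can share the same functional form and still rank the same options differently.

```python
def moral_score(coeffs, x):
    """Evaluate sum(c_k * x**k): one shared polynomial form, with per-person coefficients."""
    return sum(c * x**k for k, c in enumerate(coeffs))


# Hypothetical people: same form, somewhat different weights on the quadratic term.
alice = [0.0, 1.0, -0.1]  # x - 0.1 * x^2
bob = [0.0, 1.0, -0.3]    # x - 0.3 * x^2

# Hypothetical intensities of some morally relevant input.
options = [1.0, 3.0, 5.0]

best_for_alice = max(options, key=lambda x: moral_score(alice, x))
best_for_bob = max(options, key=lambda x: moral_score(bob, x))

print(best_for_alice, best_for_bob)
```

Here a modest change in one coefficient moves the preferred option from one end of the range to the other, so "same architecture" does not by itself imply "same function".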
Fair enough. What term would you prefer? I’ll use “morality” for now.
Quite the opposite: we can see that our morality exists unchanged regardless of beliefs by the fact that there are people who actually do have different moralities. As a vegetarian, I can tell you that a lot of people who believe eating meat is OK do so because they are mistaken about the environment; remove the mistake (by showing them how horrible conditions are in factory farms, for example) and they will see that eating meat is wrong (or at least that factory farming is wrong). If they genuinely didn’t value the pain of animals, say, this would fail. No amount of argument will persuade Clippy that killing people is wrong.
You wouldn’t happen to have non-anecdotal evidence that this is actually the case, would you?
What, like a study of people shown images of slaughterhouses or something? Nope. To be honest, that’s kind of a terrible example. Racists work much better.
How about “moral architecture”?
I think I’d agree that most humans share roughly the same set of inputs to that architecture: hit most people on the head, and they’re likely to feel pain; humiliate them, and they’re likely to feel embarrassment. I doubt that the relative weightings of these traits are likely to remain identical between individuals, but if you factor that out I think we have a human commonality that I could get behind.
I suspect we’d differ in our opinion of acculturation’s role in defining certain categories (the pain of animals, for example) as morally significant, though. That strikes me as a level or two above anything I’d be comfortable calling a human universal.
Moral architecture sounds good.
I note that humans can empathise with pains they do not themselves feel.
Well, yeah. It’s not the greatest example, I suppose. How about racism? That’s usually my go-to for this sort of thing. I kill Jews because Jews are parasites that undermine civilization; you kill Nazis because they murder innocent people.
EDIT: I’m not actually a Nazi, obviously.