“Would you sacrifice yourself to save the lives of 10 others?” you ask person J. “I guess so”, J replies. “I might find it difficult bringing myself to actually do it, but I know it’s the right thing to do”.
“But you give a lot of money to charity”, you tell this highly moral, saintly individual. “And you make sure to give only to charities that really work. If you stay alive, the extra money you will earn can be used to save the lives of more than 10 people. You are not just sacrificing yourself, you are sacrificing them too. Sacrificing the lives of more than 10 people to save 10? Are you so sure it’s the right thing to do?”
“Yes”, J replies. “And I don’t accept your utilitarian model of ethics that got you to that conclusion”.
What I figured out (and I don’t know if this has been covered on LW yet) is that J’s decision can actually be rational, if:
J’s utility function is strongly weighted in favour of J’s own wellbeing, but takes everyone else’s into account too; and
J considers the social shame of killing 10 other people to save himself worse (according to this utility function) than his own death plus a bunch of others.
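To see how those two conditions can make the sacrifice come out ahead for J, here is a toy calculation. Every number in it is invented purely for illustration; it is a minimal sketch of the kind of utility function described above, not anything taken from an actual case.

```python
# Toy illustration only: all weights are made up. The point is just that a
# utility function heavily weighted towards J, plus a large enough shame
# term, can rank "sacrifice" above "refuse" even though refusing saves more
# lives in total via charity.

W_SELF = 100.0       # weight J places on his own life
W_OTHER = 1.0        # weight J places on each other person's life
SHAME = 150.0        # disutility J attaches to living with having let the 10 die
CHARITY_LIVES = 15   # lives J's future donations would save if he stays alive

# Option A: J sacrifices himself; the 10 are saved, the charity lives are lost.
u_sacrifice = -W_SELF + W_OTHER * 10 - W_OTHER * CHARITY_LIVES

# Option B: J refuses; the 10 die, the charity lives are saved, J bears the shame.
u_refuse = -W_OTHER * 10 + W_OTHER * CHARITY_LIVES - SHAME

print(u_sacrifice, u_refuse)  # -105.0 vs -145.0: sacrificing ranks higher for J
```

The shame term is doing the work here: 150 exceeds J’s own death (100) plus the net swing in other people’s welfare between the two options (10), which is exactly the second condition above.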
The other thing I realised was that people with a utility function such as J’s should not necessarily be criticised. If that’s how we’re going to behave anyway, we may as well formalise it, and that should leave everyone better off on average.
J considers the social shame of killing 10 other people to save himself worse (according to this utility function) than his own death plus a bunch of others.
Yes, but only if this is really, truly J’s utility function. There’s a significant possibility that J is suffering from major scope insensitivity, and is failing to fully appreciate the loss of fun when all those people die whom he could have saved by living and donating to effective charity. When I say “significant possibility”, I’m estimating P > 0.95.
Note: I interpreted “charities that really work” as “charities that you’ve researched well and concluded are the most effective ones out there”. If you just mean that the charity donation produces positive rather than negative fun (given that some charities actually don’t help people), then my P estimate drops.
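To make the scale of that loss concrete, here is a back-of-the-envelope calculation. The donation level, time horizon and cost-per-life figure are placeholders I made up, not estimates from any real charity evaluator; the only point is that fairly modest numbers already get you well past 10 lives.

```python
# Placeholder figures, chosen only to illustrate the arithmetic.
annual_donation = 5_000         # what J gives per year (hypothetical)
cost_per_life_saved = 4_000     # his charity's cost to save one life (hypothetical)
remaining_giving_years = 30     # how long J keeps earning and donating (hypothetical)

lives_saved_by_staying_alive = (annual_donation * remaining_giving_years
                                / cost_per_life_saved)
print(lives_saved_by_staying_alive)  # 37.5 -- far more than the 10 J saves by dying
```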
It seems plausible to me that J really, truly cares about himself significantly more than he cares about other people, certainly with P > 0.05.
The effect could be partly due to this and partly due to scope insensitivity, but still… how do you distinguish one from the other?
It seems: caring about yourself → caring what society thinks of you → following society’s norms → tendency towards scope insensitivity (since several of society’s norms are scope-insensitive).
In other words: how do you tell whether J has utility function F, or a different utility function G which he is doing a poor job of optimising due to biases? I assume it would have something to do with pointing out the error and seeing how he reacts, but it can’t be that simple. Is the question even meaningful?
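For concreteness, here is one way the question could be operationalised (my own sketch, with a softmax choice rule standing in for “noisy, biased optimisation”; the candidate utility functions and the single observation are invented): treat F and G as competing models of J’s observed choices and ask which predicts them better.

```python
# Sketch: compare how well two candidate utility functions predict J's choices,
# allowing for noisy/biased optimisation via a softmax (logit) choice rule.
# The utility functions and the observation below are purely illustrative.
import math

def choice_log_likelihood(utility_fn, observations, noise=1.0):
    """Log-probability of the observed choices if J noisily maximises utility_fn.

    observations: list of (options, chosen_index) pairs, where each option is an
    outcome and utility_fn maps an outcome to a number. Higher means a better fit.
    """
    total = 0.0
    for options, chosen in observations:
        utilities = [utility_fn(o) / noise for o in options]
        log_norm = math.log(sum(math.exp(u) for u in utilities))
        total += utilities[chosen] - log_norm
    return total

# Outcomes encoded as (does J survive?, number of other lives saved) -- hypothetical.
F = lambda o: 100 * o[0] + 1 * o[1]   # candidate: J weighs himself 100x others
G = lambda o: 1 * o[0] + 1 * o[1]     # candidate: J weighs everyone equally

# One toy observation: offered (survive, save 0) vs (die, save 10), J chose to die.
observed = [([(1, 0), (0, 10)], 1)]
print(choice_log_likelihood(F, observed))  # very negative: F fits this choice badly
print(choice_log_likelihood(G, observed))  # close to 0: G fits it well
```

With enough varied choices, and some handle on the noise parameter (which is where the biases hide), the two hypotheses come apart; with only a verbal answer to one hypothetical, they mostly don’t.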
Re: “charities that work”, your assumption is correct.
Considering that J is contributing a lot of money to truly effective charity, I think his utility function is such that, if his biases did not render him incapable of appreciating just how much fun his charity was generating, he would gain more utils from the huge amount of fun generated by his continued donations, minus the social shame and minus the disutility of the ten people dying, than he would gain from dying himself. If he’s very selfish, my probability estimate is raised (not above 0.95, but above whatever it would have been before) by the fact that most people don’t want to die.
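Spelling out that utils comparison as an inequality, using my own shorthand for the terms in the previous sentence (this is not notation anyone in the thread defined):

```latex
% Shorthand (mine): F = fun from J's continued donations (the term his biases
% understate), S = social shame, T = disutility of the ten deaths,
% D = disutility of J's own death.
\[
U_{\mathrm{live}} = F - S - T, \qquad U_{\mathrm{die}} = -D .
\]
% The claim: once F is appraised without scope insensitivity,
\[
U_{\mathrm{live}} > U_{\mathrm{die}} \quad\Longleftrightarrow\quad F - S - T > -D .
\]
```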
One way to find out the source of such a decision is to tell J to read the Sequences and see what he thinks afterwards. The question is very meaningful, because the whole point of instrumental rationality is learning how to prevent your biases from sabotaging your utility function.