Is there a word for a person, or an agent, that self-modifies to find something more painful, in order to change someone else’s incentives, as described here? Obviously there are some choice phrases we might like to use about such a person, but most of them—eg “moral blackmail”—seem insufficiently precise. Is there a term that captures specifically this, and not other behaviour we don’t like? If not, what might be a good, specific term?
Have you read Schelling? He discusses a wide variety of maneuvers that are much like this. However, I can think of no standard names for this technique.
I suppose you could call such agents voluntary human shields.
“Utility martyr”?
It seems like the sort of thing that one would accuse another of, in order to score political points by making others feel ashamed to have sympathized with the person so accused. IOW, making the accusation is a much cheaper form of manipulation than actually doing the self-modification — and can be used to undermine many claims that one person is harming another. Thus, we should expect to hear the accusation from people who would like to go on harming others and getting away with it.
I wouldn’t be surprised to see examples of people saying “you don’t really feel bad, you’re faking it”, which is a very different thing, and there is an example of people saying “we mustn’t incentivize these hypothetical Muslims to self-modify in this way”. But can you point me to an example of what you describe actually happening: someone saying “you, the actual real person I am replying to, have self-modified to find something more painful in order to change other people’s incentives”?
Subset of utility monster, I think.
I have used that term for this, but it’s not very precise: the Wikipedia entry has the monster absorbing positive utility rather than threatening negative, and there’s no mention of self-modification.
The self-modification isn’t in itself the issue, though, is it? It seems to me that just about any sort of agent would be willing to self-modify into a utility monster if it expected that strategy to make achieving its goals more likely. And the pleasure/pain distinction simply adds a constant (negative) offset to all utilities, which is meaningless since utility functions are generally assumed to be invariant under positive affine transformations.
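To spell out the invariance claim (a sketch using a generic utility function u; none of this notation comes from the thread): if u'(x) = a·u(x) + b with a > 0, then for any lottery X,

    E[u'(X)] = a·E[u(X)] + b,

so the ranking of options by expected utility, and hence the agent’s choice, is unchanged. In particular, shifting every outcome down by a constant, so that everything reads as “pain” rather than “pleasure”, leaves the decision problem exactly as it was.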
I don’t even think it’s a subset of utility monster; it’s just a straight-up case of an agent deciding to become a utility monster because that furthers its goals.
http://tvtropes.org/pmwiki/pmwiki.php/Main/AntiVillain ?