That’s a good point. Let me try a different one.
Let X be ‘I am slightly more committed to this group’s welfare, particularly to that of its weakest members, than most of its members are. If you suffer a serious loss of status/well-being I will still help you in order to display affiliation to this group even though you will no longer be in a position to help me. I am substantially more kind and helpful to the people I like and substantially more vindictive and aggressive towards those I dislike. I am generally stable in who I like. I am much more capable and popular than most members of this group, demand appropriate consideration, and grant appropriate consideration to those more capable than myself. I adhere to simple taboos so that my reputation and health are secure and so that I am unlikely to contaminate the reputations or health of my friends. I currently like you and dislike your enemies, but I am somewhat inclined towards ambivalence regarding whether I like you right now, so the pay-off would be very great for you if you were to expend resources pleasing me and get me into the stable ‘liking you’ region of my possible attitudinal space. Once there, I am likely to make a strong commitment to a friendly attitude towards you rather than wasting cognitive resources checking a predictable parameter among my set of derivative preferences.’
Then, instead of saying my previous suggestion, say something like, ‘I would precommit to acting in such a way that X if and only if you would precommit to acting in such a way that you could truthfully say, “X if and only if you would precommit to acting in such a way that you could truthfully say X.”’
(Edit: Note, if you haven’t already, that the above is just a special case of the decision theory, “I would adhere to rule system R if and only if (You would adhere to R if and only if I would adhere to R).” )
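To make the shape of that rule concrete, here is a minimal truth-table sketch (my own illustration, not part of the original exchange) that treats “I adhere to R” and “you adhere to R” as boolean propositions and enumerates which joint assignments satisfy the biconditional “I adhere to R iff (you adhere to R iff I adhere to R)”:

```python
# Minimal sketch: treat "I adhere to R" and "you adhere to R" as boolean
# propositions and enumerate which joint assignments satisfy the stated
# biconditional "I adhere to R iff (you adhere to R iff I adhere to R)".
from itertools import product

def rule_satisfied(i_adhere: bool, you_adhere: bool) -> bool:
    """True when I <-> (You <-> I) holds for this assignment."""
    return i_adhere == (you_adhere == i_adhere)

for i_adhere, you_adhere in product([False, True], repeat=2):
    print(f"I adhere={i_adhere!s:<5}  you adhere={you_adhere!s:<5}  "
          f"rule satisfied={rule_satisfied(i_adhere, you_adhere)}")
```

Enumerating the four cases shows the biconditional holds exactly when “you adhere to R” is true; as a bare proposition it reduces to the other party’s adherence.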
Wouldn’t the mere ability to recognize such a symmetric decision theory be strong evidence of X being true?
If I understood you correctly, I think that people do do this kind of thing, except it’s all nonverbal and implicit. E.g., using hard-to-fake tests of the other person’s decision theory is a way to make them honestly reveal what’s going on inside them. Another component is the use of strong emotions, which are sort of like a precommitment mechanism for people, because once activated, they are stable.
Yes, I understand the signal must be hard to fake. But if the concern is merely about optimizing signal quality, wouldn’t it be an even stronger move to noticeably couple your payoff profile to a credible enforcement mechanism?
Just as a sketch, find some “punisher” that noticeably imposes disutility on you whenever you deviate from your purported decision theory (like repurposing the signal faker’s means toward paperclip production, since that’s such a terrible outcome, apparently). It’s rather trivial to have a publicly viewable database of who is coupled to the punisher (and by what decision theory), and to make it verifiable that any being with which you are interacting matches a specific database entry (a toy version of such a registry is sketched below).
This has the effect of elevating your signal quality to that of the punisher. Then it’s just a problem of finding a reliable punisher.
Why not just do that, for example?
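To make the registry idea above concrete, here is a small, hypothetical Python sketch (all names, such as PunisherRegistry and report_deviation, are my own illustrations, not an existing system): agents are publicly recorded together with their declared decision theory and the punisher they are coupled to, anyone can check that the agent they are interacting with matches its entry, and observed deviations are handed to the registered punisher.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Entry:
    agent_id: str
    declared_theory: str             # description of the purported decision theory
    punisher: Callable[[str], None]  # imposes disutility when a deviation is reported

class PunisherRegistry:
    """Publicly viewable mapping from agent IDs to their registry entries."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def register(self, entry: Entry) -> None:
        # Record which punisher and decision theory an agent is coupled to.
        self._entries[entry.agent_id] = entry

    def verify(self, agent_id: str, claimed_theory: str) -> bool:
        # Check that the agent you are interacting with matches its entry.
        entry = self._entries.get(agent_id)
        return entry is not None and entry.declared_theory == claimed_theory

    def report_deviation(self, agent_id: str) -> None:
        # Hand an observed deviation to the registered punisher.
        entry = self._entries.get(agent_id)
        if entry is not None:
            entry.punisher(agent_id)

# Toy usage: a punisher that just tallies penalties instead of making paperclips.
penalties: Dict[str, int] = {}

def toy_punisher(agent_id: str) -> None:
    penalties[agent_id] = penalties.get(agent_id, 0) + 1

registry = PunisherRegistry()
registry.register(Entry("alice", "adhere to R iff (you adhere to R iff I adhere to R)", toy_punisher))

print(registry.verify("alice", "adhere to R iff (you adhere to R iff I adhere to R)"))  # True
registry.report_deviation("alice")
print(penalties)  # {'alice': 1}
```

Of course, as the reply below notes, the hard part is the punisher itself: something has to observe deviations reliably enough to report them at all.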
We do. That’s one of the functions of reputation and gossip among humans, and also the purpose of having a legal system. But it doesn’t work perfectly: we have yet to find a reliable punisher, and if we did find one it would probably need to constantly monitor everyone and invade their privacy.
Yet another reason why people invented religion...
Well, it looks like you just got yourself a job ;-)
That is good!
Attention Users: please provide me with your decision theory, and what means I should use to enforce your decision theory so that you can reliably claim to adhere to it.
For this job, I request 50,000 USD as compensation, and I ask that it be given to User:Kevin.