Eliezer_Yudkowsky wrote on 19 August 2009 03:24:46PM:
Tversky demonstrated: One experiment based on the simple dilemma found that approximately 40% of participants played “cooperate” (i.e., stayed silent). Hmmm...
Compassion (in a certain sense) may be part of your answer.
If I (as Prisoner A) have a term in my utility function such that an injury to Prisoner B is an injury to me (discounted), it can make ‘Cooperate’ much more attractive.
I might have enough compassion to be willing to do 6 months in jail if it will spare Prisoner B a 2-year prison term (or more).
For example, take the external payoff matrix given by Wei Dai (http://lesswrong.com/lw/15z/ingredients_of_timeless_decision_theory/11w9) (19 August 2009 07:08:23AM):
My INTERNAL payoff matrix becomes:
And ‘Cooperate’ now strictly dominates ‘Defect’, by elementary game theory.
Thank you for your time and consideration.
RickJS
While it’s a good question, Eliezer_Yudkowsky has already answered it thoroughly in The True Prisoner’s Dilemma.
His point there is that the values in the matrix are supposed to represent the participants’ utility rather than jail time, so your compassion for your friend is already accounted for. If the payoffs were simply prison sentences, your reasoning would apply; that is why EY says the true Prisoner’s Dilemma requires convoluted, unusual scenarios, and why ordinary presentations of the PD don’t make the situation clear.
That Prisoner A is completely and utterly selfish is part of the Prisoner’s Dilemma. If the prisoner’s not selfish, it’s not the Prisoner’s Dilemma anymore.
EDIT: Of course, this is only true if the numbers in the matrix represent years spent in jail, not utilons.
inorite?!
Of course, this might still be muddy if you recast the payoff matrix in utilons, or (to abstract away less) adjust the “external” payoff matrices so that the “internal” payoff matrices match those of the original problem.
Inorite? What is that?
I suspect I’m not smart enough to play on this site. I’m quite unsure I can even parse your sentence correctly, and I can’t imagine a reason to adjust the external payoff matrices (they were given by Wei Dai; that is the original problem I’m discussing) so that the internal payoff matrices match something. I’m baffled.
“inorite” is lolspeak for “I know, right?”
See Cyan’s comment below. Do not be dispirited by lolspeak.
Also, the reason to adjust the payoff matrices in the original problem is so that your ‘internal’ payoff matrices match those of Wei Dai’s problem; or, to put it another way, consider the problem in the least convenient possible world. Basically, the Prisoner’s Dilemma is still there if you take the payoffs to be in utilons, which already take into account things like your ‘compassion’ (in this case, valuing the reward given to the other person). I can’t quite figure out what your formula for discounting is above, so let me simplify...
It would be remiss of me not to do the math, though it is not my forte:
Suppose the matrix hands out jelly beans to you and to the opponent, and each jelly bean you receive is worth 1 utilon to you. Further suppose that you also get 0.25 utilons for each jelly bean the opponent gets, due to your ‘compassion’. Now take this payoff matrix (in jelly beans, listed as yours / the opponent’s):

                   Opponent cooperates   Opponent defects
    You cooperate      375 / 500            -150 / 600
    You defect         600 / 0               75 / 100
Which becomes, in your ‘internal’ matrix (your payoffs now in utilons; the opponent’s jelly-bean counts are left unchanged):

                   Opponent cooperates   Opponent defects
    You cooperate      500 / 500             0 / 600
    You defect         600 / 0              100 / 100
Now cooperation is dominated by defection for the ‘compassionate’ person.
Someone please note if my numbers don’t work out—it’s early here.
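For anyone who wants to check those numbers, here is a minimal Python sketch of the arithmetic above, assuming the 0.25 ‘compassion’ weight and the jelly-bean payoffs from the comment (the names and dictionary layout are just illustrative); it rebuilds the internal utilon matrix for the row player and checks the dominance claim:

    # Redo the commenter's calculation: external payoffs are jelly beans,
    # and the row player also gets 0.25 utilons per jelly bean the opponent gets.
    COMPASSION_WEIGHT = 0.25  # the assumed 'compassion' rate from the comment

    # external[(my_move, their_move)] = (my jelly beans, opponent's jelly beans)
    external = {
        ("C", "C"): (375, 500),
        ("C", "D"): (-150, 600),
        ("D", "C"): (600, 0),
        ("D", "D"): (75, 100),
    }

    # Internal (utilon) payoff for the row player: own beans plus the
    # discounted term for the opponent's beans.
    internal = {
        moves: mine + COMPASSION_WEIGHT * theirs
        for moves, (mine, theirs) in external.items()
    }
    print(internal)
    # {('C', 'C'): 500.0, ('C', 'D'): 0.0, ('D', 'C'): 600.0, ('D', 'D'): 100.0}

    # Does 'Defect' still strictly dominate 'Cooperate' in utilons?
    defect_dominates = all(
        internal[("D", their)] > internal[("C", their)] for their in ("C", "D")
    )
    print("Defect strictly dominates Cooperate:", defect_dominates)  # True

With these assumptions the internal payoffs come out to 500, 0, 600, and 100 utilons, so ‘Defect’ beats ‘Cooperate’ against either response, which is exactly the dominance claimed above.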
Ah. Thanks! I think I get that.
But maybe I just think I do. I thought I understood that narrow part of Wei Dai’s post on a problem that maybe defeats TDT. I had no idea that compassion had already been considered and factored out of the problem. And that’s such common shared knowledge here in the LessWrong community that it need not be mentioned.
I have a lot to learn. I now see I was very arrogant to think I could contribute here. I should read the archives & wiki before I post. I apologize.
<<Begins to compute an estimated time to de-lurk. They collectively write several times faster than I can read, even if I don’t slow down to mull it over. Hmmm… >>