Your reasoning goes above and beyond UDT: it says you must always cooperate in the Prisoner’s Dilemma to avoid “driving net utility down”. I’m pretty sure you made a mistake somewhere.
Two things to say:
1. We’re talking about ethics rather than decision theory. If you want to apply the latter to the former then it makes perfect sense to take the attitude that “One util has the same ethical value, whoever that util belongs to. Therefore, we’re going to try to maximize ‘total utility’ (whatever sense one can make of that concept)”.
2. I think UDT does (or may do, depending on how you set it up) co-operate in a one-shot Prisoner’s Dilemma. (However, if you imagine a different game, “The Torture Game”, where you’re a sadist who gets 1 util for torturing while inflicting −100 utils on the victim, then of course UDT cannot prevent you from torturing. So I’m certainly not arguing that UDT, exactly as it is, constitutes an ethical panacea.)
Another random thought:
The connection between “The Torture Game” and the Prisoner’s Dilemma is actually very close: the Prisoner’s Dilemma is just A and B simultaneously playing the Torture Game, with A as torturer and B as victim and vice versa, and with neither able to tell the other whether they’ve chosen to torture until both have committed themselves one way or the other.
I’ve observed that UDT happily commits torture when playing The Torture Game, and (imo) being able to co-operate in a one-shot Prisoner’s Dilemma should be seen as one of the ambitions of UDT (whether or not it is ultimately successful).
So what about this then: Two instances of The Torture Game but rather than A and B moving simultaneously, first A chooses whether to torture and then B chooses. From B’s perspective, this is almost the same as Parfit’s Hitchhiker. The problem looks interesting from A’s perspective too, but it’s not one of the Standard Newcomblike Problems that I discuss in my UDT post.
I think, just as UDT aspires to co-operate in a one-shot PD i.e. not to torture in a Simultaneous Torture Game, so UDT aspires not to torture in the Sequential Torture Game.
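To make the simultaneous version concrete, here is a minimal sketch (assuming only the util values mentioned above: +1 to the torturer, −100 to the victim, 0 for abstaining) that enumerates the four outcomes of two parallel Torture Games and checks that they have the Prisoner’s Dilemma structure: torturing strictly dominates for each player, yet mutual restraint Pareto-dominates mutual torture.

```python
from itertools import product

# Util values from the comments above: the torturer gains 1 util, the victim
# loses 100 utils; not torturing changes nothing for either side.
TORTURER_GAIN = 1
VICTIM_LOSS = -100

def payoffs(a_tortures: bool, b_tortures: bool):
    """(A's utils, B's utils) when A and B each play one Torture Game,
    A as torturer of B and B as torturer of A, moving simultaneously."""
    a = (TORTURER_GAIN if a_tortures else 0) + (VICTIM_LOSS if b_tortures else 0)
    b = (TORTURER_GAIN if b_tortures else 0) + (VICTIM_LOSS if a_tortures else 0)
    return a, b

for a_act, b_act in product([True, False], repeat=2):
    print(a_act, b_act, payoffs(a_act, b_act))

# Prisoner's Dilemma structure:
# torturing strictly dominates for A whatever B does...
assert payoffs(True, True)[0] > payoffs(False, True)[0]    # -99 > -100
assert payoffs(True, False)[0] > payoffs(False, False)[0]  #  +1 > 0
# ...yet mutual restraint Pareto-dominates mutual torture.
assert payoffs(False, False) > payoffs(True, True)         # (0, 0) vs (-99, -99)
```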
If we’re talking about ethics, please note that telling the truth in my puzzles doesn’t maximize total utility either.
UDT doesn’t cooperate in the PD unless you see the other guy’s source code and have a mathematical proof that it will output the same value as yours.
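For illustration, here is a toy sketch of the kind of condition being described. It is not UDT itself, the function names are made up for the example, and it replaces “a mathematical proof that it will output the same value as yours” with the crudest possible stand-in, a syntactic check that the opponent’s source code is identical to your own.

```python
import inspect

# Toy illustration (not UDT itself): each player is a function from the pair of
# source texts to a move, and the crudest stand-in for "a proof that it outputs
# the same value as yours" is a syntactic check that the two sources are equal.
def source_checking_cooperator(my_source: str, opponent_source: str) -> str:
    return "C" if opponent_source == my_source else "D"

def defect_bot(my_source: str, opponent_source: str) -> str:
    return "D"

def play(bot_a, bot_b):
    src_a, src_b = inspect.getsource(bot_a), inspect.getsource(bot_b)
    return bot_a(src_a, src_b), bot_b(src_b, src_a)

print(play(source_checking_cooperator, source_checking_cooperator))  # ('C', 'C')
print(play(source_checking_cooperator, defect_bot))                  # ('D', 'D'): nothing to exploit
```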
A random thought, which once stated sounds obvious, but I feel like writing it down all the same:
One-shot PD = Two parallel “Newcomb games” with flawless predictors, where the players swap boxes immediately prior to opening.
Doesn’t make sense to me. Two flawless predictors that condition on each other’s actions can’t exist. Alice does whatever Bob will do, Bob does the opposite of what Alice will do, whoops, contradiction. Or maybe I’m reading you wrong?
Sorry—I guess I wasn’t clear enough. I meant that there are two human players and two (possibly non-human) flawless predictors.
So in other words, it’s almost like there are two totally independent instances of Newcomb’s game, except that the predictor from game A fills the boxes in the game B and vice versa.
Yes, you can consider a two-player game as a one-player game with the second player treated as an opaque part of the environment. In two-player games, ambient control is more apparent than in one-player games, but it’s also essential in Newcomb’s problem, which is why you make the analogy.
This needs to be spelled out more. Do you mean that if A takes both boxes, B gets $1,000, and if A takes one box, B gets $1,000,000? Why is this a dilemma at all? What you do has no effect on the money you get.
I don’t know how to format a table, but here is what I want the game to be:
A-action | B-action | A-winnings | B-winnings
2-box    | 2-box    | $1         | $1
2-box    | 1-box    | $1001      | $0
1-box    | 2-box    | $0         | $1001
1-box    | 1-box    | $1000      | $1000
Now compare this with Newcomb’s game:
A-action | Prediction | A-winnings
2-box    | 2-box      | $1
2-box    | 1-box      | $1001
1-box    | 2-box      | $0
1-box    | 1-box      | $1000
Now, if the “Prediction” in the second table is actually a flawless prediction of a different player’s action then we obtain the first three columns of the first table.
Hopefully the rest is clear, and please forgive the triviality of this observation.
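One way to check the observation is to mechanise the two tables: start from the Newcomb payoff rule in the second table, substitute B’s actual action for the prediction in A’s game (and vice versa), and confirm that this reproduces the first table. A minimal sketch, using only the dollar amounts above:

```python
# Newcomb payoff rule from the second table: what A wins, given A's action and
# the prediction used to fill A's boxes.
def newcomb_winnings(action: str, prediction: str) -> int:
    small_box = 1                                    # always contains $1
    big_box = 1000 if prediction == "1-box" else 0   # filled iff one-boxing was predicted
    return big_box + (small_box if action == "2-box" else 0)

# Two parallel Newcomb games with flawless predictors: the prediction in A's
# game is simply B's action, and vice versa.
print("A-action | B-action | A-winnings | B-winnings")
for a in ("2-box", "1-box"):
    for b in ("2-box", "1-box"):
        a_wins = newcomb_winnings(a, prediction=b)
        b_wins = newcomb_winnings(b, prediction=a)
        print(f"{a} | {b} | ${a_wins} | ${b_wins}")
```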
But that’s exactly what I’m disputing. At this point, in a human dialogue I would “re-iterate” but there’s no need because my argument is back there for you to re-read if you like.
Yes, and how easy it is to arrive at such a proof may vary depending on circumstances. But in any case, recall that I merely said “UDT-style”.
UDT doesn’t specify how exactly to deal with logical/observational uncertainty, but in principle it does deal with both. It doesn’t follow that if you don’t know how to analyze the problem, you should therefore defect. Human-level arguments operate on the level of simple approximate models, allowing for uncertainty in how they relate to the real thing; decision theories should apply to analyzing these models in isolation from the real thing.
This is intriguing, but sounds wrong to me. If you cooperate in a situation of complete uncertainty, you’re exploitable.
What’s “complete uncertainty”? How exploitable you are depends on who tries to exploit you. The opponent is also uncertain. If the opponent is Omega, you probably should be absolutely certain, because it’ll find the single exact set of circumstances that makes you lose. But if the opponent is also fallible, you can count on the outcome not being the worst-case scenario, and therefore not being able to estimate the value of that worst-case scenario is not fatal. An almost formal analogy is the analysis of algorithms in the worst case and the average case: worst-case analysis applies to an optimal opponent, average-case analysis to a random opponent, and in real life you should target something in between.
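As an illustration of the worst-case/average-case analogy (the payoff numbers and the cooperation probability below are made up for the example, not taken from anything above): a worst-case evaluation scores a move by what an optimal opponent would do to it, while an average-case evaluation scores it against the opponent you actually expect, and the two can be very far apart.

```python
# Hypothetical one-shot PD payoffs to "me", keyed by (my move, opponent's move).
payoff = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def worst_case(my_move):
    """Value against an optimal adversary (the Omega case): assume the worst response."""
    return min(payoff[(my_move, opp)] for opp in ("C", "D"))

def average_case(my_move, p_opponent_cooperates):
    """Expected value against a fallible opponent, modelled by a cooperation probability."""
    p = p_opponent_cooperates
    return p * payoff[(my_move, "C")] + (1 - p) * payoff[(my_move, "D")]

print(worst_case("C"))         # 0: the worst case for cooperating is being exploited
print(average_case("C", 0.8))  # 2.4: against a fallible opponent the expected outcome
                               # is nowhere near that worst case
```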
The “always defect” strategy is part of a Nash equilibrium. The quining cooperator is part of a Nash equilibrium. IMO that’s one of the minimum requirements that a good strategy must meet. But a strategy that cooperates whenever its “mathematical intuition module” comes up blank can’t be part of any Nash equilibrium.
“Nash equilibrium” is far from being a generally convincing argument. The mathematical intuition module doesn’t come up blank: it gives probabilities of different outcomes, given the present observational and logical uncertainty. When you have probabilities of the other player acting each way depending on how you act, the problem is pretty straightforward (assuming expected utility etc.), and “Nash equilibrium” is no longer a relevant concern. It’s when you don’t have a mathematical intuition module, and don’t have probabilities of the other player’s actions conditional on your actions, that you need to invent ad-hoc game-theoretic rituals of cognition.
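Concretely, the “pretty straightforward” computation looks something like this (a minimal sketch; the payoffs and the conditional cooperation probabilities are made-up stand-ins for whatever the mathematical intuition module actually outputs):

```python
# Illustrative one-shot PD payoffs to "me", keyed by (my move, opponent's move).
payoff = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

# Made-up output of a "mathematical intuition module": the probability that the
# other player cooperates, conditional on each of my possible moves.
p_coop_given_my_move = {"C": 0.9, "D": 0.1}

def expected_utility(my_move):
    """Expected payoff of my_move under the conditional probabilities above."""
    p = p_coop_given_my_move[my_move]
    return p * payoff[(my_move, "C")] + (1 - p) * payoff[(my_move, "D")]

for move in ("C", "D"):
    print(move, expected_utility(move))
# C: 0.9*3 + 0.1*0 = 2.7 and D: 0.1*5 + 0.9*1 = 1.4, so with these numbers the
# plain expected-utility comparison already favours cooperating, without any
# appeal to Nash equilibrium.
```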