Please stop using the words “rational” and “optimal”, and give me some sign that you’ve read the linked post on counterfactuals rather than asking counterfactual questions whose assumptions you refuse to spell out.
The only difficult question here concerns the imbalance in knowledge between Omega and a human, per shminux's comment. Because of this, I don't actually know what TDT does here (much less 'rationality').
Assumptions: The game uses the payoff matrix described in the OP, and the second player learns of the first player's move before making his own. Both players know that both players are trying to win and will not use a strategy that does not result in their winning.
My conclusion is that both players defect. My problem is that it would be better for player 2 if player 2 did not have the option to defect if player 1 cooperated.
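To make the backward-induction step behind that conclusion explicit, here is a minimal sketch in Python. It assumes a standard prisoner's-dilemma payoff ordering (T=5, R=3, P=1, S=0), since the OP's exact numbers aren't reproduced here; any matrix with that ordering gives the same result.

```python
# Minimal backward-induction sketch, assuming standard PD payoffs
# (T=5, R=3, P=1, S=0) -- the OP's actual numbers are not used here.
# PAYOFF[(my_move, their_move)] = (my_payoff, their_payoff)
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def player2_best_response(p1_move):
    """Player 2 moves second, having already seen player 1's move."""
    return max("CD", key=lambda m: PAYOFF[(m, p1_move)][0])

def player1_best_move():
    """Player 1 anticipates player 2's best response to each of his moves."""
    return max("CD", key=lambda m: PAYOFF[(m, player2_best_response(m))][0])

p1 = player1_best_move()
p2 = player2_best_response(p1)
print(p1, p2)  # D D -- both players defect
```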
I’ve thrown out cooperatebot and reverse quid pro quo as candidates for best strategy.
FYI: I'm using this as my reference, and my argument hinges on reflective inconsistency. I can't find a reflectively consistent strategy even with only two options available. (Note that defectbot consistently equals or outperforms quid pro quo in all cases.)
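As a quick check of that parenthetical claim, the sketch below enumerates player 2's four possible policies (cooperatebot, defectbot, quid pro quo, reverse quid pro quo) as functions of player 1's already-observed move, again under the assumed standard PD payoffs rather than the OP's actual matrix.

```python
# Compare player 2's four policies under assumed standard PD payoffs.
# PAYOFF[(p2_move, p1_move)] = (player2_payoff, player1_payoff)
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

P2_STRATEGIES = {
    "cooperatebot": lambda p1: "C",                                # always cooperate
    "defectbot": lambda p1: "D",                                   # always defect
    "quid pro quo": lambda p1: p1,                                 # copy player 1's move
    "reverse quid pro quo": lambda p1: "D" if p1 == "C" else "C",  # do the opposite
}

for name, policy in P2_STRATEGIES.items():
    payoffs = {p1: PAYOFF[(policy(p1), p1)][0] for p1 in "CD"}
    print(f"{name:21s}  vs p1=C: {payoffs['C']}  vs p1=D: {payoffs['D']}")

# defectbot scores (5, 1) against (C, D); quid pro quo scores (3, 1),
# so defectbot equals or outperforms quid pro quo against either move.
```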
Again, you don’t sound like you’ve read this post here. Let’s say that, in fact, “it would be better for player 2 if player 2 did not have the option to defect if player 1 cooperated”—though I’m not at all sure of that, when player 2 is Omega—and let’s say Omega uses TDT. Then it will ask counterfactual questions about what “would” happen if Omega’s own abstract decision procedure gave various answers. Because of the nature of the counterfactuals, these will screen off any actions by player 1 that depend on said answers, even ‘known’ actions.
You’re postulating away the hard part, namely the question of whether the human player’s actions depend on Omega’s real thought processes or if Omega can just fool us!
Which strategy is best does not depend on what any given agent decides the ideal strategy is.
I’m assuming only that both the human player and Omega are capable of considering a total of six strategies for a simple payoff matrix and determining which ones are best. In particular, I’m calling Löb’shit on the line of thought “If I can prove that it is best to cooperate, other actors will concur that it is best to cooperate” when used as part of the proof that cooperation is best.
I’m using TDT instead of CDT because I don’t want precommitment to become necessary or beneficial, and CDT has trouble explaining why to one-box when the boxes are transparent.