I want to discuss the predisposition part. My argument for human players depends on this. If I were going to predispose myself, that is, decide in advance to choose an option, then which option would I predispose myself to?
If the two players involved don’t have mutual access to each other’s source code, then how would they pick up on the predisposition? Well, if B is perfectly rational and has these preferences, then B is, for all intents and purposes, equivalent to a version of me with those preferences. So I engage in a game with B. Now, because B also knows that I am rational and have these preferences, my simulation of B (A*) would simulate me simulating B.
This leads to a self-referential algorithm which does not compute. Thus, at least one of us must predispose ourselves. Predisposition to defection leads to (D, D), and predisposition to cooperation leads to (C, C).
Since (C, C) > (D, D), the agents predispose themselves to cooperation.
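To make this concrete, here is a minimal Python sketch of the argument (the `Agent` class, function names, and recursion limit are my own illustration, not part of any formal model):

```python
# Toy model of mutual simulation (illustrative only).
class Agent:
    def __init__(self, name, predisposition=None):
        self.name = name
        self.predisposition = predisposition  # 'C', 'D', or None

def decide(me, other, depth=0, limit=100):
    """Return agent `me`'s move, obtained by simulating `other`."""
    if me.predisposition is not None:
        return me.predisposition          # the recursion bottoms out here
    if depth > limit:
        raise RecursionError("self-referential simulation never halts")
    # Equally rational agents mirror each other, so my move matches
    # my prediction of the other agent's move:
    return decide(other, me, depth + 1, limit)

a, b = Agent("A", predisposition="C"), Agent("B")
print(decide(a, b), decide(b, a))  # -> C C
# With predisposition=None on both sides, decide() raises RecursionError:
# the self-referential algorithm "does not compute".
```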
Remember that the agents update their choice based on how they predict the other agent would react to an intermediary decision step. Because they are equally rational, their decision-making processes mirror each other.
Thus A* is a high-fidelity prediction of B, and B* is a high-fidelity prediction of A.
You are assuming that all rational strategies are identical and deterministic. In fact, you seem to be using “rational” as a stand-in for “identical”, which reduces this scenario to the twin PD. But imagine a world where everyone makes use of the type of superrationality you are positing here—basically, everyone assumes people are just like them. Then any one person who switches to a defection strategy would have a huge advantage. Defecting becomes the rational thing to do. Since everybody is rational, everybody switches to defecting—because this is just a standard one-shot PD. You can’t get the benefits of knowing the opponent’s source code unless you know the opponent’s source code.
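To put toy numbers on this (standard one-shot PD payoffs; the specific values are just for illustration, and any T > R > P > S gives the same ordering):

```python
# Standard one-shot PD payoffs (assumed values for illustration).
T, R, P, S = 5, 3, 1, 0  # temptation, reward, punishment, sucker's payoff

# In a world of superrational cooperators, everyone scores R per game.
# A lone defector against a cooperator scores T instead:
print(T > R)  # True: unilateral defection pays
# But once everyone reasons that way, the outcome collapses to (D, D):
print(P < R)  # True: everyone is now worse off than under (C, C)
```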
In this case, I think the rational strategy is identical. If A and B are perfectly rational and have the same preferences, then even if they didn’t both know the above two facts, they would converge on the same strategy.
I believe that for any formal decision problem, a given level of information about that problem, and a given set of preferences, there is only one rational strategy (not a single choice, but a strategy: it may suggest a set of choices as opposed to any particular choice, but there is only one such strategy).
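As an illustration of a strategy that suggests a set of choices rather than a particular one, consider matching pennies: the unique rational strategy is to randomise 50/50, which licenses either choice. A sketch (the payoff encoding is my own):

```python
# Matching pennies: rows = my move, cols = opponent's move (zero-sum).
payoff = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}

def expected(p_heads, q_heads):
    """My expected payoff when I play H with prob p and opponent with prob q."""
    return sum(payoff[(m, o)]
               * (p_heads if m == "H" else 1 - p_heads)
               * (q_heads if o == "H" else 1 - q_heads)
               for m in "HT" for o in "HT")

# At p = 0.5 my payoff is 0 no matter what the opponent does:
print(expected(0.5, 0.0), expected(0.5, 1.0))  # -> 0.0 0.0
# Any other p can be exploited, so 50/50 is the unique rational strategy,
# yet it prescribes a set of choices (H or T), not a particular one.
```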
I speculate that everyone knows that if a single one of them switched to defection, then all of them would, so I doubt the switcher would gain any advantage.
However, I haven’t analysed how RDT works in prisoner’s dilemma games with n > 2 players, so I’m not sure.
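For what it’s worth, here is a guess at how the mirroring argument might extend to n players, using a toy public-goods payoff (my own assumption, since the n > 2 case hasn’t been worked out):

```python
# Toy n-player PD (public-goods style; payoffs are assumed, not derived).
def payoff(my_move, n_cooperators, n_players, b=3, c=1):
    """Each cooperator pays cost c; benefit b per cooperator is shared."""
    share = b * n_cooperators / n_players
    return share - (c if my_move == "C" else 0)

n = 10
# If every agent's choice is mirrored by all the others, the only
# reachable outcomes are all-C and all-D:
all_c = payoff("C", n, n)  # 3*10/10 - 1 = 2.0
all_d = payoff("D", 0, n)  # 0.0
print(all_c > all_d)       # True: mirrored agents predispose to C
# A unilateral defector among cooperators would score 3*9/10 = 2.7 > 2.0,
# but under mirroring no unilateral deviation exists.
```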