The part about two Predictors playing against each other reminded me of Robust Cooperation in the Prisoner’s Dilemma, where two agents with the algorithm “If I find a proof that the other player cooperates with me, cooperate, otherwise defect” are able to mutually prove cooperation and cooperate.
If we use that framework, Marion plays “If I find a proof that the Predictor fills both boxes, two-box, else one-box” and the Predictor plays “If I find a proof that Marion one-boxes, fill both, else only fill box A”. I don’t understand the math very well, but I think in this case neither agent finds a proof, and the Predictor fills only box A while Marion takes only box B—the worst possible outcome for Marion.
Marion’s third conditional might correspond to Marion only searching for proofs in PA, while the Predictor searches for proofs in PA+1, in which case Marion will not find a proof, the Predictor will, and then the Predictor fills both boxes and Marion takes only box B. But in this case clearly Marion has abandoned the ability to predict the Predictor and has given the Predictor epistemic vantage over her.
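To sanity-check that hunch, here’s a rough sketch of the usual trick for evaluating these proof-based agents: read “I find a proof that X” as “X holds at every earlier world of a finite Kripke chain”, where world 0 is the degenerate world at which everything is vacuously “provable”, and take whatever the agents settle on as the chain grows. The whole encoding, and every name in it, is mine rather than the paper’s or the thread’s, so treat it as an illustration rather than the real construction.

```python
N = 10  # length of the chain; these simple agents stabilize long before this


def provable(pred, their_history):
    # "I can prove pred(their move)": pred holds at every earlier world.
    # At world 0 the history is empty, so everything is vacuously "provable".
    return all(pred(x) for x in their_history)


def run(agent_a, agent_b):
    """Play two agents against each other along the chain and return the
    pair of moves they settle on."""
    hist_a, hist_b = [], []
    for _ in range(N + 1):
        a = agent_a(hist_b)  # each agent sees only the other's earlier moves
        b = agent_b(hist_a)
        hist_a.append(a)
        hist_b.append(b)
    return hist_a[-1], hist_b[-1]


# --- Prisoner's Dilemma agents (True = cooperate) ---
def fairbot(theirs):
    # cooperate iff provably the other player cooperates
    return provable(lambda x: x, theirs)


def cooperatebot(theirs):
    return True


def defectbot(theirs):
    return False


# --- Marion vs. the Predictor ---
# Marion: True = one-box, False = two-box.
# Predictor: True = fill both boxes, False = fill only box A.
def marion(theirs):
    # two-box iff provably the Predictor fills both boxes, else one-box
    return not provable(lambda x: x, theirs)


def predictor(theirs):
    # fill both iff provably Marion one-boxes, else fill only box A
    return provable(lambda x: x, theirs)


if __name__ == "__main__":
    print(run(fairbot, fairbot))       # (True, True): Löbian mutual cooperation
    print(run(fairbot, defectbot))     # (False, False)
    print(run(fairbot, cooperatebot))  # (True, True)
    print(run(marion, predictor))      # (True, False): Marion one-boxes, box B is empty
```

If I haven’t botched the encoding, the FairBot pairings come out the way the paper describes (mutual cooperation, defection against DefectBot, cooperation with CooperateBot), and the Marion/Predictor matchup settles on Marion one-boxing while the Predictor fills only box A, which at least agrees with the guess above.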
This would cooperate with CooperateBot (the algorithm that unconditionally plays “Cooperate”).
Yes. The one I described is the one the paper calls FairBot. It also defines PrudentBot, which looks for a proof that the other player cooperates with PrudentBot and a proof that the other player defects against DefectBot. PrudentBot defects against CooperateBot.
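If you want to poke at PrudentBot in the same toy chain as the sketch above, a crude version looks like this, modelling “provable in PA+1” as “holds at every earlier world except the degenerate world 0”. Again, the encoding and the helper names are mine, not the paper’s; it just reproduces the behaviour described above (cooperation with FairBot, defection against CooperateBot and DefectBot).

```python
N = 10  # length of the chain (helpers repeated so this runs on its own)


def chain(agent_a, agent_b):
    """Co-evolve two agents along the chain and return their full move histories."""
    hist_a, hist_b = [], []
    for _ in range(N + 1):
        a = agent_a(hist_b)
        b = agent_b(hist_a)
        hist_a.append(a)
        hist_b.append(b)
    return hist_a, hist_b


def provable(pred, hist):        # "PA proves pred of the other's move"
    return all(pred(x) for x in hist)


def provable_plus(pred, hist):   # crude stand-in for "PA+1 proves ...": skip world 0
    return all(pred(x) for x in hist[1:])


def fairbot(theirs):
    return provable(lambda x: x, theirs)


def cooperatebot(theirs):
    return True


def defectbot(theirs):
    return False


def prudentbot_vs(opponent):
    # PrudentBot has to reason about how `opponent` plays against DefectBot,
    # so precompute that match on the same chain.
    opp_vs_db, _ = chain(opponent, defectbot)

    def prudentbot(theirs):
        n = len(theirs)  # current world index
        coops_with_me = provable(lambda x: x, theirs)
        defects_vs_db = provable_plus(lambda x: not x, opp_vs_db[:n])
        return coops_with_me and defects_vs_db

    return prudentbot


if __name__ == "__main__":
    opponents = [("FairBot", fairbot), ("CooperateBot", cooperatebot), ("DefectBot", defectbot)]
    for name, opp in opponents:
        pb_hist, opp_hist = chain(prudentbot_vs(opp), opp)
        print(name, "C" if pb_hist[-1] else "D", "C" if opp_hist[-1] else "D")
    # Expected: FairBot C C / CooperateBot D C / DefectBot D D
```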
Yeah, after the first two conditionals return as non-halting, Marion effectively abandons trying to further predict the Predictor. After iterating through the non-halting stack, Marion will conclude that she’s better served by giving in to the partial blackmail and taking the million dollars than she is by trying to game the last $1000 out of the Predictor, given that her ideal outcome is gated behind an infinitely recursing function.