Either that, or the idea of mind reading agents is flawed.
We shouldn’t conclude that, since to various degrees mindreading agents already happen in real life.
If we tighten our standard to “games where the mindreading agent is only allowed to predict actions you’d choose in the game, which is played with you already knowing about the mindreading agent”, then many decision theories that differ in other situations might all respond to “pick B or I’ll kick you in the dick” by picking B.
Mindreading agents do happen in real life but they are often wrong and can be fooled. Most decision theories on this website don’t entertain either of these possibilities. If we allow “fooling a predictor” as a possible action then the solution to Newcomb’s problem is easy: simply fool the predictor and then take both boxes.
In Newcomb’s scenario, an agent who believes they have a 99.9% chance of being able to fool Omega should two-box. They’re wrong and will only get $1,000 instead of $1,000,000, but that’s a cost of having wildly inaccurate beliefs about the world they’re in, not a criticism of any particular decision theory.
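To put rough numbers on that, here’s a minimal Python sketch of the mistaken agent’s own expected-value calculation. The $1,000 / $1,000,000 payoffs are assumed from the standard statement of Newcomb’s problem, and the function names are just for illustration:

```python
# Assumed standard Newcomb payoffs:
#   transparent box: $1,000; opaque box: $1,000,000 if Omega predicted one-boxing, else $0.
SMALL, BIG = 1_000, 1_000_000

def ev_fool_and_two_box(p_fool):
    """Expected value, by the agent's own lights, of trying to fool Omega
    into predicting one-boxing and then taking both boxes."""
    return p_fool * (BIG + SMALL) + (1 - p_fool) * SMALL

def ev_honest_one_box(p_predicted):
    """Expected value of genuinely one-boxing, given the agent's confidence
    that Omega correctly predicts it."""
    return p_predicted * BIG

# The agent above: 99.9% sure the deception works.
print(ev_fool_and_two_box(0.999))  # ~1,000,000: ties honest one-boxing, and beats it
                                   # for any confidence above 99.9%
print(ev_honest_one_box(1.0))      # 1,000,000

# Reality, if Omega can't actually be fooled: the opaque box is empty.
print(ev_fool_and_two_box(0.0))    # 1,000
```

So the two-boxing really does follow from the agent’s beliefs; the $999,000 they lose is the price of those beliefs being wrong.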
Setting up a scenario in which the agent has true beliefs about the world isolates the effect of the decision theory for analysis, without mixing in a bunch of extraneous factors. Likewise for the fairness assumption, which says that the payoff distribution is correlated only with the agents’ strategies and not with the process by which they arrive at those strategies.
Violating those assumptions does allow a broader range of scenarios, but doesn’t appear to help in the evaluation of decision theories. It’s already a difficult enough field of study without throwing in stuff like that.
To entertain that possibility, suppose you’re X% confident that your best “fool the predictor into thinking I’ll one-box, and then two-box” plan will work, and Y% confident that your “actually one-box, in a way the predictor can predict” plan will work. If X ≥ Y you’ve got no incentive to actually one-box, only to try to pretend you will; but above some threshold of belief that the predictor might beat your deception, it makes sense to actually be honest.
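A quick sketch of where that threshold sits, under the same assumed standard payoffs and treating Y = 1 (the predictor always reads an honest one-boxer correctly) as a simplifying assumption:

```python
SMALL, BIG = 1_000, 1_000_000  # assumed standard Newcomb payoffs

def ev_attempt_deception(x):
    # x: confidence that "pretend to one-box, then take both" fools the predictor
    return x * (BIG + SMALL) + (1 - x) * SMALL

def ev_honest_one_box(y):
    # y: confidence that the predictor correctly reads a genuine one-boxer
    return y * BIG

# With y = 1 the breakeven is x = (BIG - SMALL) / BIG = 0.999: above it the
# deception plan has the higher expected value, below it honesty does.
for x in (0.9999, 0.999, 0.99, 0.5):
    print(f"x = {x:<6}  deceive: {ev_attempt_deception(x):>12,.0f}  honest: {ev_honest_one_box(1.0):>12,.0f}")
```

In other words, once you give the predictor more than about a 0.1% chance of beating your deception, genuinely one-boxing is the better plan.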