Newcomb’s Problem

Last edit: Feb 19, 2025, 10:14 PM by RobertM

Newcomb’s Problem is the original Newcomblike decision problem that inspired the creation of causal decision theory as distinct from evidential decision theory, spawning a vast array of philosophical literature in the process. It is sometimes called Newcomb’s Paradox (despite not being a paradox). The dilemma was originally formulated by William Newcomb, and presented to the philosophical community by Robert Nozick.

The original formulation of Newcomb’s Problem was as follows:

An alien named Omega has come to Earth, and has offered some people the following dilemma.

Before you are two boxes, Box A and Box B.

You may choose to take both boxes (“two-box”), or take only Box B (“one-box”).

Box A is transparent and contains $1,000.

Box B is opaque and contains either $1,000,000 or $0.

The alien Omega has already set up the situation and departed, but previously put $1,000,000 into Box B if and only if Omega predicted that you would one-box (take only the opaque Box B and leave Box A and its $1,000 behind).

Omega is an excellent predictor of human behavior. For the sake of quantifying this assertion and how we know it, we can assume e.g. that Omega has run 67 previous experiments and not been wrong even once. Since people are often strongly opinionated about their choices in Newcomb’s Problem, it isn’t unrealistic to suppose this is the sort of thing you could predict by reasoning about, e.g., a scan of somebody’s brain.
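One rough way to turn the 67-for-67 record into a number is Laplace's rule of succession. This is only a sketch: it assumes a uniform prior over Omega's accuracy and independent trials, neither of which is stated in the problem.

```latex
P(\text{next prediction correct} \mid 67 \text{ of } 67 \text{ correct})
  = \frac{67 + 1}{67 + 2}
  = \frac{68}{69}
  \approx 0.986
```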

Newcomb originally specified that Omega would leave Box B empty in the case that you tried to decide by flipping a coin; since this violates algorithm-independence, we can alternatively suppose that Omega can predict coinflips.

We may also assume, e.g., that Box A combusts if it is left behind, so nobody else can pick up Box A later; that Omega adds $1 of pollution-free electricity to the world economy for every $1 used in Its dilemmas, so that the currency does not represent a zero-sum wealth transfer; etcetera. Omega never plays this game with a person more than once.

The two original opposing arguments given about Newcomb’s problem were, roughly:

Argument for one-boxing: People who take only Box B tend to walk away rich. People who two-box tend to walk away poor. It is better to be rich than poor.
Argument for two-boxing: Omega has already made its prediction. Box B is already empty or already full. It would be irrational to leave behind Box A when this choice cannot cause Box B’s contents to change. It’s true that Omega has chosen to reward people with irrational dispositions in this setup, but Box B is now already empty, and taking only one box won’t change that.
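Both arguments can be checked against the same payoff table. The sketch below is illustrative only: the function names and the `accuracy` parameter (the chance that Omega's prediction matches the agent's action) are assumptions introduced here, not part of the original problem statement.

```python
import random

BOX_A = 1_000            # transparent box, always contains $1,000
BOX_B_FULL = 1_000_000   # opaque box, filled only if Omega predicted one-boxing


def payoff(action: str, prediction: str) -> int:
    """Dollars received for `action` given Omega's earlier `prediction`."""
    box_b = BOX_B_FULL if prediction == "one-box" else 0
    return box_b + (BOX_A if action == "two-box" else 0)


def average_winnings(action: str, accuracy: float,
                     trials: int = 100_000, seed: int = 0) -> float:
    """Average payoff for agents who always take `action`, against a
    hypothetical predictor that is right with probability `accuracy`."""
    rng = random.Random(seed)
    other = "two-box" if action == "one-box" else "one-box"
    total = 0
    for _ in range(trials):
        prediction = action if rng.random() < accuracy else other
        total += payoff(action, prediction)
    return total / trials


# Two-boxing argument: with the prediction held fixed, two-boxing
# always gains exactly the $1,000 in Box A.
for prediction in ("one-box", "two-box"):
    assert payoff("two-box", prediction) - payoff("one-box", prediction) == BOX_A

# One-boxing argument: against an accurate predictor, agents who one-box
# walk away with far more on average.
assert average_winnings("one-box", 0.99) > average_winnings("two-box", 0.99)
```

The two assertions do not conflict as arithmetic; the disagreement is over which comparison a rational agent should act on.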

For the larger argument of which this became part, see one of the introductions to logical decision theory. As of 2016, the most academically common view of Newcomb’s Problem is that it surfaces the split between evidential decision theories and causal decision theories, and that causal decision theory is correct. However, both that framing and that conclusion have been variously disputed, most recently by logical decision theories.


The more extensive Wikipedia page on Newcomb’s Problem may be found under “Newcomb’s Paradox”.

Replies by different decision theories

(This section does not remotely do justice to the vast literature on Newcomb’s Problem.)

Pretheoretic reactions

Well, by assumption, Omega is pretty good at predicting me, so I’d better take only Box B.
Omega’s already gone. I can’t possibly get any more money by leaving behind Box A.
I have free will, so Omega can’t predict me. This problem is paradoxical.
This is a silly dilemma; why would Omega do that? ^[1]

Evidential decision theory

Evidential decision theory can be seen as a form of decision theory that was originally written down by historical accident: the expected utility formula was written as if it conditioned using Bayesian updating, because Bayesian updating is usually the way we condition probability functions. Historically, though, evidential decision theory was explicitly named as such in an (arguably failed) attempt to rationalize the pretheoretic answer of “I expect to do better if I one-box” on Newcomb’s Problem.

On evidential decision theories, the principle of rational choice is to choose so that your act is the best news you could have received about your action; in other words, imagine being told that you had in fact made each of your possible choices, imagine what you would believe about the world in that case, and output the choice which would be the best news. Thus, evidential agents one-box on Newcomb’s Problem.
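The "best news" rule can be sketched as an expected-utility calculation that conditions on the action. This is a minimal illustration under two assumptions not in the text above: utility is linear in dollars, and Omega's prediction matches the action with a single probability `accuracy` (a hypothetical parameter).

```python
# (action, Box B is full) -> dollars received
PAYOFF = {
    ("one-box", True): 1_000_000,
    ("one-box", False): 0,
    ("two-box", True): 1_001_000,
    ("two-box", False): 1_000,
}


def edt_expected_utility(action: str, accuracy: float) -> float:
    """E[U | action]: treat the action as news about Omega's prediction.

    P(Box B full | one-box) = accuracy
    P(Box B full | two-box) = 1 - accuracy
    """
    p_full = accuracy if action == "one-box" else 1.0 - accuracy
    return (p_full * PAYOFF[(action, True)]
            + (1.0 - p_full) * PAYOFF[(action, False)])


def edt_choice(accuracy: float) -> str:
    """Pick the action whose conditional expected utility is highest."""
    return max(("one-box", "two-box"),
               key=lambda a: edt_expected_utility(a, accuracy))
```

Under these assumptions the evidential agent one-boxes whenever `accuracy` exceeds 0.5005, the point at which the two conditional expectations are equal.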

Although the EDT answer happens to conform with “the behavior of the agents that end up rich” on Newcomb’s Problem, LDT proponents note that it does not do so in general; see e.g. the transparent Newcomb’s Problem.

Causal decision theory

On causal decision theories, the principle of rational choice is to choose according to the causal consequences of your physical act; formally, to calculate expected utility by conditioning using a causal counterfactual. To choose, imagine the world as it is right up until the moment of your physical act; assume that your physical act changes without that changing anything else about the world up until that point; then imagine time running forward under what your model says are the rules or physical laws.

A causal agent thus believes that Box B is already empty, and takes both boxes. When they imagine the (counterfactual) result of taking only Box B instead, they imagine the world being the same up until that point in time (including Box B remaining empty) and then imagine the result of taking only Box B under physical laws past that point, namely, going home with $0.
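The causal calculation can be sketched by holding the agent's credence that Box B is full fixed across both actions, since the act cannot causally change the contents. The name `p_full` and the linear-in-dollars utility are assumptions made for this illustration.

```python
BOX_A = 1_000
BOX_B_FULL = 1_000_000


def cdt_expected_utility(action: str, p_full: float) -> float:
    """Causal expected utility of `action`.

    `p_full` is the agent's credence that Box B is already full. It is the
    same for both actions: intervening on the act leaves the past unchanged.
    """
    bonus = BOX_A if action == "two-box" else 0
    return p_full * (BOX_B_FULL + bonus) + (1.0 - p_full) * bonus


def cdt_choice(p_full: float) -> str:
    """Pick the action with the highest causal expected utility."""
    return max(("one-box", "two-box"),
               key=lambda a: cdt_expected_utility(a, p_full))
```

Whatever value `p_full` takes, two-boxing comes out exactly $1,000 ahead in this calculation. With `p_full = 0`, the case described above, one-boxing is valued at $0 and two-boxing at $1,000.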

Historically speaking, causal decision theory was first invented to justify two-boxing on Newcomb’s Problem; we can see CDT as formalizing the pretheoretic intuition, “Omega’s already gone, so I can’t get more money by leaving behind Box A.”

Logical decision theories

On logical decision theories, the principle of rational choice is “Decide as though you are choosing the logical output of your decision algorithm.” E.g., on timeless decision theory (TDT), our extended causal model of the world would include a logical proposition for whether the output of your decision algorithm is ‘one-box’ or ‘two-box’; and this logical fact would affect both Omega’s prediction of you, and your actual decision. Thus, an LDT agent prefers that its algorithm have the logical output of one-boxing.
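A toy version of this idea lets one logical fact, the output of the agent's algorithm, feed both Omega's prediction and the agent's action. The sketch assumes an idealized Omega that predicts by evaluating a deterministic algorithm; the function names are hypothetical and this is not a full formalization of any particular logical decision theory.

```python
from typing import Callable

Algorithm = Callable[[], str]  # returns "one-box" or "two-box"


def omega_predicts(algorithm: Algorithm) -> str:
    """Idealized predictor (assumption): Omega evaluates the agent's algorithm."""
    return algorithm()


def play(algorithm: Algorithm) -> int:
    """Run Newcomb's Problem for an agent implementing `algorithm`."""
    prediction = omega_predicts(algorithm)   # logical output -> prediction
    box_b = 1_000_000 if prediction == "one-box" else 0
    action = algorithm()                     # same logical output -> action
    return box_b + (1_000 if action == "two-box" else 0)


def ldt_choice() -> str:
    """Choose the algorithm output that leads to the best outcome when both
    the prediction and the action depend on that output."""
    outputs = ("one-box", "two-box")
    return max(outputs, key=lambda out: play(lambda: out))
```

Because the prediction and the action vary together, the comparison is between $1,000,000 for an output of one-boxing and $1,000 for an output of two-boxing.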


[1] Other Newcomblike problems may seem more naturally motivated, such as voting in elections, the Prisoner’s Dilemma, Parfit’s Hitchhiker, and the Absent-Minded Driver dilemma.


TLW Mar 4, 2022, 12:16 AM
“Similar issues arise if we imagine a skilled human psychologist who can predict other people’s actions with 65% accuracy.”
This statement hides a subtle problem.
Consider the case of an agent that can predict an arbitrary agent with at least 65% probability, running the following algorithm: predict what it itself will do, then do the opposite. It is trivial to show that said agent is self-contradictory.
The only case in which you can have a “skilled human psychologist who can predict other people’s actions with 65% accuracy” without similar self-contradictions is one where both a) there is no more than one such agent in existence, and b) said agent cannot predict themselves.
In which case we’ve essentially elevated said ‘skilled human psychologist’ to Oracle level (in the computational-oracle sense)...