This is not the halting problem. The halting problem is about the existence of a single algorithm that predicts all algorithms. Here, we are mostly interested in predicting two particular programs, Agent and Predictor.
However, the standard argument behind the halting problem can be applied to this case, giving interesting constructions. Agent can decide to diagonalize Predictor: simulate it and then do the opposite of whatever it predicts, making it impossible for Predictor to predict Agent. Similarly, Agent can decide to diagonalize itself (which is called having a chicken rule), by committing to do the opposite of whatever it predicts itself to do, should such a self-prediction ever arrive. This makes it impossible for Agent to predict what it’s going to do. (The details of correctly implementing the chicken rule are harder to formalize in the absence of oracles.)
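As a toy illustration (not a formalization), here is a minimal Python sketch of both moves. It assumes Predictor is an ordinary function returning a predicted action and that Agent is handed it as a callable; the action labels, the naive Predictor, and the `self_predict` hook are illustrative assumptions, not part of the original setup.

```python
def naive_predictor():
    """A Predictor that ignores Agent's code and always predicts one-boxing."""
    return "one-box"

def diagonalizing_agent(predictor):
    """Diagonalize Predictor: simulate it to completion, then do the opposite."""
    predicted = predictor()
    return "two-box" if predicted == "one-box" else "one-box"

def agent_with_chicken_rule(self_predict):
    """Chicken rule: if Agent ever predicts its own action, it commits to doing
    the opposite, so any such self-prediction is guaranteed to be wrong."""
    self_prediction = self_predict()          # None means no self-prediction was found
    if self_prediction is not None:
        return "two-box" if self_prediction == "one-box" else "one-box"
    return "one-box"                          # default policy when nothing was predicted

print(naive_predictor())                      # prediction: one-box
print(diagonalizing_agent(naive_predictor))   # actual action: two-box
print(agent_with_chicken_rule(lambda: None))  # no self-prediction found: one-box
```

Whatever this particular Predictor outputs, Agent's actual action differs from it; the chicken-rule variant applies the same trick to Agent's own self-predictions.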
It might seem that some questions similar to the halting problem are “unfair”: if Predictor sees that Agent is diagonalizing it, it understands Agent’s behavior fully, yet it can’t express that understanding, since it must state a particular action Agent is going to perform, which is impossible. What if, instead, Predictor states a dependence of Agent’s action on Predictor’s prediction? But then Agent might be able to diagonalize that as well. One solution is for Predictor to avoid saying or thinking its prediction in a way legible to Agent, and standard Newcomb’s problem tries to do exactly that by keeping the boxes opaque and Predictor’s algorithm unknown.
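To see why merely publishing a dependence does not escape the problem, here is a toy continuation of the sketch above, again with illustrative names. It assumes Agent can also recover Predictor's actual prediction (here it is simply passed in, e.g. because Agent simulated Predictor); Agent then falsifies the published claim at that point.

```python
def published_dependence(prediction):
    """Predictor's published claim: 'Agent does the opposite of whatever I predict.'"""
    return "two-box" if prediction == "one-box" else "one-box"

def agent_diagonalizing_dependence(dependence, actual_prediction):
    """Agent looks up what the claim says it will do at the actual prediction,
    then does the other thing, falsifying the published dependence."""
    claimed = dependence(actual_prediction)
    return "one-box" if claimed == "two-box" else "two-box"

actual_prediction = "one-box"
print(published_dependence(actual_prediction))                                   # claimed action: two-box
print(agent_diagonalizing_dependence(published_dependence, actual_prediction))   # actual action: one-box
```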
When Predictor is not an oracle and its algorithm is known to Agent, we might simply require that it fills the big box with money only if it succeeds in predicting that Agent one-boxes. In that case, Agent has an incentive to be predictable. If a decision theory prescribes behavior that makes it impossible for a particular Predictor to predict one-boxing, then an Agent following that decision theory goes home without the money. One example is the Agent Simulates Predictor (ASP) problem, where Predictor performs only a bounded computation and Agent can fully simulate Predictor’s reasoning, which tempts Agent into two-boxing whenever it sees Predictor predicting one-boxing. Another example is Transparent Newcomb’s problem, where Agent observes Predictor’s prediction directly, by seeing the contents of the boxes before making a decision.
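Here is a minimal sketch of the payoff rule just described, with illustrative dollar amounts and action labels; it is not meant as a canonical formalization of ASP or Transparent Newcomb. The point is only that an Agent whose decision procedure reacts to a foreseen “one-box” prediction by two-boxing can never reach the case where the big box is full.

```python
def payoff(action, prediction):
    """Big box is filled only if Predictor succeeds in predicting one-boxing,
    i.e. it predicts 'one-box' and Agent actually one-boxes."""
    big_box = 1_000_000 if (prediction == "one-box" and action == "one-box") else 0
    small_box = 1_000
    return big_box if action == "one-box" else big_box + small_box

print(payoff("one-box", "one-box"))   # 1000000: prediction succeeded, big box is full
print(payoff("two-box", "one-box"))   # 1000: prediction failed, big box is empty
print(payoff("two-box", "two-box"))   # 1000: predictably two-boxing also forfeits the big box
```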