One possibility would be to select a program which, every time it has to make a decision, estimates the utility that would result from each possible action and takes the action with the highest estimated utility.
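A minimal sketch of that kind of program (function names and the placeholder utility numbers are mine, not part of the original proposal) might look like:

```python
def choose_action(actions, estimate_utility):
    """Return the action whose estimated utility is highest."""
    return max(actions, key=estimate_utility)

if __name__ == "__main__":
    # Trivial illustration with two actions and made-up utility estimates.
    actions = ["one-box", "two-box"]
    utilities = {"one-box": 1_000_000, "two-box": 1_000}  # placeholder numbers
    print(choose_action(actions, utilities.get))
```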
Why doesn’t this work for Newcomb’s problem? What is the (expected) utility for one-boxing? For two-boxing? Which is higher?
The “problem” part is that the payoff depends not just on your action but also on Omega’s prediction: the utility of being predicted not to two-box is different from the utility of actually two-boxing. If your decision cannot influence the already-locked-in prediction (the default intuition behind CDT), it’s simple to correctly two-box and take your $1,001,000 (two-boxing gains you $1,000 whatever the prediction was). If your decision is invisibly constrained to match the prediction, then it’s simple to one-box and take your $1,000,000. Both are the maximum available outcomes, in different decision-causality situations.
But conditional on each possible decision, you can update your distribution over Omega’s prediction and calculate your EV accordingly. I guess that’s basically EDT, though.
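A quick sketch of how the two ways of computing EV diverge. The payoffs are the standard Newcomb numbers used above; the 0.5 prediction probability for the CDT-style calculation and the 0.99 predictor accuracy for the EDT-style one are arbitrary assumptions for illustration:

```python
# (action, predicted_one_box) -> dollars
PAYOFF = {
    ("one-box", True): 1_000_000,
    ("one-box", False): 0,
    ("two-box", True): 1_001_000,
    ("two-box", False): 1_000,
}

def cdt_ev(action, p_predicted_one_box):
    """CDT-style: the prediction is already locked in, so its probability
    does not depend on the action under consideration."""
    p = p_predicted_one_box
    return p * PAYOFF[(action, True)] + (1 - p) * PAYOFF[(action, False)]

def edt_ev(action, predictor_accuracy):
    """EDT-style: condition the prediction on the action actually taken."""
    p = predictor_accuracy if action == "one-box" else 1 - predictor_accuracy
    return p * PAYOFF[(action, True)] + (1 - p) * PAYOFF[(action, False)]

if __name__ == "__main__":
    for a in ("one-box", "two-box"):
        print(a, "CDT:", cdt_ev(a, 0.5), "EDT:", edt_ev(a, 0.99))
    # CDT: two-boxing comes out exactly $1,000 higher for any fixed p.
    # EDT (accuracy 0.99): one-boxing ~ $990,000 vs two-boxing ~ $11,000.
```

The only difference between the two functions is whether the probability of the prediction is held fixed or updated on the action, which is exactly where the two intuitions in the paragraph above come apart.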