If the agent is randomized (and the universe can’t observe this randomness), then this problem goes away. So it still seems (a bit) plausible that one can have an “optimal” decision theory, though formalizing this would be non-trivial. Intuitively, if UDT gets to condition on the output of a competitor X, then it ought to be able to produce an output at least as good as X’s on any extensional problem (and it is essentially the unique decision theory with this property).
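One way that intuition might be formalized (a sketch only; the notation UDT[X] and U_E is mine, not from the post):

```latex
% Conjectured optimality property (my notation): UDT[X] denotes UDT run with
% competitor X's output supplied as an additional input.
\[
  \mathbb{E}\!\left[ U_E\big(\mathrm{UDT}[X]\big) \right]
  \;\ge\;
  \mathbb{E}\!\left[ U_E\big(X\big) \right]
  \qquad \text{for every extensional problem } E \text{ with payoff } U_E
  \text{ and every competitor } X,
\]
% with UDT being (essentially) the unique decision theory satisfying this.
```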
The modal logic setting might be OK for exploring these questions, but I suspect the results would have to be “pulled back” to a more conventional environment in order to be widely appreciated.
(Sorry if this would be more appropriate as a comment on your follow-up post.)
Actually, drnickbone’s original LessWrong post introducing evil problems also gives an extension to the case you are considering: the evil decision problem gives the agent N ≥ 3 options and rewards the one that the “victim” decision theory assigns the least probability to (breaking ties lexicographically). Then no decision theory can put probability > 1/N on the action that is rewarded in its own probabilistic evil problem.
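A small Python sketch of that construction (the function name and the example distribution are mine): the rewarded option is the victim’s minimum-probability option, and no distribution over N options can put more than 1/N on its own minimum.

```python
# Evil problem with N >= 3 options: reward the option to which the victim
# decision theory assigns the least probability, breaking ties
# lexicographically (i.e. take the earliest such index).

def rewarded_option(victim_probs):
    """Index of the option the evil problem rewards, given the victim's
    probability distribution over its options."""
    # min() returns the first index attaining the minimum, which implements
    # the lexicographic tie-break.
    return min(range(len(victim_probs)), key=lambda i: victim_probs[i])

probs = [0.5, 0.3, 0.2]            # some victim policy over N = 3 options
i = rewarded_option(probs)
assert probs[i] <= 1 / len(probs)  # the minimum of N probabilities is <= 1/N
print(f"rewarded option: {i}; victim picks it with probability {probs[i]}")
```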
This setup plays a computational trick, and as a result I don’t think it violates the optimality standard I proposed. In order to decide what it should do, the CDT agent needs to think strictly longer than the UDT agent. But if the CDT agent thinks longer than the UDT agent, it’s totally unsurprising that it does better! (Basically, the problem just consists of a computational question which is chosen to be slightly too complex for the UDT agent. But the CDT agent is allowed to think as long as it likes. This entire family of problems appears to be predicated on the lack of computational limits for our agents.)
As a result, if the UDT agent is told what the CDT agent decides, then it can get the same performance as the CDT agent. This seems to illustrate that the CDT agent isn’t doing better by being wiser, just by knowing something the UDT agent doesn’t. (I wasn’t actually thinking about this case when I introduced the weakened criterion; the weakening is obviously necessary for UDT with 10 years of time to compete with CDT with 11 years of time, and I included it for that reason.)
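As a toy illustration of the last two paragraphs (entirely my own construction, with trivial stand-ins rather than real decision theories): the environment punishes whatever a time-bounded “UDT” agent outputs, so the “CDT” agent can only win by simulating it first, while a UDT agent told CDT’s decision matches CDT for free.

```python
# Toy illustration (my own, not the construction from the post): an
# environment that punishes whatever a time-bounded UDT agent outputs,
# so beating that agent requires first simulating it.

def udt_agent(budget):
    """Stand-in for a UDT agent limited to `budget` computation steps."""
    choice, steps = 0, 0
    while steps < budget:            # spends its whole budget deliberating
        choice = (choice + 1) % 2
        steps += 1
    return choice, steps

def cdt_agent(budget):
    """Stand-in CDT agent with no time limit: it simulates the UDT agent
    and then picks the other option, paying UDT's full cost plus more."""
    udt_choice, udt_steps = udt_agent(budget)
    return 1 - udt_choice, udt_steps + 1

def informed_udt_agent(budget, cdt_choice):
    """A UDT agent that is additionally told the CDT agent's decision can
    simply copy it, matching CDT's performance without thinking longer."""
    return cdt_choice

def environment(choice, udt_budget):
    """Pays 1 to any agent whose choice differs from the bounded UDT agent's."""
    udt_choice, _ = udt_agent(udt_budget)
    return 1 if choice != udt_choice else 0

BUDGET = 1000
u_choice, u_steps = udt_agent(BUDGET)
c_choice, c_steps = cdt_agent(BUDGET)
print("UDT payoff:", environment(u_choice, BUDGET), "using", u_steps, "steps")
print("CDT payoff:", environment(c_choice, BUDGET), "using", c_steps, "steps")
print("informed-UDT payoff:",
      environment(informed_udt_agent(BUDGET, c_choice), BUDGET))
# CDT wins, but only by thinking strictly longer than the UDT agent it beats;
# a UDT agent told CDT's decision does just as well with no extra thinking.
```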
Does this seem right? If so, is there a way to set up the problem that violates my weakened standard?
Incidentally, this problem involves a discontinuous dependence on UDT’s decision (both by the competitor and by the environment). I wonder if this discontinuous dependence is necessary?
Never mind: we get the same problem in the Newcomb’s problem case when the simulated agent is running UDT, i.e. where P(big box contains $1M) = P(the UDT agent 1-boxes after being told that the CDT agent 2-boxes).
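To spell out the continuity in that version (my own reconstruction): write p for the probability that the simulated UDT agent 1-boxes after being told that the CDT agent 2-boxes, so the big box contains the $1M with probability p. Then

```latex
\[
  \mathbb{E}[\text{1-box}] = p \cdot \$1{,}000{,}000,
  \qquad
  \mathbb{E}[\text{2-box}] = p \cdot \$1{,}000{,}000 + \$1{,}000,
\]
```

so the environment’s dependence on UDT’s (possibly mixed) decision is linear rather than discontinuous, and yet the problem reappears.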