jessicata comments on Asymptotic Decision Theory (Improved Writeup)

jessicata 27 Oct 2018 8:37 UTC
LW: 4 AF: 2

In the original ADT paper, the agents are allowed to output distributions over moves.

The fact that we take the limit as epsilon goes to 0 means the evil problem can’t be constructed, even if randomization is not allowed. (The proof in the ADT paper doesn’t work, but that doesn’t mean something like it couldn’t possibly work.)

It’s basically saying “since the two actions A and A′ get equal expected utility in the limit, the total variation distance between a distribution over the two actions and one of the actions limits to zero”, which is false.

You’re right, this is an error in the proof, good catch.
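
To make the gap concrete, here is a minimal sketch of a counterexample in the finite, non-asymptotic case. The utilities and the 50/50 mixture are made-up numbers chosen for illustration, not anything from the ADT paper: two actions with exactly equal utility, where the mixture still sits at total variation distance 0.5 from either pure action.

```python
def total_variation(p, q):
    """Total variation distance between two distributions over a finite set."""
    actions = set(p) | set(q)
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in actions)


def expected_utility(dist, utility):
    """Expected utility of a distribution over actions."""
    return sum(prob * utility[a] for a, prob in dist.items())


# Hypothetical numbers: both actions get exactly the same utility.
utility = {"A": 1.0, "A_prime": 1.0}
mixed = {"A": 0.5, "A_prime": 0.5}  # a distribution over the two actions
pure = {"A": 1.0}                   # one of the actions, taken with certainty

utility_gap = abs(expected_utility(mixed, utility) - expected_utility(pure, utility))
tv_distance = total_variation(mixed, pure)

print(utility_gap)  # 0.0
print(tv_distance)  # 0.5
```

So equal expected utility places no constraint on how far the mixed distribution is from a pure action.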

Re chicken: The interpretation of the embedder that I meant is “opponent only uses the embedder where it is up against [whatever policy you plugged in]”. This embedder does not get knocked down by the reality filter. Let $E_{t}$ be the embedder. The logical inductor expects $U_{t}$ to equal the crash/crash utility, and it also expects $E_{t}(\lceil \mathrm{ADT}_{\epsilon} \rceil)$ to equal the crash/crash utility. The expressions $U_{t}$ and $E_{t}(\lceil \mathrm{ADT}_{\epsilon} \rceil)$ are provably equal, so of course the logical inductor expects them to be the same, and the reality check passes.

The error in your argument is that you are embedding actions rather than agents. The fact that NeverSwerveBot and ADT both provably always take the straight action does not mean the embedder assigns them equal utilities.
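
A toy sketch of the distinction follows. It is not the logical-inductor construction: the payoff numbers, the `opponent_action` rule, and the lookup-table "agents" are all assumptions made up for illustration. The point is only that an embedder which takes the agent itself as input can assign different utilities to two agents that take the same action.

```python
STRAIGHT, SWERVE = "straight", "swerve"

# Hypothetical chicken payoffs to the embedded agent: (own move, opponent move).
PAYOFF = {
    (STRAIGHT, STRAIGHT): -10.0,  # crash/crash
    (STRAIGHT, SWERVE): 1.0,
    (SWERVE, STRAIGHT): -1.0,
    (SWERVE, SWERVE): 0.0,
}

# Both agents always go straight.
AGENT_ACTION = {"ADT": STRAIGHT, "NeverSwerveBot": STRAIGHT}


def opponent_action(agent_name):
    """Assumed opponent behaviour: it reacts to WHICH agent it faces,
    not just to the move that agent ends up making."""
    return SWERVE if agent_name == "NeverSwerveBot" else STRAIGHT


def embedder(agent_name):
    """Toy embedder: utility of plugging the named agent into the game."""
    return PAYOFF[(AGENT_ACTION[agent_name], opponent_action(agent_name))]


print(AGENT_ACTION["ADT"] == AGENT_ACTION["NeverSwerveBot"])  # True
print(embedder("ADT"))             # -10.0
print(embedder("NeverSwerveBot"))  # 1.0
```

An embedder that only saw the action `straight` could not separate the two cases.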
- Diffractor 29 Oct 2018 21:50 UTC
  LW: 3 AF: 2
  Wasn’t there a fairness/continuity condition in the original ADT paper that if there were two “agents” that converged to always taking the same action, then the embedder would assign them the same value? (More specifically, if $E_{t}(|A_{t} - B_{t}|) < \delta$, then $E_{t}(|E_{t}(A_{t}) - E_{t}(B_{t})|) < \epsilon$.) This would mean that it’d be impossible to have $E_{t}(E_{t}(\mathrm{ADT}_{t,\epsilon}))$ be low while $E_{t}(E_{t}(\mathrm{straight}_{t}))$ is high, so the argument still goes through.
  Although, after this whole line of discussion, I’m realizing that there are enough substantial differences between the original formulation of ADT and the thing I wrote up that I should probably clean up this post a bit and clarify more about what’s different in the two formulations. Thanks for that.
  - jessicata 29 Oct 2018 22:31 UTC
    LW: 2 AF: 1
    Yes, the continuity condition on embedders in the ADT paper would eliminate the embedder I meant. Which means the answer might depend on whether ADT considers discontinuous embedders. (The importance of the continuity condition is that it is used in the optimality proof; the optimality proof can’t apply to chicken for this reason).