johnswentworth comments on Are minimal circuits deceptive?

johnswentworth 7 Sep 2019 20:03 UTC
LW: 14 AF: 8
AF
Interesting proof. A couple off-the-cuff thoughts...
First, this is proving something much more general than just deceptiveness. We can just scribble out the three or four English sentences with the word “deceptive” in them, and just say ” $C (π, x)$ is a predicate”—the proof doesn’t actually use any properties of C. So this is a fairly general result about minimal circuits on MDPs—and I’m wondering what other predicates could be plugged into it to yield interesting results.
Second, the proof doesn’t actually use the assumption that the minimal circuit on the original set of tasks is performing search. All that matters is that there is some MDP on which the circuit is deceptive, and that the circuit is minimal subject to an average performance bound on the full set of tasks.
Third, we can state the overall conclusion as roughly “if there exists a set of MDPs for which the minimal circuit achieving some average performance on the set is deceptive on at least one of the MDPs, then there exists a single MDP whose minimal circuit is deceptive.” On the other hand, the reverse implication holds trivially: if there exists a single MDP whose minimal circuit is deceptive, then there exists a set of MDPs for which the minimal circuit achieving some average performance on the set is deceptive on at least one of the MDPs. Proof: take the set to contain just the one MDP whose minimal dircuit is deceptive. So we don’t just have a one-way implication here; we actually have equivalence between two open problems. Obvious next question: what other minimal-circuit problems are equivalent to these two?
- evhub 9 Sep 2019 18:50 UTC
  LW: 4 AF: 3
  AF Parent
  I agree that the proof here can be made significantly more general—and I agree that exploring that definitely seems worthwhile—though I also think it’s worth pointing out that the proof rests on assumptions that I would be a lot less confident would hold in other situations. The point of explaining the detail regarding search algorithms here is that it gives a plausible story for why the assumptions made regarding $π_{learn}$ and $T_{bad}$ should actually hold.