Rohin Shah comments on Occam’s Razor May Be Sufficient to Infer the Preferences of Irrational Agents: A reply to Armstrong & Mindermann

Rohin Shah 9 Oct 2019 7:11 UTC
LW: 3 AF: 2
That’s an accurate summary of what I’m saying.
at least, I don’t see any argument for it in A&M’s paper. It may still be true, but further argument is needed.
If you are picking randomly out of a set of N possibilities, the chance that you pick the “correct” one is 1/N. It seems like in any decomposition (whether planner/reward or initial conditions/dynamics), there will be N decompositions, with N >> 1, where I’d say “yeah, that probably has similar complexity to the correct one”. The chance that the correct one is also the simplest one out of all of these seems basically like 1/N, which is ~0.
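To make the 1/N intuition concrete, here is a minimal simulation sketch. It is not from A&M’s paper, and it assumes that the complexities of the N similar-complexity candidates are independent draws from the same distribution, which is just the “picking randomly” premise stated in code. The function name and the choice of index 0 as the “correct” candidate are hypothetical, for illustration only.

```python
import random

def chance_correct_is_simplest(n, trials=20_000, seed=0):
    """Estimate how often a designated 'correct' candidate has the lowest
    complexity among n candidates whose complexities are i.i.d. draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Assumption: complexities are i.i.d. uniform draws.
        # Index 0 plays the role of the "correct" decomposition.
        complexities = [rng.random() for _ in range(n)]
        simplest = min(range(n), key=complexities.__getitem__)
        if simplest == 0:
            hits += 1
    return hits / trials

# Compare the empirical frequency with 1/N for a few values of N.
for n in (2, 10, 100):
    print(n, chance_correct_is_simplest(n), 1 / n)
```

Under that assumption the estimated frequency tracks 1/N, so it shrinks toward 0 as N grows. If simplicity and correctness are correlated, the i.i.d. assumption fails and the estimate no longer applies.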
You could make an argument that we aren’t actually choosing randomly, and correctness is basically identical to simplicity. I feel the pull of this argument in the limit of infinite data for laws of physics (but not for finite data), but it just seems flatly false for the reward/planner decomposition.
- Daniel Kokotajlo 9 Oct 2019 22:45 UTC
  LW: 1 AF: 1
  I feel like there’s a big difference between “similar complexity” and “the same complexity.” Like, if we have theory T and then we have theory T* which adds some simple unobtrusive twist to it, we get another theory which is of similar complexity… yet realistically an Occam’s-Razor-driven search process is not going to settle on T*, because you only get T* by modifying T. And if I’m wrong about this then it seems like Occam’s Razor is broken in general; in any domain there are going to be ways to turn T’s into T*’s. But Occam’s Razor is not broken in general (I feel).
  Maybe this is the argument you anticipate above with “...we aren’t actually choosing randomly.” Occam’s Razor isn’t random. Again, I might agree with you that intuitively Occam’s Razor seems more useful in physics than in preference-learning. But intuitions are not arguments, and anyhow they aren’t arguments that appeared in the text of A&M’s paper.