Adam Jermyn comments on The Speed + Simplicity Prior is probably anti-deceptive

Adam Jermyn 12 May 2022 17:00 UTC
This is very interesting! A few thoughts/questions:
1. I didn’t quite follow the argument that H_{fh} beats H_{sd} on complexity. Is it that pointing to the base objective is more complicated than the logic of (simple mesaobjective) + (search logic to long-run optimize the mesaobjective)? If so, I worry a little that H_{sd} still has to learn a pointer to the base objective, if only so that it can perform well on it during training.
2. I actually think you can define a speed prior with a single long training episode. For an agent that plays chess the prior can be over thinking time per move. For an agent that runs in a simulated environment it could be ‘thinking time per unit simulation time’. For GPT it could be ‘thinking time per predicted word’, and so on.
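To make point 2 concrete, here is a minimal sketch of what I have in mind. It assumes the prior is an unnormalized log-weight that falls linearly with average thinking time per unit of output; the function name `speed_log_prior`, the linear form, and the `penalty` parameter are all hypothetical choices for illustration, not anything from the post.

```python
def speed_log_prior(thinking_steps, output_units, penalty=1.0):
    """Unnormalized log-prior penalizing average compute per unit of output.

    thinking_steps: total compute spent over the episode (hypothetical unit).
    output_units:   moves played, simulated time units, or predicted words.
    penalty:        assumed strength of the speed penalty.
    """
    if output_units <= 0:
        raise ValueError("output_units must be positive")
    # Normalizing by output makes the penalty independent of episode length.
    return -penalty * (thinking_steps / output_units)


# Two chess agents in one 40-move episode: 10 vs. 100 steps per move.
fast = speed_log_prior(thinking_steps=400, output_units=40)
slow = speed_log_prior(thinking_steps=4000, output_units=40)
assert fast > slow  # the faster-per-move agent gets more prior weight
```

Because the penalty depends only on the per-unit rate, one long episode and many short ones give the same score for the same thinking time per move.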