Nora Belrose comments on Seriously, what goes wrong with “reward the agent when it makes you smile”?

Nora Belrose 15 Aug 2022 13:52 UTC
6 points
I agree that AGI will need general-purpose problem-solving routines (by definition). I also agree that this requires something like recursive decomposition of problems into subproblems. I’m just very skeptical that the kinds of neural nets we’re training right now can learn to do anything remotely like that. I think it’s much more likely that people will hard-code this type of reasoning into the compute graph with stuff like MCTS. This has already been pretty useful for e.g. MuZero. Once we’re hard-coding search it’s less scary because it’s more interpretable and we can see exactly where the mesa-objective is.
I also don’t really buy the compactness argument at all. I think neural nets are biased toward flat minima / broad basins, but these don’t generally correspond to “simple” functions in the Kolmogorov sense; they’re more like equivalence classes of diverse bundles of heuristics that all get about the same training and validation loss. I’m interpreting this paper as providing some evidence in that direction.
- johnswentworth 15 Aug 2022 17:25 UTC
  2 points
  “I’m just very skeptical that the kinds of neural nets we’re training right now can learn to do anything remotely like that. I think it’s much more likely that people will hard-code this type of reasoning into the compute graph with stuff like MCTS. This has already been pretty useful for e.g. MuZero. Once we’re hard-coding search it’s less scary because it’s more interpretable and we can see exactly where the mesa-objective is.”
  I hope that you’re right; that would make Retargeting The Search very easy and basically eliminate the inner alignment problem. Assuming, of course, that we can somehow confidently rule out the rest of the net doing any search in more subtle ways.
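To make the "hard-coded search" picture concrete, here is a minimal sketch, assuming a toy deterministic environment. It uses a plain exhaustive depth-limited search rather than MCTS, and every name in it (`step`, `plan`, `ACTIONS`, the two objectives) is hypothetical and chosen only for illustration. The point is just that when the search loop is written by hand, the objective is an explicit argument that can be inspected or swapped out.

```python
from typing import Callable, Sequence, Tuple

State = int
Action = int

# Hypothetical action set for the toy environment.
ACTIONS: Tuple[Action, ...] = (-1, 0, 1, 2)


def step(state: State, action: Action) -> State:
    """Toy dynamics (an assumption): the action is simply added to the state."""
    return state + action


def plan(
    state: State,
    actions: Sequence[Action],
    objective: Callable[[State], float],
    depth: int,
) -> Tuple[float, Tuple[Action, ...]]:
    """Exhaustive depth-limited search.

    Returns the best achievable objective value after `depth` steps and
    the action sequence that reaches it. The objective is only ever
    evaluated at leaf states, so it is the single place where "what the
    search wants" enters the computation.
    """
    if depth == 0:
        return objective(state), ()
    best_value, best_plan = float("-inf"), ()
    for action in actions:
        value, tail = plan(step(state, action), actions, objective, depth - 1)
        if value > best_value:  # strict: ties keep the first plan found
            best_value, best_plan = value, (action,) + tail
    return best_value, best_plan


def near_ten(state: State) -> float:
    """One possible objective: end up as close to 10 as possible."""
    return -abs(state - 10)


def near_minus_three(state: State) -> float:
    """A different objective: end up as close to -3 as possible."""
    return -abs(state + 3)


if __name__ == "__main__":
    # Same search code, two different targets: only the argument changes.
    print(plan(0, ACTIONS, near_ten, depth=3))
    print(plan(0, ACTIONS, near_minus_three, depth=3))
```

Retargeting here amounts to passing a different `objective`. This says nothing about whether the learned components of a real system (the dynamics or value networks in something like MuZero) perform additional search internally, which is the caveat raised above.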