Dalcy comments on Trying to isolate objectives: approaches toward high-level interpretability

Dalcy 11 Jan 2023 18:57 UTC
2 points
0
One thing I imagine might be useful even in small training regimes would be to train on tasks where the only possible solution necessarily involves a search procedure, i.e. “search-y tasks”.
For example, it’s plausible that simple heuristics aren’t sufficient to reach superhuman level on tasks like Chess or Go, so superhuman RL performance on these tasks would be fairly good evidence that the model already has an internal search process.
But one problem with Chess or Go is that the objective is fixed by the game rules. So perhaps one way to effectively isolate objectives in small training regimes is to find tasks that are both “search-y” and can be modified to have modularly varying objectives, e.g. Chess, but with various possible game rules.
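To make the "same search, swappable objective" idea concrete, here is a minimal toy sketch. It is not Chess, and everything in it is an assumption for illustration: a hypothetical subtraction game (players alternate taking 1 to 3 stones from a pile) where the only thing that changes between variants is the value assigned to the terminal position. The names `solve`, `NORMAL`, and `MISERE` are made up for this example.

```python
from functools import lru_cache

# Hypothetical toy task: players alternate removing 1..max_take stones.
# The "objective" is just the value of facing an empty pile, so it can be
# swapped without touching the search procedure.
NORMAL = -1  # taking the last stone wins, so facing an empty pile is a loss
MISERE = +1  # taking the last stone loses, so facing an empty pile is a win


def solve(pile: int, terminal_value: int, max_take: int = 3) -> int:
    """Return +1 if the player to move wins with perfect play, else -1."""

    @lru_cache(maxsize=None)
    def value(n: int) -> int:
        if n == 0:
            return terminal_value
        # Negamax: my value is the best of the negated opponent values.
        return max(-value(n - k) for k in range(1, min(max_take, n) + 1))

    return value(pile)


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, solve(n, NORMAL), solve(n, MISERE))
```

The search code is identical across variants and only `terminal_value` differs, which is roughly the property you'd want from a task family if the goal is to see whether a trained model represents the objective separately from the search.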
- Jozdien 11 Jan 2023 19:21 UTC
  1 point
  0
  Parent
  Oh yeah, I agree. I was thinking more that small models would end up with heuristics even for some tasks that require search to do really well, because those tasks may admit slightly complex heuristics, learnable by models of that size, that allow okay performance relative to the low-power search they would otherwise be capable of. I agree that this could make a quantitative difference though, and I hadn’t thought explicitly of structuring the task along this frame, so thanks!