Oh yeah I agree—I was thinking more along the lines of that small models would end up with heuristics even for some tasks that require search to do really well, because they may have slightly complex heuristics learnable by models of that size that allow okay performance relative to the low-power search they would otherwise be capable of. I agree that this could make a quantitative difference though and hadn’t thought explicitly of structuring the task along this frame, so thanks!
Oh yeah I agree—I was thinking more along the lines of that small models would end up with heuristics even for some tasks that require search to do really well, because they may have slightly complex heuristics learnable by models of that size that allow okay performance relative to the low-power search they would otherwise be capable of. I agree that this could make a quantitative difference though and hadn’t thought explicitly of structuring the task along this frame, so thanks!