Something I wrote recently as part of a private conversation, which feels relevant enough to ongoing discussions to be worth posting publicly:
The way I think about it is something like: a “goal representation” is basically what you get when it’s easier to state some compact specification on the outcome state than it is to state an equivalent set of constraints on the intervening trajectories to that state.
In principle, this doesn’t have to equate to “goals” in the intuitive, pretheoretic sense. In practice, though, my sense is that this happens largely when (and because) permitting longer horizons (in the sense of increasing the length of the minimal sequence needed to reach some terminal state) causes the intervening trajectories to explode in number and complexity, such that it’s hard to impose meaningful constraints on those trajectories that don’t map to (and arise from) some much simpler description of the outcomes those trajectories lead to.
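To make the "explosion" concrete, here is a minimal sketch in a toy setting that is purely my own assumption for illustration: an agent on a grid starts at (0, 0), can only step right or up, and the terminal state is (n, n), so the minimal sequence has length 2n. The names `is_goal` and `count_minimal_trajectories` are hypothetical. The outcome specification stays a one-line predicate at every horizon, while the number of distinct minimal trajectories satisfying it grows combinatorially.

```python
from itertools import product
from math import comb

# Hypothetical toy world: start at (0, 0), moves are "R" (right) or "U" (up).
MOVES = {"R": (1, 0), "U": (0, 1)}

def is_goal(state, n):
    """Compact outcome specification: a single equality check."""
    return state == (n, n)

def count_minimal_trajectories(n):
    """Brute-force count of length-2n move sequences that end at (n, n)."""
    count = 0
    for seq in product(MOVES, repeat=2 * n):
        x = y = 0
        for move in seq:
            dx, dy = MOVES[move]
            x, y = x + dx, y + dy
        if is_goal((x, y), n):
            count += 1
    return count

if __name__ == "__main__":
    for n in range(1, 6):
        # The brute-force count matches the closed form C(2n, n).
        print(n, count_minimal_trajectories(n), comb(2 * n, n))
```

The predicate never gets longer as n grows, whereas any description that picks out the acceptable trajectories directly has to cope with C(2n, n) of them. In this toy case the trajectories still share an obvious regularity; the claim in the text is about environments where no such trajectory-level regularity exists apart from the outcome itself.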
This connects with the “reasoners compress plans” point, on my model, because a reasoner is effectively a way to map that compact specification on outcomes to some method of selecting trajectories (or rather, selecting actions which select trajectories); and that, in turn, is what goal-oriented reasoning is. You get goal-oriented reasoners (“inner optimizers”) precisely in those cases where that kind of mapping is needed, because simple heuristics relating to the trajectory instead of the outcome don’t cut it.
It’s an interesting question as to where exactly the crossover point occurs, where trajectory-heuristics stop functioning as effectively as consequentialist outcome-based reasoning. On one extreme, there are examples like tic-tac-toe, where it’s possible to play perfectly based on a myopic set of heuristics without any kind of search involved. But as the environment grows more complex, the heuristic approach will in general be defeated by non-myopic, search-like, goal-oriented reasoning (unless the latter is too computationally intensive to be implemented).
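As a rough illustration of the outcome-based end of that spectrum, here is a minimal sketch of exhaustive minimax search for tic-tac-toe. This is my own illustrative choice of algorithm, not something the argument depends on, and the board encoding (a 9-character string) and the names `winner`, `value`, and `best_move` are assumptions. The point is that the only task knowledge the searcher receives is the specification of terminal outcomes; it contains no per-move rules such as "take the centre" or "block a fork".

```python
from functools import lru_cache

# The eight winning lines on a board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Outcome specification: return "X" or "O" if a line is complete, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def other(player):
    return "O" if player == "X" else "X"

@lru_cache(maxsize=None)
def value(board, player):
    """Game value from X's perspective (+1 win, 0 draw, -1 loss), with `player` to move."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if " " not in board:
        return 0
    children = [value(board[:i] + player + board[i + 1:], other(player))
                for i, cell in enumerate(board) if cell == " "]
    return max(children) if player == "X" else min(children)

def best_move(board, player):
    """Select the action whose resulting position has the best searched value."""
    moves = [i for i, cell in enumerate(board) if cell == " "]
    def score(i):
        return value(board[:i] + player + board[i + 1:], other(player))
    return max(moves, key=score) if player == "X" else min(moves, key=score)

if __name__ == "__main__":
    print(value(" " * 9, "X"))          # value of the empty board under perfect play
    print(best_move("XX OO    ", "X"))  # X to move with two in the top row
```

In a game this small, the search and a hand-written heuristic table end up playing equally well, which is why tic-tac-toe sits on the heuristic-friendly extreme. The search only becomes the better option once the state space is too large for a compact set of move-level rules to cover, and it stops being an option at all once it is too large to enumerate.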
That last parenthetical adds a non-trivial wrinkle, and in practice reasoning about complex tasks subject to bounded computation does best via a combination of heuristic-based reasoning about intermediate states, coupled to a search-like process of reaching those states. But that already qualifies in my book as “goal-directed”, even if the “goal representations” aren’t as clean as in the case of something like (to take the opposite extreme) AIXI.
To me, all of this feels somewhat definitionally true (though not completely, since the real-world implications do depend on stuff like how complexity trades off against optimality, where the “crossover point” lies, etc.). It’s just that, in my view, the real world has already provided us enough evidence about this that our remaining uncertainty doesn’t meaningfully change the likelihood of goal-directed reasoning being necessary to achieve longer-term outcomes of the kind many (most?) capabilities researchers have ambitions about.