The phenomenon you call by names like “goals” or “agency” is one possible shadow of the deep structure of optimization—roughly, preimaging outcomes onto choices by reversing a complicated transformation.
I.e., if we were to pin down something we actually care about, that would be “a system exhibiting consequentialism”, because those are the kinds of systems that will end up shaping our lightcone and more. Consequentialism is convergent in an optimization process; it is the “deep structure of optimization”. Terms like “goals” or “agency” are shadows of consequentialism, finite approximations of this deep structure.
And by virtue of being finite approximations (e.g. they’re embedded), these “agents” have a bunch of convergent properties that make it easier for us to reason about the “deep structure” itself, e.g. modularity, having a world-model, etc. (see johnswentworth’s comment).
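To make the “preimaging outcomes onto choices” picture concrete, here is a minimal toy sketch (my own illustration, not anything from the quoted text): a brute-force “consequentialist” that inverts a simple world model by searching for the choices whose outcomes land in a target set. The world_model, is_desired, and the tiny action space are made-up stand-ins for the complicated transformation being reversed.

```python
from itertools import product

# Toy "world model": a transformation from a choice (a short sequence of moves
# on a number line) to an outcome (the final position reached).
def world_model(choice):
    return sum(choice)

# The outcome set the toy consequentialist cares about.
def is_desired(outcome):
    return outcome == 3

# "Preimaging outcomes onto choices": search the choice space for the choices
# whose image under the world model lands in the desired outcome set, i.e.
# brute-force the inverse of the transformation.
def preimage_of_desired_outcomes(horizon=3, moves=(-1, 0, 1)):
    return [
        choice
        for choice in product(moves, repeat=horizon)
        if is_desired(world_model(choice))
    ]

print(preimage_of_desired_outcomes())  # -> [(1, 1, 1)]
```

The brute-force search is of course not the point; the point is just what “reversing a complicated transformation” cashes out to in the simplest possible case.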
Edit: also important to note the following quote:

it is relatively unimportant to understand agency for its own sake or intelligence for its own sake or optimization for its own sake. Instead we should remember that these are frames for understanding these patterns that exert influence over the future.