I think another thing is: One could think of subagents as an inevitable consequence of bounded rationality.
In particular, you can’t have a preference for “a future state of the world”, because “a future state of the world” is too complicated to fit in your head. You can have a preference over “thoughts” (see §9.2.2 here), and a “thought” can involve attending to a particular aspect of “a future state of the world”. But if that’s how preferences are built, then you can immediately get into situations where you can have multiple preferences conflicting with each other—e.g. there’s a future state of the world, and if you think about it one way / attend to one aspect of it, it’s attractive, and if you think about it another way / attend to a different aspect of it, it’s aversive. And if you anthropomorphize those conflicting preferences (which is reasonable in the context of an algorithm that will back-chain from arbitrary preferences), you wind up talking about conflicting “subagents”.
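To make the "preferences over thoughts, not world states" point concrete, here's a minimal toy sketch (in Python, with all names and numbers invented purely for illustration): a preference only ever evaluates a world state *through one attended aspect*, so the same future state can come out attractive under one framing and aversive under another.

```python
from dataclasses import dataclass

# Toy sketch (names and numbers are made up for illustration):
# a full "world state" is too big to evaluate whole, so a preference
# only ever sees a "thought" -- the state viewed through one attended aspect.

@dataclass(frozen=True)
class Thought:
    state: dict   # the underlying future world state (as a bag of aspects)
    aspect: str   # which aspect of it we're currently attending to

def valence(thought: Thought) -> float:
    """A 'preference' assigns valence to a thought, not to the whole state."""
    return thought.state.get(thought.aspect, 0.0)

# One hypothetical future state, carrying both an attractive and an
# aversive aspect at the same time.
future_state = {"career_success": +0.8, "time_with_family": -0.6}

career_view = Thought(future_state, "career_success")
family_view = Thought(future_state, "time_with_family")

print(valence(career_view))  # +0.8: attractive when attended this way
print(valence(family_view))  # -0.6: aversive when attended this other way

# Anthropomorphizing each aspect-indexed preference -- each one "pulling"
# toward or away from the same future state -- is what the "subagents"
# framing amounts to in this toy picture.
```

This is just a cartoon of the claim in the paragraph above, not a model of any actual algorithm: the point it illustrates is only that once preferences attach to attended aspects rather than to whole states, conflicts between them fall out automatically.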