Unlike in subagent models, the subcomponents of agents are not themselves always well modeled as (relatively) rational agents. For example, there might be shards that are inactive most of the time and only activate in a few situations.
For what it’s worth, at least in my conception of subagent models, there can also be subagents that are inactive most of the time and only activate in a few situations. That’s probably the case for most of a person’s subagents, though of course “subagent” isn’t necessarily a concept that cuts reality at its joints, so this depends on where exactly you’d draw the boundaries for specific subagents.
Shard theory claims that the process that maps shards to actions can be modeled as making “bids” to a planner. That is, instead of shards directly voting on actions, they attempt to influence the planner in ways that have “historically increased the probability of executing plans” favored by the shard. For example, if the juice shard bringing a memory of consuming juice to conscious attention has historically led to the planner outputting plans where the baby consumes more juice, then the juice shard will be shaped via reinforcement learning to recall memories of juice consumption at opportune times. On the other hand, if raising the presence of a juice pouch to the planner’s attention has never been tried in the past, then we shouldn’t expect the juice shard to attempt this more so than any other random action.
This is another way in which shard theory differs from subagent models—by default, shards aren’t doing their own planning or search; they merely execute strategies that are learned via reinforcement learning.
This is actually close to the model of subagents that I had in “Subagents, akrasia, and coherence in humans” and “Subagents, neural Turing machines, thought selection, and blindspots”. The former post talks about subagents sending competing bids to a selection mechanism that picks the winning bid based on (among other things) reinforcement learning and the history of which subagents have made successful predictions in the past. It also distinguishes between “goal-directed” and “habitual” subagents, where “habitual” ones are mostly executing reinforced strategies rather than doing planning.
The latter post talks about learned rules which shape our conscious content, and how some of the appearance of planning and search may actually come from reinforcement learning creating rules that modify consciousness in specific ways (e.g. the activation of an “angry” subagent frequently causing harm, with reward then accruing to selection rules that block the activation of the “angry” subagent such as by creating a feeling of confusion instead, until it looks like there is a “confusion” subagent that “wants” to block the feeling of anger).
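To make the bidding picture above concrete, here is a minimal toy sketch in Python. It is not taken from either post or from Shard Theory. The names (`Subagent`, `select`, `reinforce`, `run`), the activation and reward numbers, and the additive weight update are all assumptions chosen for illustration. It also simplifies by attaching the learned weight to each subagent rather than to separate selection rules. The point is only that a purely reinforced selection mechanism can end up looking as if a “confusion” part “wants” to block anger, without any part doing planning.

```python
from dataclasses import dataclass


@dataclass
class Subagent:
    """Hypothetical part of the mind that bids for control (illustrative only)."""
    name: str
    weight: float = 1.0  # learned weight, adjusted by reinforcement

    def bid(self, activation: float) -> float:
        # Bid strength: how strongly the situation activates this part,
        # scaled by how well its past wins have paid off.
        return activation * self.weight


def select(subagents, activations):
    """Return the subagent with the highest bid (ties go to the first listed)."""
    return max(subagents, key=lambda s: s.bid(activations.get(s.name, 0.0)))


def reinforce(winner, reward, lr=0.5):
    """Move the winner's weight in the direction of the reward, clamped at zero."""
    winner.weight = max(0.0, winner.weight + lr * reward)


def run(episodes=5):
    angry = Subagent("angry")
    confused = Subagent("confused")
    parts = [angry, confused]
    # Assumed numbers: the situation activates anger slightly more strongly,
    # but acting on anger is punished while confusion is mildly rewarded.
    activations = {"angry": 1.0, "confused": 0.8}
    rewards = {"angry": -1.0, "confused": 0.2}
    history = []
    for _ in range(episodes):
        winner = select(parts, activations)
        reinforce(winner, rewards[winner.name])
        history.append(winner.name)
    return history, angry.weight, confused.weight


if __name__ == "__main__":
    history, angry_w, confused_w = run()
    print(history, angry_w, confused_w)
```

With these assumed numbers, anger wins the first episode, gets penalized, and confusion wins every episode after that. Neither part searches over plans; the apparent “goal” of blocking anger falls out of the weight updates alone.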
Thanks for the clarification!
I agree that your model of subagents in the two posts shares a lot of commonalities with parts of Shard Theory, and I should’ve done a lit review of your subagent posts. (I based my understanding of subagent models on some of the AI Safety formalisms I’ve seen as well as John Wentworth’s “Why Subagents?”.) My bad.
That being said, I think it’s a bit weird to have “habitual subagents”, since the word “agent” seems to imply some amount of goal-directedness. I would’ve classified your work as closer to Shard Theory than the subagent models I normally think about.
No worries!
Yeah, I did drift towards more generic terms like “subsystems” or “parts” later in the series for this reason, and might have changed the name of the sequence if only I’d managed to think of something better. (Terms like “subagents” and “multi-agent models of mind” still gesture away from rational agent models in a way that more generic terms like “subsystems” don’t.)