The idea of agent clusters seems closely related to my idea about modeling dynamically inconsistent agents. In my model, each subagent controls particular states. More general game-theoretic models can be considered, but they seem to have worse properties.
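To make "each subagent controls particular states" concrete, here is a minimal Python sketch under my own assumptions, not the exact model from the post: a toy deterministic MDP where disjoint sets of states are owned by subagents with different reward functions, and a joint policy is found by iterated best response. Every number and name (next_state, owner, the reward tables) is invented for illustration.

```python
# next_state[s][a]: successor of state s under action a.
next_state = [[1, 2], [0, 3], [3, 0], [2, 1]]
n_states, n_actions = 4, 2
gamma = 0.9

# owner[s]: which subagent controls the action choice at state s.
owner = [0, 0, 1, 1]

# Each subagent evaluates outcomes by its own reward function.
reward = [
    [1.0, 0.0, 0.0, 0.5],  # subagent 0
    [0.0, 1.0, 0.5, 0.0],  # subagent 1
]

def values(policy, r):
    """Discounted state values of a fixed deterministic policy under
    reward vector r, computed by iterating the Bellman equation."""
    v = [0.0] * n_states
    for _ in range(1000):
        v = [r[s] + gamma * v[next_state[s][policy[s]]]
             for s in range(n_states)]
    return v

def best_response(policy, i):
    """One greedy-improvement sweep for subagent i over the states it
    owns, holding the other subagents' action choices fixed."""
    v = values(policy, reward[i])
    new = list(policy)
    for s in range(n_states):
        if owner[s] == i:
            new[s] = max(range(n_actions),
                         key=lambda a: reward[i][s]
                         + gamma * v[next_state[s][a]])
    return new

# Iterate best responses until no subagent wants to deviate.  (Iterated
# best response can cycle in general games; it happens to converge on
# this toy example, but that is not guaranteed.)
policy = [0] * n_states
for _ in range(100):
    updated = policy
    for i in range(2):
        updated = best_response(updated, i)
    if updated == policy:
        break
    policy = updated

print("equilibrium policy:", policy)
```

A fixed point of this loop is exactly a state of affairs where no subagent can gain by changing the actions at the states it controls, which is one way to read "dynamic inconsistency" operationally.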
Regarding the broader question of how to bridge the gap between agents that are “ideal, perfectly rational consequentialists” and “realistic” agents (e.g. humans), some additional factors that can be relevant are:
Realistic agents have computational resource bounds.
Realistic agents might be learning algorithms with suboptimal sample complexity (i.e. it takes them longer to learn than a perfect agent).
Realistic agents might only remain coherent within some subspace of the state space (although that might be possible to model using dynamic inconsistency).
We can also consider agents that have some “Knightian uncertainty” about their own utility function. For example, we can consider a convex set in the space of utility functions, and have the agent follow the maximin policy w.r.t. this convex set. As a more specific example, we can consider an instrumental reward function that is only defined on some affine subspace of the instrumental state space, and consider all extensions of it to the entire instrumental state space that don’t increase its range.
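To illustrate the maximin idea only (not the affine-subspace construction), here is a brute-force Python sketch. It uses the fact that, for a fixed policy, the value is linear in the reward function, so the minimum over a convex set of reward functions is attained at an extreme point; representing the set by its vertices therefore suffices. The toy MDP and the reward vertices are invented for the example, and I restrict attention to deterministic stationary policies, which in general need not contain a maximin-optimal policy.

```python
import itertools

# Toy deterministic MDP (made up for illustration).
next_state = [[1, 2], [2, 0], [0, 1]]  # next_state[s][a]
n_states, n_actions = 3, 2
gamma = 0.9
start = 0

# Knightian uncertainty about the utility function: a convex set of
# reward vectors, represented by its extreme points.  For a fixed
# policy the value is linear in the reward, so the worst case over
# the convex set is attained at one of these vertices.
reward_vertices = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

def value(policy, r):
    """Discounted value of a deterministic policy from the start state."""
    v = [0.0] * n_states
    for _ in range(1000):
        v = [r[s] + gamma * v[next_state[s][policy[s]]]
             for s in range(n_states)]
    return v[start]

# Maximin: pick the policy whose worst-case value over the reward set
# is largest.  Brute force over all deterministic stationary policies;
# in general the maximin optimum may require randomization.
maximin_policy = max(
    itertools.product(range(n_actions), repeat=n_states),
    key=lambda pi: min(value(list(pi), r) for r in reward_vertices))

print("maximin policy:", list(maximin_policy))
```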