I only really know about the first bit, so have a comment about that :)
Predictably, when presented with the 1st-person problem I immediately think of hierarchical models. It’s easy to say “just imagine you were in their place.” What I think could actually do this is accessing or constructing a simplified model of the world (with primitives whose interpretations are as broad as “me” and “over there”) that is strongly associated with the verbal thought (EDIT: or, alternatively, is a high-level representation that cashes out to the verbal thought via a pathway ending in verbal imagination), and then cashing out that simplified model into a sequence of more detailed models/anticipations via fairly general model-cashing-out machinery.
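(If it helps to see the shape of what I mean, here’s a throwaway Python sketch of “retrieve a coarse model associated with the verbal thought, then cash it out into more detailed anticipations.” Every name and data structure in it is invented for illustration; it’s a toy, not a claim about the actual machinery.)

```python
# Toy sketch of "cash out a simplified model into more detailed anticipations."
# All names (Model, refine, cash_out, ...) are hypothetical illustrations.

from dataclasses import dataclass


@dataclass
class Model:
    """A model at some level of abstraction: a few coarse 'primitives'
    (e.g. "me", "over there") plus a level tag."""
    primitives: dict  # e.g. {"agent": "me", "location": "over there"}
    level: int        # 0 = most abstract; higher = more detailed


def retrieve_simplified_model(verbal_thought: str) -> Model:
    """Stand-in for accessing a coarse world-model strongly associated
    with a verbal thought like 'imagine you were in their place'."""
    # In this toy version the association is just hard-coded.
    return Model(primitives={"agent": "them", "viewpoint": "me-as-them"}, level=0)


def refine(model: Model) -> list[Model]:
    """Stand-in for the general model-cashing-out machinery: expand one
    coarse model into several more detailed models/anticipations."""
    return [
        Model(primitives={**model.primitives, "detail": d}, level=model.level + 1)
        for d in ("what they see", "what they want", "what they'll do next")
    ]


def cash_out(verbal_thought: str, depth: int = 2) -> list[Model]:
    """Repeatedly refine, producing progressively more detailed
    anticipations rooted in the verbal thought."""
    frontier = [retrieve_simplified_model(verbal_thought)]
    for _ in range(depth):
        frontier = [child for m in frontier for child in refine(m)]
    return frontier


if __name__ == "__main__":
    for anticipation in cash_out("imagine you were in their place"):
        print(anticipation.level, anticipation.primitives)
```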
I’m not sure this is general enough to capture how humans do it, though. When I think of humans at roughly this level of description, I usually picture many different generative models (a metaphor for a more continuous system with many principal modes, which is itself still a metaphor for the brain-in-itself) that are first evaluated in simple ways; the ones found interesting get broadcast and get to influence the current thought, while being evaluated in progressively more complex ways. So a verbal thought like “imagine you were in their place” can get sort of cashed out into imagination by activating related-seeming imaginings. This lacks the same notion of “models” as above; i.e., a context agent is still too agenty, and we don’t need the costly simplification of agentyness in our model to talk about learning from other people’s actions.
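(Again, just to gesture at the shape: a toy loop where many candidate imaginings get a cheap score, the most interesting one gets broadcast and influences the current thought, and only then gets evaluated more expensively. Everything here is made up for illustration, not a model of the brain.)

```python
# Toy sketch of "many generative models, cheap evaluation first,
# broadcast the interesting ones, then evaluate more carefully."
# All names and numbers are invented for illustration.

import random


class Imagining:
    """One candidate generative model / imagining competing for influence."""

    def __init__(self, name: str, relatedness: float):
        self.name = name
        self.relatedness = relatedness  # crude association with the current cue

    def cheap_score(self) -> float:
        """Fast, simple evaluation: is this even worth attending to?"""
        return self.relatedness + random.uniform(-0.1, 0.1)

    def expensive_evaluation(self) -> float:
        """Placeholder for the progressively more complex evaluation a
        broadcast imagining receives once it influences the current thought."""
        return self.relatedness ** 2


def step(cue: str, candidates: list[Imagining]) -> Imagining:
    """One cycle: cheaply score all candidates, broadcast the winner,
    then evaluate it more deeply."""
    winner = max(candidates, key=lambda c: c.cheap_score())
    print(f"cue {cue!r} broadcasts {winner.name!r} "
          f"(deeper evaluation: {winner.expensive_evaluation():.2f})")
    return winner


if __name__ == "__main__":
    cue = "imagine you were in their place"
    candidates = [
        Imagining("being in a crowd watching them", 0.4),
        Imagining("standing where they stand", 0.8),
        Imagining("unrelated memory of lunch", 0.1),
    ]
    step(cue, candidates)
```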
Plus, that doesn’t get into how to pick out which simplified models to learn from. You can probably guess better than I can whether humans do something innate that involves tracking human-like objects and then feeling sympathy for them. And I think I’ve seen you make an argument that something similar could work for an AI, but I’m not sure. (Would a Bayesian updater have less of the path-dependence that the safety of such innate learning seems to rely on?)