If we expect to gain something from studying how humans implement these processes, it’d have to be something like ensuring that our AIs understand them “in the same way that humans do,” which e.g. might help our AIs generalize in a similar way to humans.
I take your point that there is probably nothing special about the specific way(s) that humans get good at predicting other humans. I do think that “help[ing] our AIs generalize in a similar way to humans” might be important for safety (e.g., we probably don’t want an AGI that figures out its programmers way faster/more deeply than they can figure it out). I also think it’s the case that we don’t currently have a learning algorithm that can predict humans as well as humans can predict humans. (Some attempts, but not there yet.) So to the degree that current approaches are lacking, it makes sense to me to draw some inspiration from the brain-based algorithms that already implement these processes extremely well, i.e., to first understand these algorithms, and to later develop training goals in accordance with the heuristics/architecture these algorithms seem to instantiate.
This is notably in contrast to affective empathy, though, which is not something that’s inherently necessary for predictive accuracy—so figuring out how/why humans do that has a more concrete story for how that could be helpful.
Agreed! I think it’s worth noting that if you take seriously the ‘hierarchical IRL’ model I proposed in the ToM section, understanding the algorithm(s) underlying affective empathy might actually require understanding cognitive and affective ToM (i.e., if these are the substrate of affective empathy, we’ll probably need a good model of them before we can have a good model of affective empathy).
And wrt learning vs. online learning, I think I’m largely in agreement with Steve’s reply. I would also add that this might end up just being a terminological dispute depending on how flexible we are with calling particular phases “training” vs. “deployment.” E.g., is a brain “deployed” when the person’s genetic make-up as a zygote is determined? Or is it when they’re born? When their brain stops developing? When they learn the last thing they’ll ever learn? To the degree we think these questions are awkward/their answers are arbitrary, I would think this counts as evidence that the notion of “online learning” is useful to invoke here/gives us more parsimonious answers.