Charlie Steiner comments on Towards empathy in RL agents and beyond: Insights from cognitive science for AI Alignment

Charlie Steiner 5 Apr 2023 2:49 UTC
3 points
Interesting talk, though I skimmed a fair bit of it once I felt like the experiments weren’t telling me what I wanted. I think the crucial thing to try to do here, to show me that you’re seeing something interesting, is interventions. Can you increase or decrease some measure of empathy without changing orthogonal metrics very much?
The second thing to do is to not use a toy NN model on a gridworld. The early results are suggestive, but I’m really wary of generalizing suggestive results from 3x3 gridworlds to the much-more-than-3x3 real world. So I’m looking forward to future scaled-up work.