Very interesting, thanks. I’m unconvinced that the motivational aspects of empathy are common in learning algorithms that look like gradient descent—if flinching when someone else is hurt doesn’t harm your reproductive fitness then maybe it’s easy for evolution to stick with it, but substantively changing your plans to avoid causing that flinch (as in the rats not shocking other rats) should rise to the attention of gradient descent and get massaged out.
My prediction is that there really is an evolved nudge towards empathy in the human motivational system, and that human psychology—like usually being empathetic but sometimes modulating it and often justifying self-serving actions—is sculpted by such evolved nudges, and wouldn’t be recapitulated in AI lacking those nudges.
I agree—this is partly what I am trying to say in the contextual modulation section. The important point is that the base capability for empathy might exist as a substrate which gradient descent / evolution can then sculpt to implement a wide range of adaptive pro- or anti-social emotions/behaviours. Which of these behaviours, if any, get used by the AI will depend on the reward function / training data it sees.