I dunno, I think you can generalize reward further than behavior. E.g. I might very reasonably issue high reward for winning a game of chess, or arriving at my destination safe and sound, or curing malaria, even if each involved intermediate steps that don’t make sense as ‘things I might do.’
I do agree there are limits to how much extrapolation we actually want; I just think there’s a lot of headroom for AIs to achieve ‘normal’ ends via ‘abnormal’ means.
I would be interested in what the uncertain imitator’s questions would look like in these cases.