Why does RL necessarily mean that AIs are trained to plan ahead?
I explain it in more detail in my original post.
In short, in standard language modeling the model only tries to predict the most likely immediate next token (T1), then the most likely token after that (T2) given T1, and so on; whereas in RL it's trying to optimize a whole sequence of next tokens (T1, …, Tn), such that the rewards earned by all the later tokens (up to Tn) are taken into account when scoring the choice of the immediate next token (T1).
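To make the contrast concrete, here is a minimal toy sketch (my own illustration, using a REINFORCE-style surrogate objective rather than whatever exact setup a given lab uses): in the LM case each token's log-probability stands on its own, while in the RL case each token's log-probability is weighted by the return, i.e. the sum of rewards from that step through the end of the sequence.

```python
# Toy illustration (assumed notation, not from the original post):
# log_probs[t] = log-probability the model assigned to the token it emitted at step t
# rewards[t]   = reward received at step t

def lm_objective(log_probs):
    # Standard language modeling: each token is scored only as the most likely
    # immediate next token; no reward from later steps enters the picture.
    return sum(log_probs)

def rl_objective(log_probs, rewards, gamma=1.0):
    # REINFORCE-style surrogate: the log-prob at step t is weighted by the
    # (discounted) return from t through Tn, so the very first token T1 is
    # credited with rewards earned all the way out to the end of the sequence.
    n = len(log_probs)
    total = 0.0
    for t in range(n):
        return_t = sum(gamma ** (k - t) * rewards[k] for k in range(t, n))
        total += log_probs[t] * return_t
    return total
```

The upshot is that gradient ascent on the RL objective pushes up the probability of early tokens in proportion to the rewards the whole trajectory eventually collects, which is the sense in which the model is being trained to "plan ahead" rather than just pick the locally most likely next token.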