Not important, but I don’t think RLHF can qualify as model-based RL. We usually use PPO in RLHF, and it’s a model-free RL algorithm.
I just meant that the usual RLHF setup is essentially RL in which the reward is provided by a learned model, but I agree that I was stretching the way the terminology is normally used.
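To make the distinction concrete, here is a minimal toy sketch of what I mean (the whole setup is my own illustration, not any real RLHF codebase, and plain REINFORCE stands in for PPO since both are model-free): the reward signal comes from a frozen learned model, but the policy update itself never learns or queries a dynamics model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy "reward model": a frozen network standing in for a model
# trained on human preferences (purely illustrative).
torch.manual_seed(0)
reward_model = nn.Linear(4, 1)
for p in reward_model.parameters():
    p.requires_grad_(False)

# Toy "policy": a distribution over 4 discrete actions.
policy_logits = nn.Parameter(torch.zeros(4))
optimizer = torch.optim.Adam([policy_logits], lr=0.1)

for step in range(200):
    dist = torch.distributions.Categorical(logits=policy_logits)
    action = dist.sample()

    # The reward is the learned model's score of the action...
    action_vec = F.one_hot(action, num_classes=4).float()
    reward = reward_model(action_vec).squeeze()

    # ...but the policy update is model-free (REINFORCE here, as a
    # simplified stand-in for PPO): no dynamics model is learned or
    # used anywhere in the loop.
    loss = -dist.log_prob(action) * reward.detach()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

So the only learned "model" in the loop is the reward model; nothing predicts state transitions, which is the usual criterion for calling an algorithm model-based.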