Reinforcement* learning from human feedback
Well, I thought about that, but I wasn't sure whether reinforcement learning from human feedback isn't just a strict subset of reward learning from human feedback. If "reinforcement" is indeed the strict definition then I concede, but I don't think it makes sense.
The acronym is definitely used for reinforcement learning. [“RLHF” “reinforcement learning from human feedback”] gets 564 hits on Google; [“RLHF” “reward learning from human feedback”] gets 0.
Thanks for verifying! I retract my comment.
I think historically "reinforcement" has been used more in that particular context (see e.g. the Deep RL from Human Preferences paper), but as I noted, I find "reward learning" more apt, since it points to the hard part being the reward learning, i.e. distilling human feedback into an objective, rather than the optimization of any given reward function (which technically need not involve reinforcement learning).
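To make the distinction concrete, here is a minimal sketch of the two stages (my own illustration; the toy responses, scores, and the best-of-n step are just placeholders): reward learning distills pairwise human preferences into a scalar objective via a Bradley-Terry style loss, and the learned reward can then be optimized without reinforcement learning at all, e.g. by best-of-n sampling.

```python
# Minimal sketch (illustrative only, not from any specific codebase) of the two
# stages: (1) reward learning -- distilling pairwise human preferences into a
# scalar objective -- and (2) optimizing that learned reward, here without RL.

import math
import random

def bradley_terry_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the human-preferred response scores higher."""
    return -math.log(1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected))))

# Stage 1: "reward learning" -- fit r_theta so preferred responses score higher.
# (Here r_theta is a lookup table; in practice it is a neural reward model.)
preferences = [("a good answer", "a bad answer"), ("a good answer", "a worse answer")]
r_theta = {resp: 0.0 for pair in preferences for resp in pair}
lr = 0.5
for _ in range(100):
    for chosen, rejected in preferences:
        p = 1.0 / (1.0 + math.exp(-(r_theta[chosen] - r_theta[rejected])))
        # gradient step on the Bradley-Terry loss w.r.t. the two scores
        r_theta[chosen] += lr * (1.0 - p)
        r_theta[rejected] -= lr * (1.0 - p)

# Stage 2: optimize against the learned reward *without* reinforcement learning,
# e.g. best-of-n sampling from a fixed policy.
def policy_sample() -> str:
    return random.choice(list(r_theta))

best = max((policy_sample() for _ in range(16)), key=lambda y: r_theta[y])
print("selected:", best, "| learned scores:", r_theta)
```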