Ran W

Karma: 17

Why do we need RLHF? Imitation, Inverse RL, and the role of reward

Ran W, Feb 3, 2024, 4:00 AM
16 points
0 comments, 5 min read