David James comments on Do models say what they learn?

David James 28 Mar 2025 1:19 UTC
1 point

“The recent rise of reinforcement learning (RL) for language models introduces an interesting dynamic to this problem.”

Saying “recent rise” feels wrong to me. In any case, it is vague; it would be better to state the details. What do you consider to be the first LLM? The first use of RLHF with an LLM? My answers would probably be 2018 (BERT) and 2019 (OpenAI), respectively.