It’s strange that he doesn’t mention DeepSeek-R1-Zero anywhere in that blog post, since it is arguably the most important development DeepSeek announced (self-play RL on reasoning models). R1-Zero is what stuck out to me in DeepSeek’s papers, and, for example, the ARC Prize team behind the ARC-AGI benchmark says:
“R1-Zero is significantly more important than R1.”
Was R1-Zero already obvious to the big labs, or is Amodei deliberately underemphasizing that part?