Vladimir_Nesov comments on Bogdan Ionut Cirstea’s Shortform

Vladimir_Nesov 28 Nov 2024 22:22 UTC
8 points
0
From a proliferation perspective, it reduces overhang: it makes it more likely that Llama 4 gets long-reasoning-trace post-training in-house rather than later, so initial capability evaluations give more relevant results. But if Llama 4 is already training, there might not be enough time for the technique to mature, and Llamas have been quite conservative in their techniques so far.