It’s unbelievable how similar/convergent the big LLMs are. Only a slight improvement with 100x compute?? People have much bigger differences with much less variation in the core inputs (e.g. number of neurons). I wonder what the best explanation is. I can think of a few mediocre explanations.