As far as I can tell, Sam is saying no to size. That does not mean saying no to compute, data, or scaling.
“Hundreds of complicated things” comment definitely can’t be interpreted to be against transformers, since “simply” scaling transformers fits the description perfectly. “Simply” scaling transformers involves things like writing a new compiler. It is simple in strategy, not in execution.
As far as I can tell, Sam is saying no to size. That does not mean saying no to compute, data, or scaling.
“Hundreds of complicated things” comment definitely can’t be interpreted to be against transformers, since “simply” scaling transformers fits the description perfectly. “Simply” scaling transformers involves things like writing a new compiler. It is simple in strategy, not in execution.
This seems to insinuate a cool down in scaling compute and Sam previously acknowledged that the data bottleneck was a real roadblock.