More predictions/insights from Jimmy and crew. He’s implying (as I have also been saying) that some people are far too focused on scale over training data and architectural improvements. IMO, the bitter lesson is a thing, but I think we’ve over-updated on it.
Relatedly, someone shared a new 13B model that is apparently comparable to GPT-4 in logical reasoning (based on benchmarks, which I don’t usually trust too much). Note that the model is a solver-augmented LM.
Here’s some context regarding the paper: