Something worth reemphasizing for folks not in the field is that these benchmarks are not like usual benchmarks, where you train the model on the task and then see how well it does on a held-out set. Chinchilla was not explicitly trained on any of these problems. It's typically given some context like:
“Q: What is the southernmost continent?
A: Antarctica
Q: What is the continent north of Africa?
A:”
and then simply completes the prompt until a stop token is emitted, like a newline character.
And it performs above the average human on these benchmarks.
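For concreteness, here is a minimal sketch of what that evaluation loop looks like in Python. The `generate_token` function is a hypothetical stand-in for the model's next-token sampler (Chinchilla's actual interface isn't public); everything else is just the decoding loop described above.

```python
# A minimal sketch of few-shot evaluation: no task-specific training,
# just a prompt completed token by token until a stop token appears.

FEW_SHOT_PROMPT = (
    "Q: What is the southernmost continent?\n"
    "A: Antarctica\n"
    "Q: What is the continent north of Africa?\n"
    "A:"
)

def complete(prompt, generate_token, stop="\n", max_tokens=20):
    """Extend the prompt one token at a time until the stop string
    (here, a newline) is emitted or a length cap is hit."""
    answer = ""
    for _ in range(max_tokens):
        # generate_token is hypothetical: it takes the text so far
        # and returns the model's next token as a string.
        token = generate_token(prompt + answer)
        if stop in token:
            break
        answer += token
    return answer.strip()

# Usage: complete(FEW_SHOT_PROMPT, generate_token=my_model_fn)
# For the prompt above, a well-trained model should return "Europe".
```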