Daniel Kokotajlo comments on Fun with +12 OOMs of Compute

Daniel Kokotajlo 4 Mar 2021 21:13 UTC
LW: 2 AF: 1
[5] One might worry that the original paper had a biased sample of tasks. I do in fact worry about this. However, this paper tests GPT-3 on a sample of actual standardized tests used for admission to colleges, grad schools, etc., and GPT-3 exhibits similar performance there (around 50% correct) while also showing radical improvement over smaller versions of GPT.