Yeah, that’s part of why I’m suspicious. I remember the original OA finetuning as being quite expensive, but the current one is not that expensive. If a GPT-3 is like 100GB of weights, say, after optimization, and it’s doing true finetuning, how is OA making it so cheap and so low-latency?
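The arithmetic behind that suspicion can be sketched out. Every number below is an illustrative assumption (customer count, storage price, load bandwidth are made up; only the ~100GB figure comes from the comment above), but it shows why serving one full private copy of the weights per finetuned customer looks expensive and slow:

```python
# Back-of-envelope: cost/latency of serving "true" finetunes of a ~100 GB model.
# All figures are illustrative assumptions, not OpenAI's actual numbers.

weights_gb = 100             # assumed size of an optimized GPT-3 checkpoint
customers = 1_000            # hypothetical number of hosted finetuned models
storage_per_gb_month = 0.02  # assumed $/GB-month of storage

# Full finetuning yields a complete private copy of the weights per customer.
full_copies_gb = weights_gb * customers
monthly_storage_cost = full_copies_gb * storage_per_gb_month

# Cold-start latency: time to pull one 100 GB copy into GPU memory.
link_gb_per_s = 10           # assumed effective load bandwidth, GB/s
cold_start_s = weights_gb / link_gb_per_s

print(f"storage: {full_copies_gb:,} GB ~= ${monthly_storage_cost:,.0f}/month")
print(f"cold start per model swap: ~{cold_start_s:.0f} s")
```

Under those assumptions, 1,000 finetunes mean 100 TB of weight copies and a ~10-second load every time a rarely-used model is swapped in, which is hard to square with a cheap, low-latency product if the finetuning really touches all the weights.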