My guess based on how fast GPT-3 feels is that one OOM bigger will lead to noticeable latency (you’ll have to watch as the text appears on your screen, just like it does when you type) and that an OOM on top of that will result in annoying latency (you’ll press GO and then go browse other tabs while you wait for it to finish).
My guess based on how fast GPT-3 feels is that one OOM bigger will lead to noticeable latency (you’ll have to watch as the text appears on your screen, just like it does when you type) and that an OOM on top of that will result in annoying latency (you’ll press GO and then go browse other tabs while you wait for it to finish).
That sounds reasonable, GPT-4, or 5, may be latency constrained for real time applications. They may even have multiple variants for either case.