current AIs are not thinking faster than humans [...] GPT-4 has higher token latency than GPT-3.5, but I think it’s fair to say that GPT-4 is the model that “thinks faster”
This notion of thinking speed depends on the difficulty of a task. If one of the systems can’t solve a problem at all, it’s neither faster nor slower. If both systems can solve a problem, we can compare the time they take. In that sense, current LLMs are 1-2 OOMs faster than humans at the tasks both can solve, and much cheaper.
Old chess AIs were slower than humans good at chess. If future AIs can take advantage of search to improve quality, they might again get slower than humans at sufficiently difficult tasks, while simultaneously being faster than humans at easier tasks.
Sure, but in that case I would not say the AI thinks faster than humans, I would say the AI is faster than humans at a specific range of tasks where the AI can do those tasks in a “reasonable” amount of time.
As I’ve said elsewhere, there is a quality or breadth vs serial speed tradeoff in ML systems: a system that only does one narrow and simple task can do that task at a high serial speed, but as you make systems more general and get them to handle more complex tasks, serial speed tends to fall. The same logic that people are using to claim GPT-4 thinks faster than humans should also lead them to think a calculator thinks faster than GPT-4, which is an unproductive way to use the one-dimensional abstraction of “thinking faster vs. slower”.
You might ask “Well, why use that abstraction at all? Why not talk about how fast the AIs can do specific tasks instead of trying to come up with some general notion of if their thinking is faster or slower?” I think a big reason is that people typically claim the faster “cognitive speed” of AIs can have impacts such as “accelerating the pace of history”, and I’m trying to argue that the case for such an effect is not as trivial to make as some people seem to think.
This notion of thinking speed makes sense for large classes of tasks, not just specific tasks. And a natural class of tasks to focus on is the harder tasks among all the tasks both systems can solve.
So in this sense a calculator is indeed much faster than GPT-4, and GPT-4 is 2 OOMs faster than humans. An autonomous research AGI is capable of autonomous research, so its speed can be compared to humans at that class of tasks.
AI accelerates the pace of history only when it’s capable of making the same kind of progress as humans in advancing history, at which point we need to compare their speed to that of humans at that activity (class of tasks). Currently AIs are not capable of that at all. If hypothetically 1e28 training FLOPs LLMs become capable of autonomous research (with scaffolding that doesn’t incur too much latency overhead), we can expect that they’ll be 1-2 OOMs faster than humans, because we know how they work. Thus it makes sense to claim that 1e28 FLOPs LLMs will accelerate history if they can do research autonomously. If AIs need to rely on extensive search on top of LLMs to get there, or if they can’t do it at all, we can instead predict that they don’t accelerate history, again based on what we know of how they work.
This notion of thinking speed depends on the difficulty of a task. If one of the systems can’t solve a problem at all, it’s neither faster nor slower. If both systems can solve a problem, we can compare the time they take. In that sense, current LLMs are 1-2 OOMs faster than humans at the tasks both can solve, and much cheaper.
Old chess AIs were slower than humans good at chess. If future AIs can take advantage of search to improve quality, they might again get slower than humans at sufficiently difficult tasks, while simultaneously being faster than humans at easier tasks.
Sure, but in that case I would not say the AI thinks faster than humans, I would say the AI is faster than humans at a specific range of tasks where the AI can do those tasks in a “reasonable” amount of time.
As I’ve said elsewhere, there is a quality or breadth vs serial speed tradeoff in ML systems: a system that only does one narrow and simple task can do that task at a high serial speed, but as you make systems more general and get them to handle more complex tasks, serial speed tends to fall. The same logic that people are using to claim GPT-4 thinks faster than humans should also lead them to think a calculator thinks faster than GPT-4, which is an unproductive way to use the one-dimensional abstraction of “thinking faster vs. slower”.
You might ask “Well, why use that abstraction at all? Why not talk about how fast the AIs can do specific tasks instead of trying to come up with some general notion of if their thinking is faster or slower?” I think a big reason is that people typically claim the faster “cognitive speed” of AIs can have impacts such as “accelerating the pace of history”, and I’m trying to argue that the case for such an effect is not as trivial to make as some people seem to think.
This notion of thinking speed makes sense for large classes of tasks, not just specific tasks. And a natural class of tasks to focus on is the harder tasks among all the tasks both systems can solve.
So in this sense a calculator is indeed much faster than GPT-4, and GPT-4 is 2 OOMs faster than humans. An autonomous research AGI is capable of autonomous research, so its speed can be compared to humans at that class of tasks.
AI accelerates the pace of history only when it’s capable of making the same kind of progress as humans in advancing history, at which point we need to compare their speed to that of humans at that activity (class of tasks). Currently AIs are not capable of that at all. If hypothetically 1e28 training FLOPs LLMs become capable of autonomous research (with scaffolding that doesn’t incur too much latency overhead), we can expect that they’ll be 1-2 OOMs faster than humans, because we know how they work. Thus it makes sense to claim that 1e28 FLOPs LLMs will accelerate history if they can do research autonomously. If AIs need to rely on extensive search on top of LLMs to get there, or if they can’t do it at all, we can instead predict that they don’t accelerate history, again based on what we know of how they work.