Second, there is a tradeoff between “thinking speed” and “thinking quality”. There’s no fundamental reason, as far as I can tell, that the tradeoff favors running minds at speeds way faster than human subjective times. Indeed, GPT-4 seems to run significantly subjectively slower in terms of tokens processed per second compared to GPT-3.5. And there seems to be a broad trend here towards something resembling human subjective speeds.
This reasoning seems extremely unlikely to hold deep into the singularity for any reasonable notion of subjective speed.
Deep in the singularity we expect economic doubling times of weeks. This will likely involve designing and building physical structures at extremely rapid speeds such that baseline processing will need to be way, way faster.
Are there any short-term predictions that your model makes here? For example do you expect tokens processed per second will start trending substantially up at some point in future multimodal models?
My main prediction would be that for various applications, people will considerably prefer models that generate tokens faster, including much faster than humans. And, there will be many applications where speed is prefered over quality.
I might try to think of some precise predictions later.
If the claim is about whether AI latency will be high for “various applications” then I agree. We already have some applications, such as integer arithmetic, where speed is optimized heavily, and computers can do it much faster than humans.
In context, it sounded like you were referring to tasks like automating a CEO, or physical construction work. In these cases, it seems likely to me that quality will be generally preferred over speed, and sequential processing times for AIs automating these tasks will not vastly exceed that of humans (more precisely, something like >2 OOM faster). Indeed, for some highly important tasks that future superintelligences automate, sequential processing times may even be lower for AIs compared to humans, because decision-making quality will just be that important.
I was refering to tasks like automating a CEO or construction work. I was just trying to think of the most relevant and easy to measure short term predictions (if there are already AI CEOs then the world is already pretty crazy).
The main thing here is that as models become more capable and general in the near-term future, I expect there will be intense demand for models that can solve ever larger and more complex problems. For these models, people will be willing to pay the costs of high latency, given the benefit of increased quality. We’ve already seen this in the way people prefer GPT-4 to GPT-3.5 in a large fraction of cases (for me, a majority of cases).
I expect this trend will continue into the foreseeable future until at least the period slightly after we’ve automated most human labor, and potentially into the very long-run too depending on physical constraints. I am not sufficiently educated about physical constraints here to predict what will happen “deep into the singularity”, but it’s important to note that physical constraints can cut both ways here.
To the extent that physics permits extremely useful models by virtue of them being very large and capable, you should expect people to optimize heavily for that despite the cost in terms of latency. By contrast, to the extent physics permits extremely useful models by virtue of them being very fast, then you should expect people to optimize heavily for that despite the cost in terms of quality. The balance that we strike here is not a simple function of how far we are from some abstract physical limit, but instead a function of how these physical constraints trade off against each other.
There is definitely a conceivable world in which the correct balance still favors much-faster-than-human-level latency, but it’s not clear to me that this is the world we actually live in. My intuitive, random speculative guess is that we live in the world where, for the most complex tasks that bottleneck important economic decision-making, people will optimize heavily for model quality at the cost of latency until settling on something within 1-2 OOMs of human-level latency.
Thanks for the clarification.
I think my main crux is:
This reasoning seems extremely unlikely to hold deep into the singularity for any reasonable notion of subjective speed.
Deep in the singularity we expect economic doubling times of weeks. This will likely involve designing and building physical structures at extremely rapid speeds such that baseline processing will need to be way, way faster.
See also Age of Em.
Are there any short-term predictions that your model makes here? For example do you expect tokens processed per second will start trending substantially up at some point in future multimodal models?
My main prediction would be that for various applications, people will considerably prefer models that generate tokens faster, including much faster than humans. And, there will be many applications where speed is prefered over quality.
I might try to think of some precise predictions later.
If the claim is about whether AI latency will be high for “various applications” then I agree. We already have some applications, such as integer arithmetic, where speed is optimized heavily, and computers can do it much faster than humans.
In context, it sounded like you were referring to tasks like automating a CEO, or physical construction work. In these cases, it seems likely to me that quality will be generally preferred over speed, and sequential processing times for AIs automating these tasks will not vastly exceed that of humans (more precisely, something like >2 OOM faster). Indeed, for some highly important tasks that future superintelligences automate, sequential processing times may even be lower for AIs compared to humans, because decision-making quality will just be that important.
I was refering to tasks like automating a CEO or construction work. I was just trying to think of the most relevant and easy to measure short term predictions (if there are already AI CEOs then the world is already pretty crazy).
The main thing here is that as models become more capable and general in the near-term future, I expect there will be intense demand for models that can solve ever larger and more complex problems. For these models, people will be willing to pay the costs of high latency, given the benefit of increased quality. We’ve already seen this in the way people prefer GPT-4 to GPT-3.5 in a large fraction of cases (for me, a majority of cases).
I expect this trend will continue into the foreseeable future until at least the period slightly after we’ve automated most human labor, and potentially into the very long-run too depending on physical constraints. I am not sufficiently educated about physical constraints here to predict what will happen “deep into the singularity”, but it’s important to note that physical constraints can cut both ways here.
To the extent that physics permits extremely useful models by virtue of them being very large and capable, you should expect people to optimize heavily for that despite the cost in terms of latency. By contrast, to the extent physics permits extremely useful models by virtue of them being very fast, then you should expect people to optimize heavily for that despite the cost in terms of quality. The balance that we strike here is not a simple function of how far we are from some abstract physical limit, but instead a function of how these physical constraints trade off against each other.
There is definitely a conceivable world in which the correct balance still favors much-faster-than-human-level latency, but it’s not clear to me that this is the world we actually live in. My intuitive, random speculative guess is that we live in the world where, for the most complex tasks that bottleneck important economic decision-making, people will optimize heavily for model quality at the cost of latency until settling on something within 1-2 OOMs of human-level latency.