Sounds like we are basically on the same page!
Re: your question:
Compute is a very important input, important enough that it makes sense IMO to use it as the currency by which we measure the other inputs (this is basically what Bio Anchors + Tom’s model do).
There is a question of whether we’ll be bottlenecked on it in a way that throttles takeoff; having AGI may not matter much if the only way to get AGI+ is to wait for another, even bigger training run to complete.
I think in some sense we will indeed be bottlenecked by compute during takeoff… but that nevertheless we’ll be going something like 10x to 1000x faster than we currently go, because labor can substitute for compute to some extent (not so much if it’s going at 1x speed, but very much if it’s going at 10x or 100x speed) and we’ll have a LOT of sped-up labor. Like, I do a little exercise where I think about what my coworkers are doing and imagine what would happen if they had access to AGI that was exactly as good as they are at everything, only 100x faster. I feel like they’d make progress on their current research agendas about 10x as fast. Could be a bit less, could be a lot more. Especially once we start getting qualitative intelligence improvements over typical OAI researchers, it could be a LOT more, because in scientific research there seem to be HUGE returns to quality: the smartest geniuses seem to accomplish more in a year than 90th-percentile scientists accomplish in their lifetimes.
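To make the "100x faster labor, roughly 10x faster progress" exercise concrete, here is a toy sketch. It assumes a Cobb-Douglas form (progress rate proportional to labor^a × compute^(1−a)), which is purely an illustrative assumption and not something Bio Anchors or Tom’s model is claimed to use here; the function names and the `labor_share` parameter are hypothetical. Under that assumption, the 100x-to-10x intuition corresponds to a labor share of about 0.5 with compute held fixed.

```python
import math


def progress_speedup(labor_speedup, compute_multiplier=1.0, labor_share=0.5):
    """Toy Cobb-Douglas model (an assumption, not the comment's actual model).

    Progress rate is taken to be proportional to
    labor ** labor_share * compute ** (1 - labor_share),
    so the returned value is the speedup relative to today's rate.
    """
    return (labor_speedup ** labor_share) * (
        compute_multiplier ** (1.0 - labor_share)
    )


def implied_labor_share(labor_speedup, observed_speedup):
    """Back out the labor share implied by a guess like '100x labor -> 10x progress',
    holding compute fixed."""
    return math.log(observed_speedup) / math.log(labor_speedup)


if __name__ == "__main__":
    # Compute held fixed, labor sped up by various factors.
    for speed in (1, 10, 100):
        print(speed, progress_speedup(speed))
    # The labor share implied by the 100x -> 10x intuition.
    print(implied_labor_share(100, 10))
```

A lower assumed `labor_share` means compute bites harder and the same 100x labor speedup buys less progress; a higher one pushes the result toward the "could be a lot more" end.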
Training data also might be a bottleneck. However, I think that by the time we are about to hit AGI and/or have just hit AGI, it won’t be. Smart humans are able to generate their own training data, so to speak; the entire field of mathematics is a bunch of people talking to each other, iteratively adding proofs to the blockchain, so to speak, and learning from each other’s proofs. That’s just an example, I think, of how around AGI we should basically have a self-sustaining civilization of AGIs talking to each other, evaluating each other’s outputs, and learning from them. And this is just one of several ways in which the training data bottleneck could be overcome. Another is better algorithms that are more data-efficient. The human brain seems to be more data-efficient than modern LLMs, for example. Maybe we can figure out how it manages that.