Transformative: Which of these do you agree with and when do you think this might happen?
For some timelines, see my other comment; they aren’t specifically about the definitions you list here, but my error bars on timelines are huge anyway, so I don’t think I’ll try to write down separate ones for different definitions.
Compared to definitions 2. and 3., I might be more bullish on AIs having pretty big effects even if they can “only” automate tasks that would take human experts a few days (without intermediate human feedback). A key uncertainty I have though is how much of a bottleneck human supervision time and quality would be in this case. E.g. could many of the developers who’re currently writing a lot of code just transition to reviewing code and giving high-level instructions full-time, or would there just be a senior management bottleneck and you can’t actually use the AIs all that effectively? My very rough guess is you can pretty easily get a 10x speedup in software engineering, maybe more. And maybe something similar in ML research though compute might be an additional important bottleneck there (including walltime until experiments finish). If it’s “only” 10x, then arguably that’s just mildly transformative, but if it happens across a lot of domains at once it’s still a huge deal.
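To make the supervision bottleneck a bit more concrete, here is a rough Amdahl's-law-style sketch. The split into an "automatable" share and a human-speed share, the function name `overall_speedup`, and all the numbers are illustrative assumptions, not estimates of how software engineering actually breaks down.

```python
def overall_speedup(automatable_fraction: float, ai_speedup: float) -> float:
    """Amdahl's-law-style estimate of end-to-end speedup.

    automatable_fraction: share of total time the AI takes over (assumed).
    ai_speedup: how much faster the AI does that share (assumed).
    The remaining share (review, high-level instructions) stays at human speed.
    """
    if not 0.0 <= automatable_fraction <= 1.0:
        raise ValueError("automatable_fraction must be in [0, 1]")
    if ai_speedup <= 0:
        raise ValueError("ai_speedup must be positive")
    human_share = 1.0 - automatable_fraction
    return 1.0 / (human_share + automatable_fraction / ai_speedup)


if __name__ == "__main__":
    # Hypothetical scenario: the AI is 100x faster on whatever it can take over.
    for frac in (0.5, 0.9, 0.95, 0.99):
        speedup = overall_speedup(frac, 100.0)
        print(f"automatable={frac:.2f} -> overall speedup ~{speedup:.1f}x")
```

Under these made-up numbers, getting to roughly 10x overall requires handing off a bit over 90% of the work even when the AI is 100x faster on that part, because the human-speed remainder dominates. That is the sense in which supervision time could end up being the binding constraint.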
I think whether robotics are really good or not matters, but I don’t think it’s crucial (e.g. I’d be happy to call definition 1. “transformative”).
The combination of 5a and 5b obviously seems important (since it determines whether you can finance ever bigger training runs). But not sure how to use this as a definition of “transformative”; right now 5a is clearly already met, and on long enough time scales, 5b also seems easy to meet right now (OpenAI might even already have broken even on GPT-4, not sure off the top of my head).
Also, how much compute do you think an AGI or superintelligence will require at inference time initially? What is a reasonable level of optimization? Do you agree that many doom scenarios require it to be possible for an AGI to compress to fit on very small host PCs? Is this plausible? (e.g. can a single 2070 8GB host a model with general human intelligence at human scale speeds and vision processing and robotics proprioception and control...?)
I don’t see why you need to run AGI on a single 2070 for many doom scenarios. I do agree that if AGI can only run on a specific giant data center, that makes many forms of doom less likely. But in the current paradigm, training compute scales roughly as the square of per-token inference compute, so as models are scaled, I think inference should become cheaper relative to training. (And even now, SOTA models could be run on relatively modest compute clusters, though maybe not consumer hardware.)
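A back-of-the-envelope sketch of the scaling point, using common rules of thumb rather than anything precise: training FLOPs ≈ 6·N·D for N parameters and D training tokens, a compute-optimal D ≈ 20·N, and roughly 2·N FLOPs per generated token at inference. All three constants and the function names are assumptions for illustration; real models deviate from them.

```python
import math

TOKENS_PER_PARAM = 20.0  # assumed compute-optimal data/parameter ratio (rule of thumb)


def training_flops(n_params: float) -> float:
    """Approximate training cost: 6 * N * D, with D = TOKENS_PER_PARAM * N (assumed)."""
    return 6.0 * n_params * (TOKENS_PER_PARAM * n_params)


def inference_flops_per_token(n_params: float) -> float:
    """Approximate forward-pass cost per generated token: 2 * N (assumed)."""
    return 2.0 * n_params


def tokens_per_training_budget(n_params: float) -> float:
    """How many tokens could be generated for the cost of one training run."""
    return training_flops(n_params) / inference_flops_per_token(n_params)


if __name__ == "__main__":
    for n in (1e9, 1e10, 1e11, 1e12):
        print(
            f"N={n:.0e}: train ~{training_flops(n):.1e} FLOPs, "
            f"inference ~{inference_flops_per_token(n):.1e} FLOPs/token, "
            f"ratio ~{tokens_per_training_budget(n):.1e} tokens"
        )
```

Under these assumptions, training cost grows like N² while per-token inference cost grows like N, so the number of tokens you can generate for the price of one training run grows linearly with model size. That is the sense in which inference gets cheaper relative to training as models scale.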
In terms of the absolute level of inference compute needed, I could see a single 2070 being enough in the limit of optimal algorithms, but naturally I’d expect we’ll first have AGI that can automate a lot of things if run with way more compute than that, and then I expect it would take a while to get it down this much. Though even if we’re asking whether AGI can run on consumer-level hardware, a single 2070 seems pretty low (e.g. seems like a 4090 already has 5.5x as many FLOP/s as a 2070, and presumably we’ll have more in the future).
with general human intelligence at human scale speeds and vision processing and robotics proprioception and control...
Like I mentioned above, I don’t think robotics are absolutely crucial, and especially if you’re specifically optimizing for running under heavy resource constraints, you might want to just not bother with that.