I don’t love “smooth” vs “sharp” because these words don’t naturally point at what seems to me to be the key concept: the duration from the first AI capable of being transformatively useful to the first system which is very qualitatively generally superhuman[1]. You can have a “smooth” takeoff driven purely by scaling things up where this duration is short or nonexistent.
I also care a lot about the duration from AIs which are capable enough to 3x R&D labor to AIs which are capable enough to strictly dominate (and thus obsolete) top human scientists but which aren’t necessarily much smarter. (I also care some about the duration between a bunch of different milestones, and I’m not sure that my operationalizations of the milestones are the best ones.)
Paul originally operationalized this as seeing an economic doubling over 4 years prior to a doubling within a year, but I’d prefer for now to talk about qualitative level of capabilities rather than also entangling questions about how AI will affect the world[2].
So, I’m tempted by “long duration” vs “short duration” takeoff, though this is pretty clumsy.
Really, there are a bunch of different distinctions we care about with respect to takeoff and the progress of AI capabilities:
As discussed above, the duration from the first transformatively useful AIs to AIs which are generally superhuman. (And from very useful AIs to top-human-scientist-level AIs.)
The duration from huge impacts on the world from AI (e.g. much higher GDP growth) to very superhuman AIs. This is like the above, but also folds in economic effects and other effects on the world at large, which could come apart from AI capabilities even if there is a long-duration takeoff in terms of capabilities.
Software-only singularity. How much the singularity is downstream of AIs working on hardware (and energy) vs just software. (Or whether something well described as a singularity even happens.)
Smoothness of AI progress vs jumpiness. As in, is progress driven by a larger number of smaller innovations and/or continuous scale-ups, rather than being substantially driven by a small number of innovations and/or large phase changes that emerge with scale?
Predictability of AI progress. Even if AI progress is smooth in the sense of the prior bullet, it may not follow a very predictable trend if the rate of innovations or scaling varies a lot.
Tunability of AI capability. Is it possible to get a full sweep of models which continuously interpolates over a range of capabilities?[3]
Of course, these properties are quite correlated. For instance, if the relevant durations for the first bullet are very short, then I also don’t expect economic impacts until AIs are much smarter. And, if the singularity requires AIs working on increasing available hardware (software only doesn’t work or doesn’t go very far), then you expect more economic impact and more delay.
[1] One could think that there will be no delay between these points, though I personally think this is unlikely.
[2] In short timelines, with a software-only intelligence explosion, and with relevant actors not intentionally slowing down, I think I don’t expect huge global GDP growth (e.g. a 25% annualized global GDP growth rate) prior to very superhuman AI. I’m not very confident in this, but I think both inference availability and takeoff duration point to this.
[3] This is a very weak property, though I think some people are skeptical of this.
A thing that didn’t appear on your list, and which I think is pretty important (cruxy for a lot of discussions; closest to what Hanson meant in the FOOM debate), is “human-relative discontinuity/speed”. Here the question is something like: “how much faster does AI get smarter, compared to humans?”. There’s conceptual confusion / talking past each other in part because one aspect of the debate is:
how much locking force there is between AI and humans (e.g. humans can learn from AIs teaching them, can learn from AI’s internals, can use AIs, and humans share ideas with other humans about AI (this was what Hanson argued))
and the other aspect is
how fast does an intelligence explosion go, by the stars (sidereal).
If you think there’s not much coupling, then sidereal speed is the crux about whether takeoff will look discontinuous. But if you think there’s a lot of coupling, then you might think something else is a crux about continuity, e.g. “how big are the biggest atomic jumps in capability”.
What does this cache out to in terms of what terms you think make sense?
Not sure I understand your question. If you mean just what I think is the case about FOOM:
Obviously, there’s no strong reason humans will stay coupled with an AGI. The AGI’s thoughts will be highly alien—that’s kinda the point.
Obviously, new ways of thinking recursively beget powerful new ways of thinking. This is obvious from the history of thinking and from introspection. And obviously this goes faster and faster. And obviously will go much faster in an AGI.
Therefore, from our perspective, there will be a fast-and-sharp FOOM.
But I don’t really know what to think about Christiano-slow takeoff.
I.e. a 4-year GDP doubling before a 1-year GDP doubling.
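For concreteness, here is a minimal back-of-the-envelope conversion between the doubling times discussed in this thread and annualized growth rates (plus the 25% annualized growth figure from the footnote earlier in the thread); this is just compound-growth arithmetic, not a claim about any particular takeoff model.

```python
import math

def doubling_time_to_annual_rate(years_to_double: float) -> float:
    """Annualized growth rate implied by doubling once every `years_to_double` years."""
    return 2 ** (1 / years_to_double) - 1

def annual_rate_to_doubling_time(rate: float) -> float:
    """Years to double at a fixed annualized growth rate (e.g. 0.25 for 25%)."""
    return math.log(2) / math.log(1 + rate)

# Paul's operationalization: a 4-year GDP doubling before a 1-year GDP doubling.
print(f"4-year doubling ~= {doubling_time_to_annual_rate(4):.0%}/year growth")  # ~19%/year
print(f"1-year doubling ~= {doubling_time_to_annual_rate(1):.0%}/year growth")  # 100%/year

# The 25% annualized global GDP growth figure from the footnote above.
print(f"25%/year growth ~= doubling every {annual_rate_to_doubling_time(0.25):.1f} years")  # ~3.1 years
```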
I think Christiano agrees that there will later be a sharp/fast/discontinuous(??) FOOM, but he thinks things will get really weird and fast before that point. To me this is vaguely in the genre of trying to predict whether you can usefully get nuclear power out of a pile without setting off a massive explosion, when you’ve only heard conceptually about the idea of nuclear decay. But I imagine Christiano actually did some BOTECs to get the numbers “4” and “1”.
If I were to guess at where I’d disagree with Christiano: Maybe he thinks that in the slow part of the slow takeoff, humans can make a bunch of progress on aligning / interfacing with / getting work out of AI stuff, to such an extent that from those future humans’ perspectives, the fast part of the slow takeoff will actually be slow, in the relative sense. In other words, if the fast part came today, it would be fast, but if it came later, it would be slow, because we’d be able to keep up. Whereas I think aligning/interfacing, in the part where it counts, is crazy hard, and doesn’t especially have to be coupled with nascent-AGI-driven capabilities advances. A lot of Christiano’s work has (explicitly) a strategy-stealing flavor: if capability X exists, then we / an aligned thingy should be able to steal the way to do X and do it alignedly. If you think you can do that, then it makes sense to think that our understanding will be coupled with AGI’s understanding.
I meant “do you think it’s good, bad, or neutral that people use the phrases ‘slow’/‘fast’ takeoff? And, if bad, what do you wish people did instead in those sentences?”
Depends on context; I guess by raw biomass, it’s bad because those phrases would probably indicate that people aren’t really thinking and they should taboo those phrases and ask why they wanted to discuss them? But if that’s the case and they haven’t already done that, maybe there’s a more important underlying problem, such as Sinclair’s razor.
I think “long duration” is way too many syllables, and I have similar problems with this naming schema as with Fast/Slow, but if you were going to go with this naming schema, I think just saying “short takeoff” and “long takeoff” seems about as clear (“duration” is implied IMO).
I’m not sure I buy the distinction mattering?
Here are a few worlds:
Smooth takeoff to superintelligence via scaling the whole way, no RSI
Smooth takeoff to superintelligence via a mix of scaling, algorithmic advance, RSI, etc
Smoothish-looking takeoff via scaling (like we currently see), but then suddenly the shape of the curve changes dramatically due to RSI or similar
Smoothish-looking takeoff via scaling like we currently see, and then RSI is the mechanism by which the curve continues, but not very quickly (maybe this implies the curve actively levels off S-curve style before eventually picking up again)
Alt-world where we weren’t even seeing similar types of smoothly advancing AI, and then there’s abrupt RSI takeoff in days or months
Alt-world where we weren’t seeing similar smooth scaling AI, and then RSI is the thing that initiates our current level of growth
At least with the previous way I’d been thinking about things, for the worlds above that look smooth, I feel like “yep, that was a smooth takeoff.”
Or, okay, I thought about it a bit more and maybe agree that “time from first transformatively useful AI to superintelligence” is a key variable. But I also think that variable is captured by saying “smooth takeoff/long timelines” (which is approximately what people are currently saying?).
Hmm, I updated towards being less confident while thinking about this.
You can have a smooth and short takeoff with long timelines. E.g., imagine that scaling works all the way to ASI but requires a ton of bare-metal flop (e.g. 1e34), implying longer timelines, and early transformative AI requires almost as much flop (e.g. 3e33), such that these events are only 1 year apart.
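As a rough sketch of that arithmetic: a 3e33-to-1e34 flop gap is only about a 3.3x difference, so the takeoff duration mostly comes down to how fast effective training compute grows around that point. The ~3.3x/year growth rate below is an illustrative assumption chosen to match the “only 1 year apart” figure, not a number claimed in this thread.

```python
import math

# Hypothetical compute thresholds from the example above (bare-metal training flop).
TRANSFORMATIVE_AI_FLOP = 3e33
ASI_FLOP = 1e34

# Assumed rate at which frontier training compute scales up per year.
# Illustrative only; the actual rate could be quite different.
COMPUTE_GROWTH_PER_YEAR = 3.3

gap = ASI_FLOP / TRANSFORMATIVE_AI_FLOP                    # ~3.3x
years = math.log(gap) / math.log(COMPUTE_GROWTH_PER_YEAR)  # time to cross the gap

print(f"flop gap: {gap:.1f}x, implied takeoff duration: ~{years:.1f} years")  # ~1.0 years
```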
I think we’re pretty likely to see a smooth and short takeoff with ASI prior to 2029. Now, imagine that you were making this exact prediction, up through 2029, back in 2000. From the perspective of 2000, you are exactly predicting a smooth and short takeoff with long timelines!
So, I think this is actually a pretty natural prediction.
For instance, you get this prediction if you think that a scalable paradigm will be found in the future and will scale up to ASI, and that on this scalable paradigm the delay between transformative AI and ASI will be short (either because the flop difference is small or because flop scaling will be pretty rapid at the relevant point, since it is still pretty cheap, perhaps <$100 billion).
I agree with the spirit of what you are saying, but I want to register a desire for “long timelines” to mean “>50 years” or “after 2100”. In public discourse, when I hear Yann LeCun say something like “I have long timelines, by which I mean, no crazy event in the next 5 years”, it’s simply not what people think when they think long timelines, outside of the AI sphere.
“Long takeoff” and “short takeoff” sound strange to me. Maybe because they are too close to “long timelines” and “short timelines”.
Yeah I think the similarity of takeoff and timelines is maybe the real problem.
Like, if “takeoff” weren’t two syllables that start with T, I might be happy with “short/long” being the prefix for both.