I’m reluctant to reply because it sounds like you’re looking for rebuttals from explicit proponents of hard takeoff who have thought a great deal about takeoff speeds, and neither of those descriptions applies to me. But I can sketch some intuitions about why reading the pieces by AI Impacts and by Christiano hasn’t felt wholly convincing to me. (I’ve never run these intuitions past anyone and don’t know whether they resemble the cruxes held by proponents of hard takeoff who are more confident in it than I am – so I hope people don’t update much further against hard takeoff if they find the sketch below unconvincing.) I find it easiest to explain this by gesturing at some loosely related “themes” rather than going through a structured argument, so here are some of those themes; maybe people will see underlying connections between them:
Culture overhang
Shulman and Sandberg have argued that one way to get hard takeoff is via hardware overhang: a new algorithmic insight can be exploited immediately and to its full potential, because far more hardware is available than would have been needed to surpass state-of-the-art performance with the new algorithms. I think there’s a similar dynamic at work with culture. If you placed an AGI into the stone age, it would be inefficient at taking over the world even with appropriately crafted output channels, because stone age tools (which include stone age humans the AGI could manipulate) are neither very useful nor reliable. It would be easier for an AGI to achieve influence in 1995, when the environment contained a greater variety of increasingly far-reaching tools. But with the internet still new, particular strategies for attaining power (or even just rapidly acquiring knowledge) were not yet available. Today, it is arguably easier than ever for an AGI to quickly and more or less single-handedly transform the world.
Snapshot intelligence versus intelligence as learning potential
There’s a sense in which cavemen are about as intelligent as modern-day humans. If we time-traveled back to the stone age, found the couples with the best predictors for having gifted children, gave these couples access to 21st-century nutrition and childbearing assistance, and then took their newborns back into today’s world, where they’d grow up in loving foster families with access to high-quality personalized education, there’s a good chance some of those babies would grow up to be relatively ordinary people of close to average intelligence. Those former(?) cavemen and cavewomen would presumably be capable of dealing with many if not most aspects of contemporary life and modern technology.
However, there’s also a sense in which cavemen are very unintelligent compared to modern-day humans. Culture, education, possibly even things like the Flynn effect – these really do change the way people think and act in the world. Cavemen are profoundly uneducated and untrained in the knowledge and skills that are useful in modern, tool-rich environments.
We can think of this difference as the difference between the snapshot of someone’s intelligence at the peak of their development and their (initial) learning potential. Cavemen and modern-day humans might be relatively close to each other in terms of the latter, but in terms of their abilities at the peak of their personal development, modern humans are much better at achieving goals in tool-rich environments. I sometimes get the impression that proponents of soft takeoff underappreciate this difference when addressing comparisons between, for instance, early humans and chimpanzees (this is just a vague general impression which doesn’t apply to the arguments presented by AI Impacts or by Paul Christiano).
How to make use of culture: The importance of distinguishing good ideas from bad ones
Productive engineers and creative geniuses alike could only have developed their full potential because they picked up useful pieces of insight from other people. But some people cannot tell the difference between high-quality and low-quality information, or make poor use even of high-quality information, reasoning themselves into biased conclusions. An AI system capable of absorbing the entire internet but terrible at telling good ideas from bad ones won’t make much of a splash (at least not in terms of being able to take over the world). But what about an AI system just slightly above some cleverness threshold for adopting an increasingly efficient information diet? Couldn’t it absorb the internet in a highly systematic way rather than soaking in everything indiscriminately, learning many essential meta-skills along the way and improving how it goes about the task of further learning?
Small differences in learning potential have compounded benefits over time
If the child in the chair next to me in fifth grade was slightly more intellectually curious, somewhat more productive, and marginally better disposed to adopt a truth-seeking approach and self-image than I was, this might initially mean only that they score 100% and I score 95% on fifth-grade tests – no big difference. But as time goes on, their productivity gets them to read more books, their intellectual curiosity and good judgment get them to read more unusually useful books, and their cleverness gets them to integrate all this knowledge in better and increasingly creative ways. I’ll reach a point where I’m just sort of skimming things because I’m not motivated enough to understand complicated ideas deeply, whereas they find it rewarding to comprehend everything that gives them a better sense of where to go next on their intellectual journey. By the time we graduate from university, my intellectual skills are mostly useless, while they have technical expertise in several topics, can match or exceed my thinking even in areas I specialized in, and get hired by some leading AI company. The point being: an initially small difference in dispositions becomes almost incomprehensibly vast over time.
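To make the compounding intuition concrete, here’s a toy sketch in Python with entirely made-up numbers (the growth rates and number of periods are arbitrary illustrations, not estimates of anything real): if accumulated knowledge compounds multiplicatively, a per-period growth rate that is only a few percentage points higher leaves one learner several times further ahead after a few decades of periods.

```python
# Toy illustration with made-up numbers: two learners whose knowledge compounds
# multiplicatively each "period" (say, a school year), with growth rates that
# differ only slightly. The absolute numbers are arbitrary; the point is the ratio.

def compound_knowledge(initial: float, growth_rate: float, periods: int) -> float:
    """Accumulated knowledge after `periods`, assuming multiplicative compounding."""
    return initial * (1 + growth_rate) ** periods

me = compound_knowledge(initial=1.0, growth_rate=0.20, periods=40)    # 20% per period
peer = compound_knowledge(initial=1.0, growth_rate=0.25, periods=40)  # 25% per period

print(f"me:   {me:8.1f}")
print(f"peer: {peer:8.1f}")
print(f"ratio: {peer / me:.1f}x")  # a 5-point difference in rate -> roughly a 5x gap after 40 periods
```

Real learning obviously doesn’t compound this cleanly; the sketch only shows that small dispositional differences and enormous eventual gaps are perfectly compatible.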
Knowing how to learn strategically: A candidate for secret sauce??
(I realized that in this title/paragraph, the word “knowing” is meant both in the sense of “knowing how to do x” and in the sense of “being capable of executing x very well.” It might be useful to disentangle this further.) The standard AI foom narrative sounds a bit unrealistic when discussed in terms of an AI system inspecting itself and remodeling its inner architecture in a very deliberate way, driven by architectural self-understanding. But what about the framing of being good at learning how to learn? There’s at least a plausible-sounding story we can tell in which such an ability qualifies as the “secret sauce” that gives rise to a discontinuity in the returns of increased AI capabilities. In humans – and admittedly this might be too anthropomorphic – I’d think about it this way: If my 12-year-old self had been brain-uploaded into a suitable virtual reality, copied many times over, and given the task of devouring the entire internet in 1,000 years of subjective time (with no aging) to acquire enough knowledge and skill to produce novel intellectual contributions useful to the world, the result probably wouldn’t be much of a success. If we imagine the same with my 19-year-old self, there’s a high chance the result wouldn’t be useful either – but also some chance it would be extremely useful. Assuming, for the sake of the comparison, that a copy clan of 19-year-olds can produce highly beneficial research outputs this way and a copy clan of 12-year-olds can’t, what does the landscape in between look like? I don’t find it evident that the in-between is gradual. I think it’s at least plausible that there’s a jump once the copies reach a level of intellectual maturity at which they can make plans that are flexible enough at the meta-level, divide labor sensibly, and stay open to reassessing their approach as time goes on and they learn new things. Maybe all of that is gradual, and there are degrees of dividing labor sensibly or of staying open to reassessing one’s approach – but that doesn’t seem evident to me. Maybe this works more as an on/off thing.
How could natural selection produce on/off abilities?
It makes sense to be somewhat suspicious of any hypothesis according to which the evolution of general intelligence made a radical jump in Homo sapiens, creating thinking that is “discontinuous” from what came before. If knowing how to learn is an on/off ability that plays a vital role in the ways I described above, how could it evolve?
We’re certainly also talking about culture here, not just genes. And via the Baldwin effect, natural selection can move individuals closer to picking up surprisingly complex strategies via learning from their environment. At this point, at the latest, my thinking becomes highly speculative. But here’s one hypothesis: in its generalization, this effect is about learning how to learn. And maybe there is something like a “broad basin of attraction” (inspired by Christiano’s broad basin of attraction for corrigibility) for robustly good reasoning / knowing how to learn. Picking up some of the right ideas early on, combined with being good at picking up things in general, gives people an increasingly better sense of how to order and structure other ideas, and over time the best human learners come to resemble each other more and more, having homed in on the best general strategies.
The mediocre success of self-improvement literature
For most people, the returns from self-improvement literature (by which I mean not just productivity advice, but also information on “how to be more rational,” etc.) are somewhat positive but rarely life-changing. People don’t tend to “go foom” from reading self-improvement advice. Why is that, and how does it square with my hypothesis above that “knowing how to learn” could be a highly valuable skill with potentially huge compounding benefits? Maybe the answer is that the bottleneck is rarely knowledge about self-improvement, but rather the ability to make good use of such knowledge. This would support the hypothesis mentioned above: if the critical skill is finding useful information in a massive sea of both useful and not-so-useful information, that doesn’t necessarily mean people would get better at that skill if we gave them curated access to highly useful information (even if it’s information about how to find useful information, i.e., good self-improvement advice). Maybe humans don’t tend to go foom after receiving humanity’s best self-improvement advice because too much of it is too obvious for people who were already unusually gifted and grew up in modern society, where they could observe and learn from other people and their habits. However, now imagine someone who had never read any self-improvement advice and could never observe others. For that person, we might have more reason to expect them to go foom – at least relative to their previous baseline – after reading curated advice on self-improvement (or, if it’s true that self-improvement literature is often somewhat redundant, even just from joining an environment where they can observe and learn from other people and from society). And maybe that’s the situation in which the first AI system above a certain critical capabilities threshold finds itself. The threshold I mean is (something like) the ability to figure out how to learn quickly enough to then approach the information on the internet like the hypothetical 19-year-olds (as opposed to the 12-year-olds) from the thought experiment above.
---
Hard takeoff without a discontinuity
(This argument is separate from all the arguments above.) Here’s something I never really understood about the framing of the hard vs. soft takeoff discussion. Imagine a graph with inputs such as algorithmic insights and compute/hardware on the x-axis, and general intelligence (for my purposes it doesn’t matter whether we use learning potential or snapshot intelligence) on the y-axis. Typically, the framing is that proponents of hard takeoff believe this graph contains a discontinuity where the growth mode changes and the returns (for inputs such as compute) suddenly become vastly higher than the outside view would have predicted, meaning the graph jumps upward on the y-axis. But what about hard takeoff without such a discontinuity? If the graph becomes steep enough at the point where AI systems reach human-level research capabilities and beyond, that could in itself allow for a hard (or “quasi-hard”) takeoff. After all, we are not going to be sampling points from that curve (in the sense of deploying cutting-edge AI systems) every day – that simply wouldn’t work logistically, even granting all the pressures to be cutting-edge competitive. If we assume that we only sample points from the curve every two months, for instance, is it possible that for whatever increase in compute and algorithmic insights we’d get in those two months, the differential on the y-axis (some measure of general intelligence) could be vast enough to allow for attaining a decisive strategic advantage (DSA) from being first? I don’t have strong intuitions about just how strongly the offense-defense balance will shift once we are close to AGI, but it seems at least plausible that it tilts a lot more towards offense, in which case arguably a smaller differential is needed for attaining a DSA. In addition, based on the classical arguments put forward by researchers such as Bostrom and Yudkowsky, it also seems at least plausible to me that we are dealing with a curve that is very steep around the human level. So, if one AGI project is two months ahead of another, and we assume for the sake of argument that there are no inherent discontinuities in the graph in question, it’s still not evident to me that this couldn’t lead to something that very much looks like hard takeoff, just without an underlying discontinuity in the graph.
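To illustrate the sampling point, here’s a minimal sketch in Python with made-up numbers and a made-up functional form (a logistic curve; nothing in the takeoff literature commits to this shape): capability is modeled as a perfectly smooth function of time, with no discontinuity anywhere, yet two deployments two months apart near the steep part of the curve differ by a large amount.

```python
import math

# Minimal sketch with made-up numbers: capability as a smooth (logistic)
# function of calendar time, standing in for cumulative inputs such as compute
# and algorithmic insights. There is no discontinuity anywhere in this curve;
# it is simply steep around the "human-level" region (placed at month 36 here).

def capability(t_months: float) -> float:
    """Smooth logistic curve on a 0-100 scale; steepest around month 36."""
    return 100.0 / (1.0 + math.exp(-0.5 * (t_months - 36.0)))

# Projects only "sample the curve" (deploy cutting-edge systems) every two months.
for t in range(28, 45, 2):
    print(f"month {t}: capability {capability(t):5.1f}")

# Near the steep region, a project two months ahead of a rival sees a large
# capability differential even though the underlying curve is continuous:
print(f"gap between month 36 and month 38: {capability(38) - capability(36):.1f}")
```

Whether the real curve is anywhere near this steep around the human level is of course exactly what’s in dispute; the sketch only shows that “hard takeoff” and “discontinuity in the underlying graph” can come apart once deployment happens in discrete steps.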