A bunch of this was frustrating to read because it seemed like Paul was yelling “we should model continuous changes!” and Eliezer was yelling “we should model discrete events!” and these were treated as counter-arguments to each other.
It seems obvious from having read about dynamical systems that continuous models still have discrete phase changes. E.g. consider boiling water. As you put in energy, the temperature increases until it reaches the boiling point; past that point, energy put in doesn’t increase the temperature further (for a while) but instead converts more of the water to steam; after all the water is converted to steam, energy put in increases the temperature again.
So there are discrete transitions from (a) energy put in increases water temperature to (b) energy put in converts water to steam to (c) energy put in increases steam temperature.
In the case of AI improving AI vs. humans improving AI, a simple model to make would be one where AI quality is modeled as a variable, a, with the following dynamical equation:
da/dt = h + ra
where h is the speed at which humans improve AI and r is a recursive self-improvement efficiency factor. The curve transitions from a line at early times (where h >> ra) to an exponential at later times (where ra >> h). It could be approximated as a piecewise function with a linear part followed by an exponential part; this is a more-discrete approximation than the original function, which transitions continuously between linear and exponential.
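For concreteness, here is a minimal sketch of this toy model in Python. The parameter values h, r, and a0 are arbitrary illustrative choices, not claims about actual AI progress:

```python
import numpy as np

# Toy model from above: da/dt = h + r*a, with exact solution
# a(t) = (a0 + h/r) * exp(r*t) - h/r.
# Hypothetical parameters, chosen only to show the linear-to-exponential transition.
h, r, a0 = 1.0, 0.05, 0.0

def a(t):
    return (a0 + h / r) * np.exp(r * t) - h / r

# Early on (h >> r*a) growth is roughly linear at rate h;
# later (r*a >> h) it is roughly exponential at rate r.
print(a(1.0) - a(0.0), "~=", h)                      # early slope is about h
print(a(200.0) / a(180.0), "~=", np.exp(r * 20.0))   # late growth factor is about e^(20r)
```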
This is nowhere near an adequate model of AI progress, but it’s the sort of model that would be created in the course of a mathematically competent discourse on this subject on the way to creating an adequate model.
Dynamical systems theory contains many beautiful and useful concepts, like basins of attraction, which make sense of discrete and continuous phenomena simultaneously (i.e. there is a discrete number of basins of attraction into which points fall based on their continuous properties).
I’ve found Strogatz’s book, Nonlinear Dynamics and Chaos, helpful for explaining the basics of dynamical systems.
I don’t really feel like anything you are saying undermines my position here, or defends the part of Eliezer’s picture I’m objecting to.
(ETA: but I agree with you that it’s the right kind of model to be talking about and is good to bring up explicitly in discussion. I think my failure to do so is mostly a failure of communication.)
I usually think about models that show the same kind of phase transition you discuss, though usually significantly more sophisticated ones, with the transition going from exponential to hyperbolic growth (you only get an exponential in your model because of the specific and somewhat implausible functional form for technology in your equation).
With humans alone I expect efficiency to double roughly every year based on the empirical returns curves, though it depends a lot on the trajectory of investment over the coming years. I’ve spent a long time thinking and talking with people about these issues.
At the point when the work is largely done by AI, I expect progress to be maybe 2x faster, so doubling every 6 months. And then from there I expect a roughly hyperbolic trajectory over successive doublings.
If takeoff is fast I still expect it to most likely be through a similar situation, where e.g. total human investment in AI R&D never grows above 1% of the economy, and so at the time takeoff occurs the AI companies are still only 1% of the economy.
Excuse my ignorance, what does a hyperbolic function look like? If an exponential is f(x) = r^x, what is f(x) for a hyperbolic function?
1/(singularity_year – current_year). It’s the solution to the differential equation f′(x) = f(x)^2 instead of f′(x) = f(x). I usually use it more broadly for 1/(singularity_year – current_year)^α, which is the solution to f′(x) = f(x)^(1+1/α).
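A quick numerical sanity check of these forms (a sketch only; the singularity offset T and exponent α below are arbitrary illustrative values):

```python
import numpy as np

T, alpha = 50.0, 2.0                # hypothetical values, just for illustration
t = np.linspace(0.0, 45.0, 2000)    # stop short of the singularity at t = T

f = 1.0 / (T - t)                   # hyperbolic growth
g = 1.0 / (T - t) ** alpha          # the more general form

# f satisfies f' = f^2, and g satisfies g' = alpha * g^(1 + 1/alpha),
# up to finite-difference error in np.gradient.
print(np.max(np.abs(np.gradient(f, t) / f**2 - 1.0)))
print(np.max(np.abs(np.gradient(g, t) / (alpha * g ** (1.0 + 1.0 / alpha)) - 1.0)))
```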
Why do you use this form? Do you lean more on:
1. Historical trends that look hyperbolic;
2. Specific dynamical models, like: let α be the synergy between “different innovations” as they produce further innovations, which gives f’(x) = f(x)^(1+α)* (or another such model);
3. Something else?
I wonder if there’s a Paul-Eliezer crux here about plausible functional forms. For example, if Eliezer thinks that there’s very likely also a tech tree of innovations that change the synergy factor α, we get something like e.g. (a lower bound of) f’(x) = f(x)^f(x). IDK if there’s any help from specific forms; just that it’s plausible that there are forms that are (1) pretty simple, pretty straightforward lower bounds from simple (not necessarily high-confidence) considerations of the dynamics of intelligence, and (2) look pretty similar to hyperbolic growth until they don’t, and the transition happens quickly. Though maybe, if Eliezer thinks any of this and also thinks that these superhyperbolic synergy dynamics are already going on, and we instead use a stochastic differential equation, there should be something more to say about variance or something pre-End-times.
*ETA: for example, if every innovation combines with every other existing innovation to give one unit of progress per time, we get the hyperbolic f’(x) = f(x)^2; if innovations each give one unit of progress per time but don’t combine, we get the exponential f’(x) = f(x).
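To make the footnote concrete, here is a toy discrete-time sketch of the two cases (unit rates and f(0) = 1 are arbitrary choices):

```python
import math

dt, t_end = 1e-5, 0.9
steps = int(t_end / dt)

f_combine = 1.0   # every pair of innovations contributes progress: f' = f^2
f_solo = 1.0      # innovations contribute independently:           f' = f
for _ in range(steps):
    f_combine += dt * f_combine**2
    f_solo += dt * f_solo

print(f_combine, "vs", 1 / (1 - t_end))   # hyperbolic: ~1/(1 - t), singular at t = 1
print(f_solo, "vs", math.exp(t_end))      # exponential: ~e^t, never singular
```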
I think there are two easy ways to get hyperbolic growth:
As long as there is free energy in the environment, without any technological change you can grow like f′(x) = f(x). Then if there is any technological progress that can be driven by your expanding physical civilization, you get f′(x) = f(x)^(1+α), where α depends on how fast the returns to technology diminish.
Even without physical growth, if you have sufficiently good returns to technology (as we observe for historical technologies, if you treat doubling food as doubling output, or for modern information technology) then you end up with a similar functional form.
That would feel more like a “plausible guess” if we didn’t have any historical data, but given that historical growth has in fact accelerated a huge amount, it seems like a solid best guess to me. There’s been a bunch of debate about whether the historical data implies something like this functional form, or merely implies some kind of dramatic acceleration that is consistent with this functional form. But either way, it seems like the good bet is further dramatic acceleration if we either start returning energy capture to output (via AI) or start getting overall technological progress similar to existing rates of progress in computer hardware and software (via AI).
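One way to see what the f′(x) = f(x)^(1+α) form buys you over the plain exponential: successive doubling times shrink rather than staying constant. A rough sketch (α and the integration step are arbitrary illustrative choices):

```python
import numpy as np

alpha, dt = 0.5, 1e-5
targets = [2.0, 4.0, 8.0, 16.0]

def doubling_times(growth_rate):
    """Crudely integrate f' = growth_rate(f) from f(0) = 1; return successive doubling times."""
    f, t, times = 1.0, 0.0, []
    for target in targets:
        while f < target:
            f += dt * growth_rate(f)
            t += dt
        times.append(t)
    return np.diff([0.0] + times)

print(doubling_times(lambda f: f))                 # f' = f: constant doubling time (~0.69)
print(doubling_times(lambda f: f ** (1 + alpha)))  # f' = f^(1+alpha): each doubling is faster
```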
Nitpick: Isn’t 1/x^α the solution for f′(x) = f(x)^(1+1/α), modulo constants? Or equivalently, 1/x^(1/α) is the solution to f′(x) = f(x)^(1+α).
Yep, will fix.
-r/x
Finally, a definition of The Singularity that actually involves a mathematical singularity! Thank you.
(I’m interested in which of my claims seem to dismiss or not adequately account for the possibility that continuous systems have phase changes.)
This section seemed like an instance of you and Eliezer talking past each other in a way that wasn’t locating a mathematical model containing the features you both believed were important (e.g. things could go “whoosh” while still being continuous):
[Christiano][13:46]
Even if we just assume that your AI needs to go off in the corner and not interact with humans, there’s still a question of why the self-contained AI civilization is making ~0 progress and then all of a sudden very rapid progress
[Yudkowsky][13:46]
unfortunately a lot of what you are saying, from my perspective, has the flavor of, “but can’t you tell me about your predictions earlier on of the impact on global warming at the Homo erectus level”
you have stories about why this is like totally not a fair comparison
I do not share these stories
[Christiano][13:46]
I don’t understand either your objection nor the reductio
like, here’s how I think it works: AI systems improve gradually, including on metrics like “How long does it take them to do task X?” or “How high-quality is their output on task X?”
[Yudkowsky][13:47]
I feel like the thing we know is something like, there is a sufficiently high level where things go whooosh humans-from-hominids style
[Christiano][13:47]
We can measure the performance of AI on tasks like “Make further AI progress, without human input”
Any way I can slice the analogy, it looks like AI will get continuously better at that task
My claim is that the timescale of AI self-improvement, at the point it takes over from humans, is the same as the previous timescale of human-driven AI improvement. If it was a lot faster, you would have seen a takeover earlier instead.
This claim is true in your model. It also seems true to me about hominids; that is, I think cultural evolution took over roughly when its timescale was comparable to the timescale for biological improvements, though Eliezer disagrees.
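For what it’s worth, in the toy da/dt = h + ra model from the earlier comment (not necessarily the model Paul has in mind), this works out exactly: the human-driven timescale is a/h, the AI-driven timescale is a/(ra) = 1/r, and the two coincide at the crossover ra = h. A one-line symbolic check:

```python
import sympy as sp

a, h, r = sp.symbols("a h r", positive=True)
human_timescale = a / h        # time for human effort alone to improve a by ~a
ai_timescale = a / (r * a)     # time for AI self-improvement alone to improve a by ~a
# At the crossover r*a = h the two timescales agree:
print(sp.simplify(human_timescale.subs(h, r * a) - ai_timescale))   # prints 0
```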
I thought Eliezer’s comment “there is a sufficiently high level where things go whooosh humans-from-hominids style” was missing the point. I think it might have been good to offer some quantitative models at that point though I haven’t had much luck with that.
I can totally grant there are possible models for why the AI moves quickly from “much slower than humans” to “much faster than humans,” but I wanted to get some model from Eliezer to see what he had in mind.
(I find fast takeoff from various frictions more plausible, so that the question mostly becomes one about how close we are to various kinds of efficient frontiers, and where we respectively predict civilization to be adequate/inadequate or progress to be predictable/jumpy.)
It seems to me that Eliezer’s model of AGI is a bit like an engine, where if any important part is missing, the entire engine doesn’t move. You can move a broken steam locomotive as fast as you can push it, maybe 1km/h. The moment you insert the missing part, the steam locomotive accelerates up to 100km/h. Paul is asking “when does the locomotive move at 20km/h?” and Eliezer says “when the locomotive is already at full steam and accelerating to 100km/h.” There’s no point where the locomotive is moving at 20km/h and not accelerating, because humans can’t push it that fast, and once the engine is working, it’s already accelerating to a much faster speed.
In Paul’s model, there IS such a thing as 95% AGI, and it’s 80% or 20% or 2% as powerful on some metric we can measure, whereas in Eliezer’s model there’s no such thing as 95% AGI. The 95% AGI is like a steam engine that’s missing its pistons, or some critical valve, and so it doesn’t provide any motive power at all. It can move as fast as humans can push it, but it doesn’t provide any power of its own.
And then Paul’s response to Eliezer is like “but engines don’t just appear without precedent, there’s worse partial versions of them beforehand, much more so if people are actually trying to do locomotion; so even if knocking out a piece of the AI that FOOMs would make it FOOM much slower, that doesn’t tell us much about the lead-up to FOOM, and doesn’t tell us that the design considerations that go into the FOOMer are particularly discontinuous with previously explored design considerations”?
Right, and history sides with Paul. The earliest steam engines were missing key insights and so operated slowly, used their energy very inefficiently, and were limited in what they could do. The first steam engines were used as pumps, and it took a while before they were powerful enough to even move their own weight (locomotion). Each progressive invention, from Savery to Newcomen to Watt, dramatically improved the efficiency of the engine, and over time engines could do more and more things, from pumping to locomotion to machining to flight. It wasn’t just one sudden innovation after which we had an engine that could do all the things, including even lifting itself against the pull of Earth’s gravity. It took time, and progress on smooth metrics, before we had extremely powerful and useful engines that powered the industrial revolution. That’s why the industrial revolution(s) took hundreds of years. It wasn’t one sudden insight that made it all click.
To which my Eliezer-model’s response is “Indeed, we should expect that the first AGI systems will be pathetic in relative terms, comparing them to later AGI systems. But the impact of the first AGI systems in absolute terms is dependent on computer-science facts, just as the impact of the first nuclear bombs was dependent on facts of nuclear physics. Nuclear bombs have improved enormously since Trinity and Little Boy, but there is no law of nature requiring all prototypes to have approximately the same real-world impact, independent of what the thing is a prototype of.”
My main concern is that progress on the frontier tends to be bursty.
There are many metrics of AI performance on particular tasks where performance does indeed increase fairly continuously on the larger scale, but not in detail. Over the scale of many years it goes from abysmal to terrible to merely bad to nearly human to worse than human in some ways but better than human in others, and then to superhuman. Each of these transitions is often a sharp jump, but you see steady progress if you plot it on a graph. When you combine this across thousands of types of tasks, you end up with a picture of even smoother progress over the whole field.
There are three problems I’m worried about.
The first is that “designing better AIs” may turn out to be a relatively narrow task, and subject to a lot more burstiness than the broad-spectrum performance that could steadily increase world GDP.
The second is that for purposes of the future of humanity, only the last step from human-adjacent to strictly superhuman really matters. On the scale of intelligence for all the beings we know about, chimpanzees are very nearly human, but the economic effect of chimpanzees is essentially zero.
The third is that we are nowhere near fully exploiting the hardware we have for AI, and I expect that to continue for quite a while.
I think any two of these three are enough for a fast takeoff with little warning.
+1 on using dynamical systems models to try to formalize the frameworks in this debate. I also give Eliezer points for trying to do something similar in Intelligence Explosion Microeconomics (and to people who have looked at this from the macro perspective).