A lot of great points!
I think we can separate the arguments into about three camps, based on their purpose (though they don’t all cleanly sit in one camp):
Arguments why progress might be generally fast: Hominid variation, Brain scaling.
Arguments why a local advantage in AI might develop: Intelligence explosion, One algorithm, Starting high, Awesome AlphaZero.
Arguments why a local advantage in AI could cause a global discontinuity: Deployment scaling, Train vs. test, Payoff thresholds, Human-competition threshold, Uneven skills.
These arguments need to work together to get the thesis of a single disruptive actor to go through: you need there to be jumps in AI intelligence, you need them to be fairly large even near human intelligence, and you need those increases to translate into a discontinuous impact on the world. This framework helps me evaluate arguments and counterarguments—for example, you don’t just argue against Hominid variation as showing that there will be a singularity, you argue against its more limited implications as well.
Bits I didn’t agree with, and therefore have lots to say about:
Intelligence Explosion:
The counterargument seems pretty wishy-washy. You say: “Positive feedback loops are common in the world, and very rarely move fast enough and far enough to become a dominant dynamic in the world.” How common? How rare? How dominant? Is global warming a dominant positive feedback loop because warming leads to increased water in the atmosphere which leads to more warming, and it’s going to have a big effect on the world? Or is it none of those, because Earth won’t get all that much warmer, because there are other well-understood effects keeping it in homeostasis?
More precisely, I think the argument from reference class that a positive feedback loop (or rather, the behavior that we approximate as a positive feedback loop) will be limited in time and space is hardly an argument at all—it practically concedes that the feedback loop argument works for the middle of the three camps above, but merely points out that it’s not also an argument that intelligence will be important. A strong argument against the intelligence feedback hypothesis has to argue that a positive feedback loop is unlikely.
One can obviously respond by emphasizing that objects in the reference class you’ve chosen (e.g. tipping back too far in your chair and falling) don’t generally impact the world, and therefore this is a reference class argument against AI impacting the world. But AI is not drawn uniformly from this reference class—the only reason we’re talking about it is because it’s been selected for the possibility of impacting the world. Failure to account for this selection pressure is why the strength of the argument seemed to change upon breaking it into parts vs. keeping it as a whole.
Deployment scaling:
We agree that slow deployment speed can “smooth out” a discontinuous jump in the state of the art into a continuous change in what people actually experience. You present each section as a standalone argument, and so we also agree that fast deployment speed alone does not imply discontinuous jumps.
But I think keeping things so separate misses the point that fast deployment is among the necessary conditions for a discontinuous impact. There’s also a risk, if we treat the arguments separately, of forgetting these necessary conditions when looking at historical examples. Like, we might look at the history of drug development, where drug deployment and adoption take a few years, and it takes more years still for costs to fall enough that more people can access the treatment, and notice that even though there’s an a priori argument for a discontinuous jump in best practices, people’s outcomes are continuous on the scale of several years. And then, if we’ve forgotten about the other necessary factors, we might just attribute this to some mysterious low base rate of discontinuous jumps.
Payoff thresholds:
The counterargument doesn’t really hold together. We start ex hypothesi with some threshold effect in usefulness (e.g. good enough boats let you reach another island). Then you say that it won’t cause a discontinuity in things we care about directly; people might buy better boats, but because of this producers will spend more effort making better boats and sell them at a higher price, so the “value per dollar” doesn’t jump. But this just assumes without justification that the producer eats up all the value—why can’t the buyer and the producer both capture part of the increase in value? The only way the theoretical argument seems to work is in equilibrium—which isn’t what we care about.
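To make that concrete with made-up numbers (a toy sketch, not anything from your post):

```python
# Invented numbers: a boat that can't quite reach the next island vs. one that
# just barely can.
value_before, price_before = 10, 8     # value to the buyer, price charged
value_after,  price_after  = 100, 50   # producer raises the price, but not all the way to 100

print(value_before - price_before)     # buyer surplus before: 2
print(value_after - price_after)       # buyer surplus after: 50 -- a jump

# "Value per dollar" for the buyer also jumps (1.25 -> 2.0) unless the producer
# captures essentially all of the new value, which is the unstated assumption.
print(value_before / price_before, value_after / price_after)
```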
Nuclear weapons are a neat example, but may be a misleading one. Nuclear weapons could have had half the yield, or twice the yield, without altering much about when they were built—although if you’d disagree with this, I’d be interested in hearing about it. (Looking at your link, it seems like nuclear weapons were in fact more expensive per ton of TNT when they were first built—and yet they were built, which suggests there’s something fishy about their fit to this argument.)
Awesome AlphaZero:
I think we can turn this into a more general thesis: Research is often local, and often discontinuous, and that’s important in AI. Fields whose advance seems continuous on the several-year scale may look jumpy on the six-month scale, and those jumps are usually localized to one research team rather than distributed. You can draw a straight line through a plot of e.g. performance of image-recognition AI, but that doesn’t mean that at the times in between the points there was a program with that intermediate skill at image recognition. This matters for AI if the size of the jumps, and the time between them, lets one team jump through some region (not necessarily a discontinuity) where gains in capability have a large effect, and thereby gain a global advantage.
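As a toy illustration of that sampling effect (all numbers invented):

```python
import numpy as np

# Invented state-of-the-art scores: each jump is one team's result, landing at
# irregular times over about four years.
jump_times  = np.array([0.0, 0.6, 0.9, 1.8, 2.1, 3.0, 3.4])   # years
jump_scores = np.array([60,  64,  71,  74,  82,  85,  93])    # benchmark score

def sota(t):
    """State of the art at time t: the most recent jump before t (a step function)."""
    return jump_scores[np.searchsorted(jump_times, t, side="right") - 1]

# Sampled once a year, the step function looks like steady linear progress...
sample_times = [0.5, 1.5, 2.5, 3.5]
yearly = [sota(t) for t in sample_times]
print(yearly)                                   # [60, 71, 82, 93]
print(np.polyfit(sample_times, yearly, 1)[0])   # slope ~11 points/year

# ...but at most moments in between, no program with the "interpolated" score
# existed; the gains arrived in lumps, each localized to one team.
```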
The missing argument about strategy:
There’s one possible factor contributing to the likelihood of discontinuity that I didn’t see, and that’s the strategic one. If people think that there is some level of advantage in AI that will allow them to have an important global impact, then they might not release their intermediate work to the public (so that other groups don’t know their status, and so their work can’t be copied), creating an apparent discontinuity when they decide to go public, even if 90% of their AI research would have gotten them 90% of the taking-over-the-world power.
Thanks for your thoughts!
I don’t quite follow you on the intelligence explosion issue. For instance, why does a strong argument against the intelligence explosion hypothesis need to show that a feedback loop is unlikely? Couldn’t we believe that it is likely, but not likely to be very rapid for a while? For instance, there is probably a feedback loop in intelligence already, where humans with better thoughts and equipment are effectively smarter, and can then devise better thoughts and equipment. But this has been true for a while, and is a fairly slow process (at least for now, relative to our ability to deal with things).
Yeah, upon rereading that response, I think I created a few non sequiturs in revision. I’m not even 100% sure what I meant by some bits. I think the arguments that now seem confusing were me saying that if you put an intelligence feedback loop in the reference class of “feedback loops in general” and then use that to forecast low impact, the thing doing most of the work is simply how low-impact most stuff is.
A nuclear bomb (or a raindrop forming, or tipping back a little too far in your chair) can be modeled as a feedback loop through several orders of magnitude of power output, and then eventually that model breaks down and the explosion dissipates, and the world might be a little scarred and radioactive, but it is overall not much different. But if your AI increased by several orders of magnitude in intelligence (let’s just pretend that’s meaningful for a second), I would expect that to be a much bigger deal, just because the thing that’s increasing is different. That is, I was thinking that the implicit model used by the reference class argument from the original link seems to predict local advantages in AI, but to predict *against* those local advantages being important to the world at large, which I think is putting the most weight on the weakest link.
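Here’s roughly the toy model I have in mind for that kind of case (all parameters invented): something that feeds back on itself but also burns through a finite resource, so it’s genuinely explosive for a while and then the explosion dissipates.

```python
# Toy runaway-then-dissipate feedback loop (invented parameters).
x, fuel = 1.0, 1e6        # x: the growing quantity; fuel: a finite resource it consumes
steps = 0
while fuel > 0:
    growth = min(0.5 * x, fuel)   # feedback: growth proportional to x, capped by remaining fuel
    x += growth
    fuel -= growth
    steps += 1

# Several orders of magnitude of growth, then it just stops when the fuel is gone.
print(f"grew ~{x:,.0f}x in {steps} steps")
```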
Part of this picture I had comes from what I’m imagining as prototypical reference class members—note that I only imagined self-sustaining feedback, not “subcritical” feedback. In retrospect, this seems to be begging the question somewhat—subcritical feedback speeds up progress, but doesn’t necessarily concentrate it, unless there is some specific threshold effect for getting that feedback. Another feature of my prototypes was that they’re out-of-equilibrium rather than in-equilibrium (an example of feedback in equilibrium is global warming, where there’s lots of feedback effects but they’re more or less canceling each other out), but this seems justified.
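A minimal way to put the subcritical point (the gain k here is made up): suppose one unit of exogenous progress causes k further units, each of which causes k more, and so on. For k < 1 the total is just a fixed multiplier on ordinary progress, 1/(1-k); only at k ≥ 1 does it become the self-sustaining thing I was picturing.

```python
# Toy contrast between subcritical and self-sustaining feedback (k is invented).
def total_progress(k, rounds=1000):
    """Total progress from one exogenous unit, if each unit triggers k more."""
    contribution, total = 1.0, 0.0
    for _ in range(rounds):
        total += contribution
        contribution *= k
    return total

print(total_progress(0.5))   # ~2: subcritical, a modest fixed multiplier (1/(1-k))
print(total_progress(0.9))   # ~10: still bounded, just a bigger multiplier
print(total_progress(1.1))   # enormous, and grows without bound as rounds increase
```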
I would agree that one can imagine some kind of feedback loop in “effective smartness” of humans, but I am not sure how natural it is to divorce this from the economic / technological revolution that has radically reshaped our planet, since so much of our effective smartness enhancement is also economy / technology. But this is ye olde reference class ping pong.