The Weak Inside View

Followup to: The Outside View’s Domain

When I met Robin in Oxford for a recent conference, we had a preliminary discussion on the Singularity—this is where Robin suggested using production functions. And at one point Robin said something like, “Well, let’s see whether your theory’s predictions fit previously observed growth rate curves,” which surprised me, because I’d never thought of that at all.
It had never occurred to me that my view of optimization ought to produce quantitative predictions. It seemed like something only an economist would try to do, as ’twere. (In case it’s not clear, sentence 1 is self-deprecating and sentence 2 is a compliment to Robin. --EY)
Looking back, it’s not that I made a choice to deal only in qualitative predictions, but that it didn’t really occur to me to do it any other way.
Perhaps I’m prejudiced against the Kurzweilian crowd, and their Laws of Accelerating Change and the like. Way back in the distant beginning that feels like a different person, I went around talking about Moore’s Law and the extrapolated arrival time of “human-equivalent hardware” a la Moravec. But at some point I figured out that if you weren’t exactly reproducing the brain’s algorithms, porting cognition to fast serial hardware and to human design instead of evolved adaptation would toss the numbers out the window—and that how much hardware you needed depended on how smart you were—and that sort of thing.
Betrayed, I decided that the whole Moore’s Law thing was silly and a corruption of futurism, and I restrained myself to qualitative predictions (and retrodictions) thenceforth.
Though this is to some extent an argument produced after the conclusion, I would explain my reluctance to venture into quantitative futurism via the following trichotomy:
On problems whose pieces are individually precisely predictable, you can use the Strong Inside View to calculate a final outcome that has never been seen before—plot the trajectory of the first moon rocket before it is ever launched, or verify a computer chip before it is ever manufactured.
On problems that are drawn from a barrel of causally similar problems, where human optimism runs rampant and unforeseen troubles are common, the Outside View beats the Inside View. Trying to visualize the course of history piece by piece will turn out not to work so well (for humans), and you’ll be better off assuming a probable distribution of results similar to previous historical occasions—without trying to adjust for all the reasons why this time will be different and better.
But on problems that are new things under the Sun, where there’s a huge change of context and a structural change in underlying causal forces, the Outside View also fails—try to use it, and you’ll just get into arguments about what is the proper domain of “similar historical cases” or what conclusions can be drawn therefrom. In this case, the best we can do is use the Weak Inside View—visualizing the causal process—to produce loose qualitative conclusions about only those issues where there seems to be lopsided support.
So to me it seems “obvious” that my view of optimization is only strong enough to produce loose qualitative conclusions, and that it can only be matched to its retrodiction of history, or wielded to produce future predictions, on the level of qualitative physics.
“Things should speed up here”, I could maybe say. But not “The doubling time of this exponential should be cut in half.”
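(Spelled out, as a gloss on what such a quantitative claim would mean: for exponential growth $x(t) = x_0 e^{rt}$, the doubling time is $T = \ln 2 / r$, so “cut the doubling time in half” is precisely the claim that the underlying growth rate $r$ doubles. That is the sort of number the Weak Inside View does not supply.)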
I aspire to a deeper understanding of intelligence than this, mind you. But I’m not sure that even perfect Bayesian enlightenment would let me predict quantitatively how long it will take an AI to solve various problems in advance of it solving them. That might just rest on features of an unexplored solution space which I can’t guess in advance, even though I understand the process that searches.
Robin keeps asking me what I’m getting at by talking about some reasoning as “deep” while other reasoning is supposed to be “surface”. One thing that makes me worry that something is “surface” is when it involves generalizing a level N feature across a shift in level N-1 causes.
For example, suppose you say “Moore’s Law has held for the last sixty years, so it will hold for the next sixty years, even after the advent of superintelligence” (as Kurzweil seems to believe, since he draws his graphs well past the point where you’re buying “a billion times human brainpower for $1000”).
Now, if the Law of Accelerating Change were an exogenous, ontologically fundamental, precise physical law, then you wouldn’t expect it to change with the advent of superintelligence.
But to the extent that you believe Moore’s Law depends on human engineers, and that the timescale of Moore’s Law has something to do with the timescale on which human engineers think, then extrapolating Moore’s Law across the advent of superintelligence is extrapolating it across a shift in the previous causal generator of Moore’s Law.
So I’m worried when I see generalizations extrapolated across a change in causal generators not themselves described—i.e., the generalization itself is at the level of the outputs of those generators and doesn’t describe the generators directly.
If, on the other hand, you extrapolate Moore’s Law out to 2015 because it’s been reasonably steady up until 2008 - well, Reality is still allowed to say “So what?”, to a greater extent than we can expect to wake up one morning and find Mercury in Mars’s orbit. But I wouldn’t bet against you, if you just went ahead and drew the graph.
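For concreteness, here is a minimal sketch (in Python) of what that kind of “surface” extrapolation amounts to, using synthetic made-up numbers in place of real chip data, with a two-year doubling time simply assumed: fit a straight line to the logarithm of the counts and extend it past 2008, saying nothing about the engineers who generate the trend.

    # A "surface" extrapolation: fit a straight line to log2(count) vs. year
    # and extend it, without modeling whatever produces the trend.
    # The history below is synthetic (assumed ~2-year doubling plus noise),
    # purely for illustration, not real transistor counts.
    import random

    random.seed(0)
    years = list(range(1971, 2009))  # "observed" history up to 2008
    log2_counts = [11 + (y - 1971) / 2.0 + random.gauss(0, 0.3) for y in years]

    # Ordinary least squares: log2(count) = a + b * year
    n = len(years)
    mx = sum(years) / n
    my = sum(log2_counts) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(years, log2_counts))
         / sum((x - mx) ** 2 for x in years))
    a = my - b * mx

    print("fitted doubling time: %.2f years" % (1.0 / b))
    # Extend the fitted line to 2015: the step Reality is allowed to ignore.
    print("extrapolated 2015 count: %.3g" % (2 ** (a + b * 2015)))

The point of the sketch is only that nothing in it mentions the causal generator; swap out the engineers, and the fitted line has no way to notice.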
So what’s “surface” or “deep” depends on what kind of context shifts you try to extrapolate past.
Robin Hanson said:

Taking a long historical long view, we see steady total growth rates punctuated by rare transitions when new faster growth modes appeared with little warning. We know of perhaps four such “singularities”: animal brains (~600MYA), humans (~2MYA), farming (~10KYA), and industry (~0.2KYA). The statistics of previous transitions suggest we are perhaps overdue for another one, and would be substantially overdue in a century. The next transition would change the growth rate rather than capabilities directly, would take a few years at most, and the new doubling time would be a week to a month.
Why do these transitions occur? Why have they been similar to each other? Are the same causes still operating? Can we expect the next transition to be similar for the same reasons?
One may of course say, “I don’t know, I just look at the data, extrapolate the line, and venture this guess—the data is more sure than any hypotheses about causes.” And that will be an interesting projection to make, at least.
But you shouldn’t be surprised at all if Reality says “So what?” I mean—real estate prices went up for a long time, and then they went down. And that didn’t even require a tremendous shift in the underlying nature and causal mechanisms of real estate.
To stick my neck out further: I am liable to trust the Weak Inside View over a “surface” extrapolation, if the Weak Inside View drills down to a deeper causal level and the balance of support is sufficiently lopsided.
I will go ahead and say, “I don’t care if you say that Moore’s Law has held for the last hundred years. Human thought was a primary causal force in producing Moore’s Law, and your statistics are all over a domain of human neurons running at the same speed. If you substitute better-designed minds running at a million times human clock speed, the rate of progress ought to speed up—qualitatively speaking.”
That is, the prediction is made without giving precise numbers, or supposing that it’s still an exponential curve; computation might spike to the limits of physics and then stop forever, etc. But I’ll go ahead and say that the rate of technological progress ought to speed up, given the stated counterfactual intervention on underlying causes to increase the thought speed of engineers by a factor of a million. I’ll be downright indignant if Reality says “So what?” and has the superintelligence make slower progress than human engineers instead. It really does seem like an argument so strong that even Reality ought to be persuaded.
It would be interesting to ponder what kind of historical track records have prevailed in such a clash of predictions—trying to extrapolate “surface” features across shifts in underlying causes without speculating about those underlying causes, versus trying to use the Weak Inside View on those causes and arguing that there is “lopsided” support for a qualitative conclusion; in a case where the two came into conflict...
...kinda hard to think of what that historical case would be, but perhaps I only lack history.
Robin, how surprised would you be if your sequence of long-term exponentials just… didn’t continue? If the next exponential was too fast, or too slow, or something other than an exponential? To what degree would you be indignant, if Reality said “So what?”