(1) As Paul noted, the question of the exponent alpha is just the question of diminishing returns vs returns-to-scale.
Especially if you believe that the rate $f = f(R)$ is a product of multiple terms (e.g. Paul’s suggestion $f = R^{\alpha_t} \cdot R^{\alpha_a}$, with one exponent for computer-tech advances and another for algorithmic advances), then you get returns-to-scale-type dynamics (over certain regimes, i.e. until all the fruit has been picked) with finite-time blow-up.
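For concreteness, here is the standard one-line computation behind the finite-time blow-up (a sketch, writing $\alpha = \alpha_t + \alpha_a$ for the combined exponent and ignoring the regime where the fruit runs out): separating variables in $\partial_t R = R^{\alpha}$ gives

$$ R(t) = \left( R_0^{\,1-\alpha} - (\alpha - 1)\,t \right)^{-\frac{1}{\alpha - 1}}, $$

which, for $\alpha > 1$, diverges at the finite time $t^{*} = R_0^{\,1-\alpha} / (\alpha - 1)$; for $\alpha \le 1$ you only get polynomial or exponential growth, i.e. no singularity.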
(2) Also, a crucial aspect (imho) is the separation of time-scales between human-driven research and computation done by machines: transistors are faster than neurons, and buying more hardware scales better than training a new person up to the bleeding edge of research (especially considering Scott’s amusing parable of the alchemists).
Let’s add a little flourish to your model: you had the rate of research $I$ and the cumulative research $R$; let’s give the name $C$ to the capability of the AI system. Then we can model $\partial_t R = I = f(R) = g(C) = g(h(R))$, i.e. $C = h(R)$. This is your model, just with $f$ split into $h$, which tells us how hard AI progress is, and $g$, which tells us how good we are at producing research.
Now denote by $q = q(C)$ the fraction of work that absolutely has to be done by humans, and by $\varepsilon$ the speed-up factor of silicon over biology. Amdahl’s law gives you $g(C) = \frac{1}{q(C) + \frac{1 - q(C)}{\varepsilon C}}$, or somewhat simplified $g(C) \ge \frac{1}{q + \frac{1}{\varepsilon C}}$. This predicts a rate of progress that first looks like $1/q$, as long as human researcher input is the limiting factor, then becomes $\varepsilon C$ once we have AIs designing AIs (recursive self-improvement, aka explosion), and then presumably saturates at something (when the AI approaches optimality).
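A minimal numerical sketch of this rate function (the specific shape of $q(C)$, the cross-over at $C = 1$, and the value of $\varepsilon$ are my own illustrative assumptions, not part of the model above):

```python
import numpy as np

EPS = 1e6  # assumed speed-up of silicon over biology (illustrative)

def q(C):
    """Illustrative human-required fraction: falls linearly to zero at C* = 1."""
    return np.maximum(0.0, 1.0 - C)

def g(C):
    """Amdahl-style rate: humans at speed 1 on the fraction q(C), AI at speed EPS*C on the rest."""
    return 1.0 / (q(C) + (1.0 - q(C)) / (EPS * C))

for C in [0.1, 0.5, 0.9, 0.99, 1.0, 2.0]:
    print(f"C = {C:5.2f}   q = {q(C):4.2f}   g(C) = {g(C):12.2f}")
```

The printed rates sit near $1/q$ up to the cross-over and then jump to $\varepsilon C$.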
The crucial argument for fast take-off (as far as I understood it) is that we can expect $q(C)$ to hit $q = 0$ at some cross-over $C^*$, and we can expect this to happen with a nonzero derivative, $\partial_C q(C^*) \neq 0$. This is just the claim that human-level AI is possible, and that the intelligence of the human parts of the AI research project is not sitting at a magical point (aka: this is generic; you would need to fine-tune your model to get something else).
The change of the rate of research output from the $1/q(C)$ regime to the $\varepsilon C$ regime sure looks like a hard-take-off singularity to me! And I would like to note that the function $h$, i.e. the hardness of AI research, and the diminishing-returns vs returns-to-scale debate do not enter this discussion at any point.
In other words: if you model AI research as done by a team of humans and proto-AIs assisting the humans; and if you assert non-fungibility of humans vs proto-AI assistants (even if you buy a thousand times more hardware, you still need the generally intelligent human researchers for some parts); and if you assert that better proto-AI assistants can do a larger proportion of the work (at all); and if you assert that computers are faster than humans; then you get a possibly quite wild change at q=0.
I’d like to note that the cross-over is not “human-level AI”, but rather “q ≈ 0”, i.e. an AI that needs (almost) no human assistance to advance the field of AI research.
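To see how wild that change is in the toy model, here is a quick time integration of $\partial_t R = g(h(R))$ (everything below is an illustrative assumption of mine: the choice $h(R) = R$, the linear $q$, the constants, and the crude Euler integrator; it is chosen only to exhibit the kink at $q = 0$):

```python
# Toy integration of dR/dt = g(h(R)) to show the regime change at q = 0.
# All functional forms and constants are illustrative assumptions.

EPS = 1e3       # assumed silicon-over-biology speed-up
C_STAR = 1.0    # capability at which q hits zero (assumption)

def h(R):
    """Capability as a function of cumulative research; simply C = R here,
    so no stance is taken on how hard AI research is."""
    return R

def q(C):
    """Human-required fraction; crosses zero at C* with nonzero slope."""
    return max(0.0, 1.0 - C / C_STAR)

def g(C):
    """Amdahl-style research rate, as above."""
    return 1.0 / (q(C) + (1.0 - q(C)) / (EPS * C))

R, t, dt = 0.01, 0.0, 1e-4
checkpoints = [0.5, 0.9, 0.99, 1.0, 1.5]   # report when capability passes these
i = 0
while i < len(checkpoints) and t < 50.0:
    R += g(h(R)) * dt                      # explicit Euler step
    t += dt
    if h(R) >= checkpoints[i]:
        print(f"t = {t:8.4f}   C = {h(R):6.3f}   rate = {g(h(R)):10.1f}")
        i += 1
```

Up to $C \approx C^*$ the rate crawls along at roughly $1/q$; the moment $q$ hits zero it jumps to $\varepsilon C$, and the remaining growth is essentially instantaneous on the previous time-scale.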
On the opposing side (that’s what Robin Hanson would probably say) you have the empirical argument that q should decay like a power law long before we reach q=0 (“the last 10% take 90% of the work” is a folk formulation of “percentiles 90-99 take nine times as much work as percentiles 0-89”, aka a power law, and it is borne out quite well, empirically).
This has no bearing on whether we cross q=0 with non-vanishing derivative, but it would support Paul’s view that the world will already be unrecognizably crazy long before q=0.
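To illustrate that view with the same toy rate function (the power-law exponent and constants below are made up, purely for illustration): a power-law $q(C)$ gives a large and steadily growing speed-up, with no sharp kink, because the rate simply tracks $1/q$.

```python
EPS = 1e3
BETA = 1.0   # assumed power-law exponent for the decay of q (illustrative)

def q_powerlaw(C):
    """Human-required fraction decaying like a power law; never exactly zero."""
    return min(1.0, C ** (-BETA))

def g(C, q):
    """Same Amdahl-style rate as before."""
    return 1.0 / (q(C) + (1.0 - q(C)) / (EPS * C))

for C in [1, 3, 10, 30, 100, 300]:
    print(f"C = {C:4d}   q = {q_powerlaw(C):7.4f}   rate = {g(C, q_powerlaw):8.1f}")
```

With these made-up numbers the rate climbs like $C^{\beta}$: the world gets dramatically faster while $q$ is still nowhere near zero, which is exactly the “crazy long before q=0” picture.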
PS. I am currently agnostic about the hard vs soft take-off debate. Yeah, I know, cowardly cop-out.
edit: In the above, C roughly encodes how fast/good our AI is, and q encodes how general it is compared to humans. All AI-singularity stuff tacitly assumes that human intelligence (assisted by stupid proto-AI) is sufficiently general to design an AI that matches or exceeds the generality of human intelligence, i.e. sufficiently general for a take-off; I consider this likely. The counterfactual world would have our AI capabilities saturate at some subhuman level for a long time, using terribly bad randomized/evolutionary algorithms, until we either stumble onto an AI design with better generality or suffer unrelated extinction/heat-death. Heat-death is not an exaggeration: algorithms with exponentially bad run-time are effectively useless.
Conversely, I consider it entirely possible that human intelligence is insufficiently general to understand how human intelligence works! (We are really, really bad at understanding anything optimized by evolution/gradient descent, and that’s what we are.)