I imagine the “secret sauce” line of thinking as “we are solving certain problems in the wrong complexity class”. Changing the complexity class of an algorithm introduces a discontinuity; when near a take-off, this discontinuity can get amplified into a fast take-off. The take-off can be especially fast if the compute hardware is already sufficient at the time of the breakthrough.
In other words: in order to expect a fast take-off, you only need to assume that the last crucial sub-problem for recursive self-improvement / explosion is being solved in the wrong complexity class prior to the discovery of a good algorithm.
For strong historical precedents, I would look for algorithmic advances that improved the empirical average-case complexity class and, at the same time, gave a speed-up of e.g. 100x on problem instances that were typical prior to the algorithmic discovery (so Strassen matrix multiplication is out: its asymptotic advantage only shows up on matrices far larger than the ones people typically multiplied at the time).
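To make that criterion concrete, here is a toy timing sketch (an illustration of the shape of the jump, not a historical benchmark; the signal length n and everything else are arbitrary illustration choices): an O(n^2) direct DFT versus the O(n log n) FFT on the same, entirely ordinary, instance size.

```python
# Toy illustration only: the kind of speed-up a complexity-class change gives
# on problem instances that are already typical (n is an arbitrary choice).
import time
import numpy as np

n = 2048                                    # an unremarkable signal length
x = np.random.randn(n) + 1j * np.random.randn(n)

# O(n^2): direct DFT via the full transform matrix
k = np.arange(n)
dft_matrix = np.exp(-2j * np.pi * np.outer(k, k) / n)
t0 = time.perf_counter()
X_direct = dft_matrix @ x
t_direct = time.perf_counter() - t0

# O(n log n): FFT on the same input
t0 = time.perf_counter()
X_fft = np.fft.fft(x)
t_fft = time.perf_counter() - t0

assert np.allclose(X_direct, X_fft)         # same answer, very different cost
print(f"direct DFT: {t_direct:.4f} s, FFT: {t_fft:.6f} s, "
      f"ratio ~{t_direct / max(t_fft, 1e-9):.0f}x")
```

The exact ratio depends on the machine; the only point is that the gap already shows up at a size nobody would call large.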
For weaker historical precedent, I would look for advances that single-handedly made an entire field viable: prior to the advance there was only a tiny community caring about the problem; after the advance, the field (e.g. a type of data analysis) became viable at all (hence the very limited commercial / academic interest in the subfield prior to its breakthrough). I think that this is meaningful precedent because people optimize for expected pay-off, and it is sometimes surprising that some small-but-crucial-if-possible subproblem can be solved at all (reasonably quickly)!
And I do believe that there are many parts of modern ML that are in the wrong complexity class (this does not mean that I could do better, nor that I necessarily expect an improvement, or even a discontinuous jump in usefulness).
For strong historical precedents, I would look for algorithmic advances that improved the empirical average-case complexity class and, at the same time, gave a speed-up of e.g. 100x on problem instances that were typical prior to the algorithmic discovery (so Strassen matrix multiplication is out: its asymptotic advantage only shows up on matrices far larger than the ones people typically multiplied at the time).
Do you have any examples of this phenomenon in mind? I’m not aware of any examples with significant economic impact. If this phenomenon were common, it would probably change my view a lot. If it had ever happened, it would at least make me more sympathetic to the fast-takeoff view and would change my view a bit.
Not sure. I encountered this once in my own research, but the preprint is not out yet (alas, I’m pretty sure that this will still not be enough to reach commercial viability, so it is pretty niche and academic and not a very strong example).
Regarding “this is not common”: of course not, for problems many people care about. Once you are in the almost-optimal class, there is no more giant-sized fruit to pick, so most problems will experience such a large jump never, or once or twice, over all of expected human history (sorting is N log N even if you are a super-intelligence) (pulling the numbers 0, 1, 2 out of my ass; feel free to do better). On the other hand, there is a long tail of problems very few people care about (e.g. because we have no fast solution and hence cannot incorporate a solver into a bigger construction). There, complexity-class jumps do not appear to be so uncommon.
Cheap prediction: many visual machine-learning algorithms will gain a complexity class once they can handle compressed data directly (under the usual assumption that, after ordering by magnitude, the coefficients of a suitable wavelet transform of real-world input data decay like a power law; the “compression” introduces a cut-off / discretization, and a cool ML algorithm would only look at the coefficients it cares about, and could hence e.g. output a classification in finite time even for infinite-resolution input data). Also a cheap prediction: this will turn out to be not as hot as it sounds, since GPUs (and to a lesser extent CPUs) are just too damn good at dealing with dense matrices and the FFT is just too damn fast.
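A minimal sketch of the “only look at the coefficients it cares about” part, assuming the PyWavelets package and using a piecewise-smooth stand-in for a real image (the power-law decay is an empirical assumption about real-world data; this toy only illustrates the bookkeeping, not the claim):

```python
# Sketch, not a real pipeline: keep only the largest wavelet coefficients and
# imagine a downstream model that consumes just those (index, value) pairs.
import numpy as np
import pywt  # PyWavelets

# Piecewise-smooth stand-in for a natural image: smooth background + one sharp "object".
u = np.linspace(0.0, 1.0, 256)
img = np.outer(np.sin(4 * np.pi * u), np.cos(3 * np.pi * u))
img[64:128, 64:192] += 1.0

# Multi-level 2D wavelet transform, flattened into one coefficient array.
coeffs = pywt.wavedec2(img, wavelet="db4", level=4)
flat, _ = pywt.coeffs_to_array(coeffs)
mags = np.sort(np.abs(flat).ravel())[::-1]

# For data like this, the sorted magnitudes decay fast, so a small top-k slice
# carries almost all of the energy; that slice is the "compressed" input.
k = 2000
energy_kept = (mags[:k] ** 2).sum() / (mags ** 2).sum()
print(f"top {k} of {mags.size} coefficients keep {energy_kept:.1%} of the energy")

# A compressed-domain classifier would scale with k, not with the pixel count.
idx = np.argpartition(np.abs(flat).ravel(), -k)[-k:]
sparse_input = np.stack([idx, flat.ravel()[idx]])   # (position, value) features
```

The point is only that a downstream model could scale with k rather than with the pixel count; whether that ever beats a dense GPU pipeline in practice is exactly the caveat in the second cheap prediction.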
-----
Since I’m bad at computer history, some maybe-examples that spring to mind:
Probable example: Asymmetric crypto.
Very-maybe examples:
Fast Fourier transform (except if you want to attribute it to Gauss). Linear programming (the simplex algorithm). Multigrid methods for PDEs / the general shift from matrix-based direct solvers to iterative operator-based solvers. Probabilistic polynomial identity testing. Bitcoin (not the currency, rather the solution to trustless sybil-safe global consensus that made the currency viable). Maybe zk-SNARKs (to my eternal shame I must admit that I don’t understand the details of how they work).
For large economic impact, possibly discontinuous: MP3 / the moment when good audio compression suddenly became viable and enabled many applications. I think this was more of an engineering advance (fast software decoding, with a large hardware overhang on general-purpose CPUs), but others here probably know more about the history.
Everyone, feel free to list other historic examples of kinda-discontinuous algorithmic advances and use your superior historical knowledge to tear down my probably very bad examples.
-----
Separate point: Many processes require “critical mass” (which I called “viability”), and such processes amplify discontinuities / should be modeled as discontinuities.
A physics / maths intuition pump would be phase transitions, or bifurcations in dynamical systems; e.g. falling off a saddle-node looks quite discontinuous if you have a separation of time-scales, even though it is not discontinuous once you zoom in far enough. Recursive self-improvement / general learning has some superficial resemblance to such processes, and does have a separation of time-scales. Tell me if you want me to talk about the analogies more, but I was under the impression that they have been discussed ad nauseam.
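In case the saddle-node picture is unfamiliar, here is a minimal numerical sketch (all parameter values are arbitrary): dx/dt = r + x^2 with the control parameter r drifting slowly upward. The state tracks the stable branch for a long time, then escapes abruptly once the fixed points annihilate; on the slow time-scale the escape looks like a discontinuity even though nothing is discontinuous up close.

```python
# Toy saddle-node with separation of time-scales: dx/dt = r + x^2, r drifts slowly.
import numpy as np

dt, eps = 1e-2, 1e-2        # fast integration step vs. slow parameter drift
r, x, t = -1.0, -1.0, 0.0   # start on the stable branch x = -sqrt(-r)
ts, xs = [], []

while x < 10.0:             # stop once the trajectory has clearly escaped
    x += dt * (r + x * x)   # fast dynamics
    r += dt * eps           # slow drift of the control parameter
    t += dt
    ts.append(t)
    xs.append(x)

xs, ts = np.array(xs), np.array(ts)
lo, hi = xs.min(), xs.max()
escape_start = ts[np.argmax(xs > lo + 0.1 * (hi - lo))]
print(f"run length ~{ts[-1]:.0f} time units; "
      f"the last 90% of the excursion takes ~{ts[-1] - escape_start:.1f}")
```

That long-quiet-then-sudden shape, with the jump duration set by the fast time-scale, is the picture I have in mind for the “viability” / critical-mass point above.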
-----
I’ll continue to think about historic examples of complexity-class-like algorithmic advances with economic impact and will post if anything substantial comes to mind.