Note that Nate and Eliezer expect there to be some curves you can draw after the fact that show continuity in AGI progress on particular dimensions. They just don’t expect these to be the curves with the most practical impact (and they don’t think we can identify the curves with foresight, in 2022, to make strong predictions about AGI timing or rates of progress).
Yes, but conversely, I could say I’d expect some curves to show discontinuous jumps, mostly in dimensions which no one really cares about. Clearly the cruxes are about discontinuities in dimensions which matter.
As I tried to explain in the post, I think continuity assumptions mostly get you different things than “strong predictions about AGI timing”.
...
My point here isn’t to throw ‘AGIs will undergo discontinuous leaps as they learn’ under the bus. Self-rewriting systems likely will (on my models) gain intelligence in leaps and bounds. What I’m trying to say is that I don’t think this disagreement is the central disagreement. I think the key disagreement is instead about where the main force of improvement in early human-designed AGI systems comes from: is it from existing systems progressing up their improvement curves, or from new systems coming online on qualitatively steeper improvement curves?
I would paraphrase this as assuming discontinuities at every level (both in the training of a single system and in the more macroscopic exploration of the “space of learning systems”), but with the key disagreement located in discontinuities across the space of model architectures rather than in the jumpiness of single-model training.
Personally, I don’t think the distinction between ‘movement by a single model learning’, ‘movement by scaling’, and ‘movement by architectural changes’ will necessarily be that big.
There is, I think, a really basic difference of thinking here, which is that on my view, AGI erupting is just a Thing That Happens and not part of a Historical Worldview or a Great Trend.
This seems to more or less support what I wrote? Expecting a Big Discontinuity, and this being a pretty deep difference?
I think the Hansonian viewpoint (which I consider another gradualist viewpoint, one whose effects were influential on early EA and which I think are still lingering around in EA) seemed surprised by AlphaGo and AlphaZero, when you contrast its advance language with what actually happened. Inevitably, you can go back afterwards and claim it wasn’t really a surprise in terms of the abstractions that seem so clear and obvious now, but I think it was surprised at the time; and I also think that “there’s always a smooth abstraction in hindsight, so what, there’ll be one of those when the world ends too” is a huge deal in practice with respect to the future being unpredictable.
My overall impression is that Eliezer likes to argue against “Hansonian views”, but something like “continuity assumptions” seems a much broader category than Robin’s views.
Paul and Eliezer have had lots of discussions over the years, but I don’t think they talked about takeoff speeds between the 2018 post and the 2021 debate?
In my view, continuity assumptions are not just about takeoff speeds. E.g., IDA makes much more sense in a continuous world: as you approach a cliff, working IDA should slow down and warn you. In the Truly Discontinuous world, you just jump off the cliff at some unknown step.
I would guess that a majority of all debates and disagreements between Paul and Eliezer have some “continuity” component: e.g., the question of whether we can learn a lot of important alignment lessons from non-AGI systems is a typical continuity question, but only tangentially relevant to takeoff speeds.