So basically, this is a really complex thing; throwing some definitions and math at it isn't going to be very useful, I'm sorry to say. Throwing math and definitions at stuff is easy. Modeling data by fitting functions is easy. Neither is very useful for actually predicting in novel situations (i.e. extrapolation/generalization), which is what we need to predict AI take-off dynamics.
I disagree. The theoretical framework is a first step toward reasoning more clearly about the topic, and I expect to eventually bridge the gap between the theoretical and the empirical. In fact, I just added some concrete empirical research directions that I think could be pursued later on:
Even Further Future Directions
Some stuff I might like to do (much) later on. I would like to eventually connect this theoretical framework to empirical work with neural networks. I'll briefly describe two approaches I'm interested in.
Estimating RCR From ML History
We could try to estimate the nature and/or behaviour of RCR across particular ML architectures by, e.g., looking at progress on assorted performance benchmarks (and perhaps the computational resources [data, FLOPs, parameter count, etc.] required to reach each benchmark) and comparing across the various architectural and algorithmic lineages of ML models. We'd probably need to compile a comprehensive genealogy of ML architectures and algorithms to pursue this approach.
This estimation may be necessary, because we may be unable to measure RCR across an agent's genealogy before it is too late (if, e.g., designing more capable successors is something agents can only do after crossing the human barrier).
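To make this concrete, here is a minimal sketch of what such an estimate might look like, assuming we had already compiled a genealogy annotated with the training compute each generation needed to reach a fixed benchmark. Everything in it (the lineage names, the benchmark, the compute figures) is hypothetical; the point is only the shape of the computation, with compute ratios between successive generations serving as a crude RCR proxy.

```python
# A minimal, hypothetical sketch of estimating a crude RCR proxy from ML
# history. All names and numbers below are made up for illustration; a real
# attempt would draw on compiled benchmark results and an actual genealogy.
from dataclasses import dataclass

@dataclass
class Milestone:
    lineage: str          # named architectural/algorithmic lineage
    generation: int       # position in the genealogy (0 = ancestor)
    benchmark: str        # the fixed performance benchmark reached
    compute_flops: float  # training compute needed to reach the benchmark

def rcr_proxy(milestones: list[Milestone]) -> dict[str, list[float]]:
    """For each lineage, return the generation-over-generation ratio of
    compute needed to reach the same benchmark. Ratios below 1 mean each
    generation needs less compute than its predecessor, a crude signal
    of the returns to architectural/algorithmic improvement."""
    by_lineage: dict[str, list[Milestone]] = {}
    for m in milestones:
        by_lineage.setdefault(m.lineage, []).append(m)
    ratios: dict[str, list[float]] = {}
    for lineage, ms in by_lineage.items():
        ms.sort(key=lambda m: m.generation)
        ratios[lineage] = [ms[i].compute_flops / ms[i - 1].compute_flops
                           for i in range(1, len(ms))]
    return ratios

# Hypothetical numbers, purely for illustration:
history = [
    Milestone("lineage-A", 0, "benchmark-X", 1e21),
    Milestone("lineage-A", 1, "benchmark-X", 4e20),
    Milestone("lineage-A", 2, "benchmark-X", 2e20),
]
print(rcr_proxy(history))  # {'lineage-A': [0.4, 0.5]}
```

Whether these ratios shrink, stabilise, or grow along a lineage is exactly the kind of behaviour the theoretical framework is trying to characterise.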
Directly Measuring RCR in the Subhuman to Near-Human Ranges
I'm not fully convinced by the assumption behind that danger, though. There is no complete map/full description of the human brain; no human has the equivalent of their "source code" or "model weights" with which to start designing a successor. It seems plausible that we could equip sufficiently subhuman (in generality) agents with detailed descriptions/models of their own architectures, plus some inbuilt heuristics/algorithms for varying those designs to come up with new ones. We could select a few of the best candidate designs, train all of them to a similar extent, and evaluate. We could repeat the experiment iteratively, across many generations of agents.
We could probably extrapolate the lineages pretty far (we might be able to reach the near-human domain without the experiment becoming too risky), though there's a point on the capability curve at which we would want to stop such experiments. And I wouldn't be surprised if it turned out that the agents could reach superhuman ability at designing successors (able to improve their architectures faster than humans can) without reaching human generality across the full range of cognitive tasks.
(It may be wise not to test those assumptions if we do decide to run such an experiment.)
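For concreteness, here is a minimal sketch of the iterated select-train-evaluate loop described above. Every piece of it (the variation heuristic, the training/evaluation step, the toy scoring function) is a hypothetical stand-in rather than a real experimental protocol; what matters is the loop structure and the per-generation capability gains one would record as a direct RCR measurement.

```python
# A minimal sketch, under heavy assumptions, of the iterated design loop
# described above. propose_variants, train_and_evaluate, and the "width"
# knob are all hypothetical stand-ins, not a real protocol.
import random

def propose_variants(design: dict, n: int) -> list[dict]:
    """Stand-in for the agent's inbuilt design-variation heuristics:
    here, just randomly perturbing a single architectural knob."""
    return [{**design, "width": design["width"] + random.randint(-8, 8)}
            for _ in range(n)]

def train_and_evaluate(design: dict) -> float:
    """Stand-in for training a candidate to a fixed budget and scoring it.
    Toy score: capability peaks at some 'ideal' width, plus noise."""
    return -abs(design["width"] - 128) + random.random()

def run_generations(seed_design: dict, generations: int,
                    n_variants: int, keep: int = 3) -> list[float]:
    """Run the propose/train/evaluate/select loop, recording the best
    score per generation; successive differences give a crude
    per-generation RCR signal."""
    pool = [seed_design]
    best_scores = []
    for _ in range(generations):
        candidates = [v for d in pool for v in propose_variants(d, n_variants)]
        scored = sorted(candidates, key=train_and_evaluate, reverse=True)
        pool = scored[:keep]  # select the best candidate designs
        best_scores.append(train_and_evaluate(pool[0]))
    return best_scores

scores = run_generations({"width": 64}, generations=5, n_variants=8)
print([round(b - a, 2) for a, b in zip(scores, scores[1:])])  # per-generation gains
```

An actual experiment would replace the stand-ins with real training runs, and would build in the stopping criteria discussed above.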
Conclusions
Such empirical projects are far beyond the scope of this series (and my current research abilities). However, it's something I might attempt in a few years, after upskilling some more in AI/ML.
Recall that I called this “a rough draft of the first draft of one part of the nth post of what I hope to one day turn into a proper sequence”. There’s a lot of surrounding context that I haven’t gotten around to writing yet. And I do have a coherent narrative of where this all fits together in my broader project to investigate takeoff dynamics.
The formalisations aren’t useless; they serve to refine and sharpen thinking. Making things formal forces you to make explicit some things you’d left implicit.
Glad you added these empirical research directions! If I were you I’d prioritize these over the theoretical framework.
Theory is needed to interpret experimental results; only with a solid theoretical framework can useful empirical research be done.