Even Further Future Directions
Some stuff I might like to do (much) later on. I would like to eventually bridge this theoretical framework to empirical work with neural networks. I'll briefly describe two approaches to doing that which I'm interested in.
Estimating RCR From ML History
We could try to estimate the nature and/or behaviour of RCR for particular ML architectures by, e.g., looking at progress across assorted performance benchmarks (and perhaps the computational resources [data, FLOPs, parameter count, etc.] required to reach each benchmark), and comparing across the various architectural and algorithmic lineages of ML models. We'd probably need to compile a comprehensive genealogy of ML architectures and algorithms in pursuit of this approach.
This estimation may be necessary, because we may be unable to measure RCR across an agent’s genealogy before it is too late (if e.g. the design of more capable successors is something that agents can only do after crossing the human barrier).
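To make the shape of this approach concrete, here is a minimal sketch of the kind of curve fitting it would involve, assuming we already had benchmark scores and training-compute figures for a lineage of models. The records, names, and numbers below are hypothetical placeholders, and fitting returns-to-compute within a single lineage is only a crude proxy for RCR proper; this is meant to illustrate the estimation idea, not to be a definitive method.

```python
import numpy as np

# Hypothetical records: (lineage, generation index, training FLOPs, benchmark score).
# These values are illustrative placeholders, not real measurements.
records = [
    ("lineage_A", 0, 3e18, 0.42),
    ("lineage_A", 1, 3e20, 0.58),
    ("lineage_A", 2, 3e22, 0.71),
    ("lineage_A", 3, 3e24, 0.80),
]

def fit_returns_curve(records, lineage):
    """Fit a simple log-linear curve: score ~ a * log10(FLOPs) + b.

    The slope `a` is a crude stand-in for returns to additional computational
    investment within one architectural lineage; comparing slopes across
    lineages would get at the algorithmic-progress component of RCR."""
    points = [(np.log10(flops), score)
              for (name, _, flops, score) in records if name == lineage]
    xs, ys = map(np.array, zip(*points))
    a, b = np.polyfit(xs, ys, deg=1)
    return a, b

slope, intercept = fit_returns_curve(records, "lineage_A")
print(f"~{slope:.3f} benchmark score per order of magnitude of training compute")
```

In practice one would want better-behaved functional forms (e.g. sigmoidal fits against benchmark ceilings) and careful normalisation across benchmarks, but the genealogy-plus-curve-fitting structure would look roughly like this.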
Directly Measuring RCR in the Subhuman to Near Human Ranges
I am not fully convinced by the assumption behind that danger, though. There is no complete map/full description of the human brain: no human has the equivalent of their “source code” or “model weights” with which to start designing a successor. It seems plausible that we could equip sufficiently subhuman (in generality) agents with detailed descriptions/models of their own architectures, and some inbuilt heuristics/algorithms for how they might vary those designs to come up with new ones. We could select a few of the best candidate designs, train all of them to a similar extent, and evaluate them. We could repeat the experiment iteratively, across many generations of agents.
We could probably extrapolate the lineages pretty far (we might be able to reach the near-human domain without the experiment becoming too risky), though there is a point on the capability curve at which we would want to stop such experiments. And I wouldn’t be surprised if it turned out that the agents could reach superhuman ability in designing successors (i.e. improve their architectures faster than humans can) without reaching human generality across the full range of cognitive tasks.
(It may be wise not to test those assumptions if we did decide to run such an experiment.)
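For concreteness, here is a minimal sketch of the iterated loop described above. `propose_variants`, `train`, and `evaluate` are hypothetical placeholders for whatever mechanisms such an experiment would actually use (they are not real APIs); the point is only the structure: each generation of agents proposes variations on its own design, the candidates are trained to a comparable extent, and the best few seed the next generation.

```python
def measure_rcr(seed_design, propose_variants, train, evaluate,
                n_generations=10, top_k=2):
    """Iterate the design / train / evaluate loop and record the best
    capability score per generation (a crude empirical RCR curve).

    propose_variants(design) -> list of candidate designs (hypothetical)
    train(design)            -> a trained agent (hypothetical)
    evaluate(agent)          -> a scalar capability score (hypothetical)
    """
    population = [seed_design]
    capability_by_generation = []

    for generation in range(n_generations):
        # Each current agent proposes variations on its own (fully described) design.
        candidates = [variant
                      for design in population
                      for variant in propose_variants(design)]

        # Train every candidate to a comparable extent, then evaluate it.
        scored = sorted(((evaluate(train(c)), c) for c in candidates),
                        key=lambda pair: pair[0], reverse=True)

        # The best few designs seed the next generation.
        population = [design for _, design in scored[:top_k]]
        capability_by_generation.append(scored[0][0])

        # A conservative stopping rule belongs here, set well below the
        # capability level at which the experiment would become risky.

    return capability_by_generation
```

Plotting `capability_by_generation` against the generation index would give a direct, if narrow, empirical estimate of RCR over the capability range the experiment covers.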
Conclusions
Such empirical projects are far beyond the scope of this series (and my current research abilities). However, they are something I might attempt in a few years, after upskilling some more in AI/ML.
The thing you are trying to study (“returns on cognitive reinvestment”) is probably one of the hardest things in the world to understand scientifically. It requires understanding both the capabilities of specific self-modifying agents and the complexity of the world. It also depends on what problem you are focusing on: the shape of the curve may be very different for chess vs. something like curing disease. Why? Because I can simulate chess on a computer, throwing more compute at it yields some returns. I can’t simulate human biology in a computer; we have to actually have people in labs doing complicated experiments just to understand one tiny bit of human biology. So having more compute / cognitive power in any given agent isn’t necessarily going to speed things along; you also need a way of manipulating things in labs (either humans or robots doing lots of experiments). Maybe in the future an AI could read massive numbers of scientific papers and synthesize them into new insights, but precisely what sort of “cognitive engine” is required to do that is also very controversial (could GPT-N do it?).
Are you familiar with the debate about Bloom et al. and whether ideas are getting harder to find? (https://guzey.com/economics/bloom/, https://www.cold-takes.com/why-it-matters-if-ideas-get-harder-to-find/) That’s relevant to predicting take-off.
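For readers who haven't followed that debate: the claim at stake is usually formalised with an idea production function along the lines of the one below (a Jones-style semi-endogenous growth form; the exact functional form here is an illustrative assumption, not a quotation from Bloom et al.).

```latex
% A_t = stock of ideas/technology, S_t = effective number of researchers.
\[
  \frac{\dot{A}_t}{A_t} \;=\; \alpha \, \frac{S_t}{A_t^{\beta}}, \qquad \beta > 0.
\]
```

“Ideas are getting harder to find” corresponds to \(\beta > 0\): as the idea stock grows, constant research effort buys progressively less proportional growth. The analogous question for take-off is whether an AI's returns on cognitive reinvestment decay in the same way as the easy improvements are exhausted.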
The other post I always point people to is this one by Chollet.
I don’t necessarily agree with it but I found it stimulating and helpful for understanding some of the complexities here.
So basically, this is a really complex thing; throwing some definitions and math at it isn’t going to be very useful, I’m sorry to say. Throwing math and definitions at stuff is easy. Modeling data by fitting functions is easy. Neither is very useful in terms of actually being able to predict in novel situations (i.e. extrapolation/generalization), which is what we need in order to predict AI take-off dynamics. Actually understanding things mechanistically and coming up with explanatory theories that can withstand criticism and repeated experimental tests is very hard. That’s why people typically break hard questions/problems down into easier sub-questions/problems.
I disagree. The theoretical framework is a first step that allows us to reason more clearly about the topic, and I expect to eventually bridge the gap between the theoretical and the empirical. In fact, I just added some concrete empirical research directions that I think could be pursued later on (see the “Even Further Future Directions” section above).
Recall that I called this “a rough draft of the first draft of one part of the nth post of what I hope to one day turn into a proper sequence”. There’s a lot of surrounding context that I haven’t gotten around to writing yet. And I do have a coherent narrative of where this all fits together in my broader project to investigate takeoff dynamics.
The formalisations aren’t useless; they serve to refine and sharpen thinking. Making things formal forces you to make explicit some things you’d left implicit.
Glad you added these empirical research directions! If I were you I’d prioritize these over the theoretical framework.
Theory is needed to interpret the results of experiment. Only with a solid theoretical framework can useful empirical research be done.