Scepticism of Simple “General Intelligence”
Introduction
I’m fundamentally sceptical that general intelligence is simple.
By “simple”, I mostly mean “non-composite”. General intelligence would be simple if there were universal/general optimisers for real-world problem sets that weren’t ensembles/compositions of many distinct narrow optimisers.
AIXI and its approximations are in this sense “not simple” (even if their Kolmogorov complexity might appear to be low): AIXI’s Solomonoff prior is a weighted mixture over all computable hypotheses, so it is implicitly a vast ensemble.
Thus, I’m sceptical that efficient cross-domain optimisation is feasible, other than by gluing a bunch of narrow optimisers together.
General Intelligence in Humans
Our brain is an ensemble of narrow optimisers: some inherited (e.g. circuits for face recognition, object recognition, navigation, text recognition, place recognition, etc.) and some dynamically generated, varying by individual (circuits for playing chess, musical instruments, soccer, typing, etc.; neuroplasticity more generally).
We probably do have some general meta-machinery as a higher layer (I’d guess for things like abstraction, planning, learning new tasks/rewiring our neural circuits, inference, synthesising concepts, pattern recognition, etc.).
But we fundamentally learn/become good at new tasks by developing specialised neural circuits to perform those tasks, not by leveraging a pre-existing general optimiser.
(This is a very important difference.)
We already self-modify (just not in a conscious manner), and our ability to do general intelligence at all depends strongly on this self-modification ability.
Our general optimiser is just a system/procedure for dynamically generating narrow optimisers to fit individual tasks.
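The claim above can be gestured at with a toy sketch (all names here are hypothetical, not a real cognitive architecture): the only “general” component is a meta-layer whose job is to generate and cache a narrow specialist per task.

```python
# Toy sketch: the only "general" part is the meta-layer that
# builds and caches a narrow specialist for each task it meets.
# All class and task names are illustrative assumptions.

class NarrowOptimiser:
    """A specialist fitted to one task: remembers the best input seen."""
    def __init__(self, task):
        self.task = task
        self.best_x, self.best_score = None, float("-inf")

    def step(self, x):
        score = self.task(x)
        if score > self.best_score:
            self.best_x, self.best_score = x, score

class MetaOptimiser:
    """The 'general optimiser': a procedure for generating specialists."""
    def __init__(self):
        self.specialists = {}          # one narrow optimiser per task

    def solve(self, name, task, candidates):
        spec = self.specialists.setdefault(name, NarrowOptimiser(task))
        for x in candidates:           # crude search; a real system
            spec.step(x)               # would adapt the search itself
        return spec.best_x

meta = MetaOptimiser()
best = meta.solve("parabola", lambda x: -(x - 3) ** 2, range(10))
print(best)  # the specialist for this task finds x = 3
```

Nothing in `MetaOptimiser` knows anything about parabolas; all task-specific competence lives in the generated specialist, which is the ensemble picture in miniature.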
Two Models of General Intelligence
This is an oversimplification, but to help gesture at what I’m talking about, I’d like to consider two distinct ways in which general intelligence might manifest.
A. Simple Intelligence
There exists a class of non-compositional optimisation algorithms that are universal optimisers for the domains that actually manifest in the real world (these algorithms need not be universal for arbitrary domains).
General intelligence is implemented by universal (non-composite) optimisation algorithms.
B. Ensemble Intelligence
No such universal optimisers exist for real-world domains. General intelligence is instead implemented as an ensemble of many narrow optimisers, plus meta-machinery for generating, selecting, and coordinating them.
General Intelligence and No Free Lunch Theorems
This suggests that reality is perhaps not so regular that we easily escape the No Free Lunch theorems. The more the NFL theorems bind as a practical constraint, the more you’d expect general intelligence to look like an ensemble of narrow optimisers rather than a simple (non-composite) universal optimiser.
People have dismissed the No Free Lunch theorems by pointing out that reality is not a uniformly random distribution over problems: there is intrinsic order and simplicity, which is why humans could function as general optimisers in the first place.
But the ensemble-like nature of human intelligence suggests that reality is not so simple and ordered that a single algorithm can do efficient cross-domain optimisation.
We have an algorithm for generating algorithms. That is itself an algorithm, but it suggests that it’s not a simple one.
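The NFL intuition can be made concrete with a tiny exhaustive check (a toy model of my own construction, not a proof of the general theorems): over all possible objective functions on a small domain, two different fixed search strategies average out to exactly the same performance.

```python
# Toy No-Free-Lunch check (pure Python, exhaustive).
# Domain {0,1,2}, codomain {0,1}: enumerate all 8 possible objective
# functions and compare two deterministic search orders by the best
# value found after two evaluations. Averaged over ALL functions,
# every such strategy scores the same.

from itertools import product

points = [0, 1, 2]
functions = [dict(zip(points, values))
             for values in product([0, 1], repeat=3)]

def avg_best_of_two(order):
    """Mean best value after evaluating the first two points in `order`,
    averaged over every possible objective function."""
    total = sum(max(f[order[0]], f[order[1]]) for f in functions)
    return total / len(functions)

score_a = avg_best_of_two([0, 1, 2])
score_b = avg_best_of_two([2, 0, 1])
print(score_a, score_b)  # identical: 0.75 0.75
```

Any advantage one strategy has on some functions is exactly cancelled on others; an optimiser only pulls ahead if the actual distribution of problems is narrower than “all possible functions”.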
Conclusion
It seems to me that there is no simple general optimiser in humans.
Perhaps none exists in principle.
The above has been my main takeaway from learning about how cognition works in humans (I’m still learning, but it seems to me that further learning would deepen this takeaway rather than completely change it).
We’re actually an ensemble of many narrow systems. Some are inherited because they were very useful in our evolutionary history.
But a lot are dynamically generated and regenerated. Our brain has the ability to rewire itself, creating and modifying its own neural circuitry.
We constantly self-modify our cognitive architectures (just without any conscious control over it). Maybe our meta-machinery for coordinating and generating object-level machinery remains intact?
This changes a lot about what I think is possible for intelligence, and about what “strongly superhuman intelligence” might look like.
To illustrate how this matters, consider two scenarios:
A. There are universal, non-composite algorithms for predicting stimuli in the real world. Becoming better at prediction transfers across all domains.
B. There are narrow algorithms good at predicting stimuli in distinct domains. Becoming a good predictor in one domain doesn’t easily transfer to other domains.
Human intelligence being an ensemble makes it seem like we live in a world that looks more like B than like A.
Predicting diverse stimuli involves composing many narrow algorithms. A neural circuit specialised for predicting stimuli in one domain doesn’t transfer easily to predicting in new domains.
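A minimal sketch of scenario B (the predictors and streams here are invented for illustration): each narrow predictor is perfect on the kind of stream it was built for and useless on the other, so predictive skill doesn’t transfer, and a general system would have to compose such specialists.

```python
# Toy illustration of scenario B: two narrow next-value predictors,
# each accurate only in its own domain. Both predictors and test
# streams are hypothetical examples, not models of real cognition.

def periodic_predictor(history):
    """Narrow predictor for period-2 streams: echo the value 2 steps back."""
    return history[-2] if len(history) >= 2 else 0

def trend_predictor(history):
    """Narrow predictor for linear streams: extrapolate the last step."""
    if len(history) >= 2:
        return history[-1] + (history[-1] - history[-2])
    return 0

def accuracy(predictor, stream):
    """Fraction of exact next-value predictions over the stream."""
    hits = sum(predictor(stream[:i]) == stream[i]
               for i in range(2, len(stream)))
    return hits / (len(stream) - 2)

periodic_stream = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
linear_stream   = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Each narrow predictor excels at home and fails abroad:
print(accuracy(periodic_predictor, periodic_stream))  # 1.0
print(accuracy(periodic_predictor, linear_stream))    # 0.0
print(accuracy(trend_predictor, linear_stream))       # 1.0
print(accuracy(trend_predictor, periodic_stream))     # 0.0
```

In a world like B, covering both streams requires meta-machinery that routes each domain to the right specialist, which is the ensemble picture argued for above.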