I thought the point of the modularity hypothesis is that the brain only approximates a universal learning machine and has to be gerrymandered and trained to do so?
I’m not sure what you mean by gerrymandered. I summarized the modularity hypothesis in the beginning to differentiate it from the ULM hypothesis. There is a huge range of views in this space, so I reduced them to exemplars of two important viewpoint clusters.
The specific key difference is the extent to which complex mental algorithms are learned vs innate.
If the brain were naturally a universal learner, then surely we wouldn’t have to learn universal learning (e.g. we wouldn’t have to learn to overcome cognitive biases, Bayesian reasoning wouldn’t be a recent discovery, etc.)?
You certainly don’t need to learn how to overcome cognitive biases in order to learn (this should be obvious). Knowledge of the brain’s limitations could be useful, but probably only in the context of a high-level understanding of how the brain works.
In regard to Bayesian reasoning, the brain has a huge number of parallel systems and computations going on at once, many of which implement efficient approximate Bayesian inference.
Verbal Bayesian reasoning is just a subset of verbal mathematical reasoning: mapping sentences to equations, solving, and mapping back to sentences. It’s a specific complex ability that uses a number of brain regions, and it’s something you need to learn for the same reasons you need to learn multiplication. The brain does tons of analog multiplications every second, but that doesn’t mean you have an automatic innate ability to do verbal math, just as you don’t have an automatic innate ability to do much of anything.
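To make the “map sentences to equations, solve, and map back” step concrete, here is a minimal worked example in Python; the condition/test numbers are hypothetical, picked only to illustrate the mapping:

```python
# Hypothetical word problem: a condition affects 1% of people, a test
# detects 90% of true cases and false-alarms on 5% of healthy people.
# Question in words: "What is the chance of the condition given a positive test?"
prior = 0.01           # P(condition)
sensitivity = 0.90     # P(positive | condition)
false_positive = 0.05  # P(positive | no condition)

# Map the sentences to Bayes' rule, solve, then map the number back to words.
p_positive = sensitivity * prior + false_positive * (1 - prior)
posterior = sensitivity * prior / p_positive

print(f"P(condition | positive test) = {posterior:.3f}")  # about 0.154
```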
The system seems too gappy and glitchy, too full of quick judgement and prejudice, to have been designed as a universal learner from the ground up.
One of the main points I make in the article is that universal learning machines are a very general thing that, in their simplest form, can be specified in a small number of bits, just like a Turing machine. So it’s a sort of obvious design that evolution would find.
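As a loose illustration of that point (a sketch of mine, not the article’s ULM architecture), the entire specification of a generic learner can be a few lines, even though what it ends up learning depends only on the data you feed it. The linear model and the random data below are placeholder assumptions chosen just for the sketch:

```python
import numpy as np

def learn(X, y, steps=2000, lr=0.05):
    """Fit a linear model to whatever (X, y) it is fed, by gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

# The same few lines adapt to any task expressible as numeric features -> target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=200)
print(learn(X, y))  # roughly recovers [2.0, -1.0, 0.5]
```

The point of the sketch is only that the description of the learner is short relative to what it can learn; actual universality needs a far more expressive model, but not a much longer specification.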
What I meant is that you have sub-systems dedicated to (and originally evolved to perform) specific concrete tasks, and shifting coalitions of them (or rather shifting coalitions of their abstract core algorithms) are leveraged to work together to approximate a universal learning machine.
IOW any given specific subsystem (e.g. “recognizing a red spot in a patch of green”) has some abstract algorithm at its core which is then drawn upon at need by an organizing principle which utilizes it (plus other algorithms drawn from other task-specific brain gadgets) for more universal learning tasks.
That was my sketchy understanding of how it works from evol psych and things like Dennett’s books, Pinker, etc.
Furthermore, I thought the rationale of this explanation was that it’s hard to see how a universal learning machine can get off the ground evolutionarily (it’s going to be energetically expensive, not fast enough, etc.) whereas task-specific gadgets are easier to evolve (“need to know” principle), and it’s easier to later get an approximation of a universal machine off the ground on the back of shifting coalitions of them.
Ah ok, your gerrymandering analogy now makes sense.
That was my sketchy understanding of how it works from evol psych and things like Dennett’s books, Pinker, etc.
I think that’s a good summary of the evolved modularity hypothesis. It turns out that we can actually look into the brain and test that hypothesis. Those tests were done, and lo and behold, the brain doesn’t work that way. The universal learning hypothesis emerged as the new theory to explain the new neuroscience data from the last decade or so.
So basically this is what the article is all about. You said earlier you skimmed it, so perhaps I need a better abstract or summary at the top, as oge suggested.
Furthermore, I thought the rationale of this explanation was that it’s hard to see how a universal learning machine can get off the ground evolutionarily (it’s going to be energetically expensive, not fast enough, etc.) whereas task-specific gadgets are easier to evolve (“need to know” principle),
This is a pretty good-sounding rationale. It’s also probably wrong. It turns out a small ULM is relatively easy to specify, and it’s also completely compatible with innate task-specific gadgetry. In other words, the universal learning machinery has very few drawbacks. All vertebrates share a similar core architecture based on the basal ganglia; in large-brained mammals, the general-purpose coprocessors (neocortex, cerebellum) are simply expanded more than the other structures.
In particular, it looks like the brainstem has a bunch of old innate circuitry that the cortex and BG learn how to control (the BG does not just control the cortex), but I didn’t have time to get into the brainstem within the scope of this article.
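As a very loose toy analogy for that arrangement (my sketch, not anything from the article or from actual neuroscience), you can picture a fixed set of “innate” routines plus a small value learner that discovers which routine to trigger in which context. The contexts, routines, and payoffs below are entirely made up:

```python
import random

# A handful of fixed "innate" routines (brainstem stand-ins) and a small
# learner (cortex/BG stand-in) that learns which routine to trigger when.
INNATE_ROUTINES = ["freeze", "flee", "approach"]
CONTEXTS = ["predator_near", "predator_far", "food"]

def reward(context, routine):
    # Made-up payoff table standing in for environmental feedback.
    best = {"predator_near": "freeze", "predator_far": "flee", "food": "approach"}
    return 1.0 if routine == best[context] else 0.0

# Simple epsilon-greedy value learning over (context, routine) pairs.
values = {(c, r): 0.0 for c in CONTEXTS for r in INNATE_ROUTINES}
counts = {(c, r): 0 for c in CONTEXTS for r in INNATE_ROUTINES}

random.seed(0)
for _ in range(3000):
    context = random.choice(CONTEXTS)
    if random.random() < 0.1:   # explore
        routine = random.choice(INNATE_ROUTINES)
    else:                       # exploit current value estimates
        routine = max(INNATE_ROUTINES, key=lambda r: values[(context, r)])
    payoff = reward(context, routine)
    counts[(context, routine)] += 1
    values[(context, routine)] += (payoff - values[(context, routine)]) / counts[(context, routine)]

# The learner ends up routing each context to the right fixed routine.
for c in CONTEXTS:
    print(c, "->", max(INNATE_ROUTINES, key=lambda r: values[(c, r)]))
```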
Great stuff, thanks! I’ll dig into the article more.