This assumes that task-specific representations are hardwired in by evolution, which is mostly true only for the old brain. The cortex (along with the cerebellum) is essentially the biological equivalent of a large machine-learning coprocessor, and at birth its connections are largely random, much like those of a modern ML system such as an ANN. The cortex appears to use the same general learning algorithms to learn everything from vision to physics. This is the ‘one learning algorithm’ hypothesis, which has considerable support at this point.
I agree that there seems to be good evidence for the ‘one learning algorithm’ hypothesis… but there also seems to be reasonable evidence for modules that are specialized for particular tasks that were evolutionarily useful; the most obvious example would be the extent to which we seem to have specialized reasoning capacity for modeling and interacting with other people, a capacity which is impaired to varying degrees in people on the autistic spectrum.
Even if one does assume that the cortex used the same learning algorithms for literally everything, one would still expect the parameters and properties of those algorithms to be at least partially genetically tuned towards the kinds of learning tasks that were most useful in the EEA (though of course the environment should be expected to carry out further tuning of said parameters). I don’t think that the brain learning everything using the same algorithms would disprove the notion that there could exist alternative algorithms better optimized for learning e.g. abstract mathematics, and which could also employ a representation that was better optimized for abstract math, at the cost of being worse at the more general kind of learning that was most useful in the EEA.
I agree that there seems to be good evidence for the ‘one learning algorithm’ hypothesis… but there also seems to be reasonable evidence for modules that are specialized for particular tasks that were evolutionarily useful
The paper you linked to is long-winded. I jumped to the section titled “Do Modules Require Their Own Genes?”. I skimmed a bit and concluded that the authors were missing huge tracts of key recent knowledge from computational and developmental neuroscience and machine learning, and as a result they are fumbling in the dark.
the most obvious example would be the extent to which we seem to have specialized reasoning capacity for modeling and interacting with other people, a capacity which is impaired to varying degrees in people on the autistic spectrum.
Learning will automatically develop any number of specialized capabilities just as a natural organic process of interacting with the environment. Machine learning provides us with concrete specific knowledge of how this process actually works. The simplest explanation for autism inevitably involves disruptions to learning machinery, not disruptions to preconfigured “people interaction modules”.
Again to reiterate—obviously there are preconfigured modules—it is just that they necessarily form a tiny portion of the total circuitry.
Even if one does assume that the cortex used the same learning algorithms for literally everything, one would still expect the parameters and properties of those algorithms to be at least partially genetically tuned towards the kinds of learning tasks that were most useful in the EEA (though of course the environment should be expected to carry out further tuning of the said parameters).
Perhaps, perhaps not. Certainly genetics specifies a prior over model space. You can think of evolution wanting to specify as much as it can, but with only a tiny amount of code. So it specifies the brain in a sort of ultra-compressed hierarchical fashion. The rough number of main modules, neuron counts per module, and gross module connectivity are roughly pre-specified, and then within each module there are just a few types of macrocircuits, each of which is composed of a few types of repeating microcircuits, and so on.
Using machine learning as an analogy, to solve a specific problem we typically come up with a general architecture that forms a prior over model space that we believe is well adapted to the problem. Then we use a standard optimization engine—like SGD—to handle the inference/learning given that model. The learning algorithms are very general purpose and cross domain.
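To make the analogy concrete, here is a deliberately toy sketch of that division of labor: two different feature maps (two “priors” over model space) are fit to the same data with an identical, domain-agnostic SGD update rule. The helper names (`sgd_fit`, `mse`) and the specific task are my own illustration, not anything from the thread.

```python
# The "prior" is the choice of feature map / architecture; the SGD update
# rule below is identical across models and knows nothing about the domain.
import random

def sgd_fit(features, xs, ys, lr=0.05, steps=2000):
    """Generic SGD on squared error over a linear-in-features model."""
    w = [0.0] * len(features(xs[0]))
    for _ in range(steps):
        i = random.randrange(len(xs))
        f = features(xs[i])
        err = sum(wj * fj for wj, fj in zip(w, f)) - ys[i]
        w = [wj - lr * err * fj for wj, fj in zip(w, f)]
    return w

def mse(features, w, xs, ys):
    return sum((sum(wj * fj for wj, fj in zip(w, features(x))) - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)

xs = [x / 10 for x in range(-20, 21)]
ys = [2 * x * x + 1 for x in xs]        # true target is quadratic

linear = lambda x: [1.0, x]             # prior: straight lines
quadratic = lambda x: [1.0, x, x * x]   # prior: parabolas

random.seed(0)
w_lin = sgd_fit(linear, xs, ys)
w_quad = sgd_fit(quadratic, xs, ys)

print(mse(linear, w_lin, xs, ys))       # large: this prior can't express the target
print(mse(quadratic, w_quad, xs, ys))   # small: better-matched prior, same optimizer
```

The point being that the performance difference comes entirely from the choice of prior; the optimization engine is untouched between the two runs.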
I don’t think that the brain learning everything using the same algorithms would disprove the notion that there could exist alternative algorithms better optimized for learning e.g. abstract mathematics, and which could also employ a representation that was better optimized for abstract math, at the cost of being worse at more general learning of the type most useful in the EEA.
The distinction between the ‘model prior’ and the ‘learning algorithm’ is not always so clear cut, and some interesting successes in the field of metalearning suggest that there indeed exist highly effective specialized learning algorithms for at least some domains.
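A cartoon of how the prior/algorithm line blurs: in the sketch below (my own illustration, with a grid search standing in for gradient-based metalearning), a hyperparameter of the inner learning rule itself, its learning rate, is tuned on a distribution of tasks. The tuned rule is part prior, part algorithm.

```python
# Cartoon of metalearning: the inner SGD rule has a hyperparameter (its
# learning rate) that an outer loop tunes on a distribution of tasks.
import random

def inner_sgd(a, lr, steps=5):
    """Fit w in y = w*x to the task y = a*x with a few SGD steps."""
    w = 0.0
    for _ in range(steps):
        x = random.uniform(-1, 1)
        w -= lr * (w * x - a * x) * x
    return w

def meta_loss(lr, tasks, trials=200):
    """Average post-adaptation error of the inner rule across tasks."""
    total = 0.0
    for _ in range(trials):
        a = random.choice(tasks)
        total += (inner_sgd(a, lr) - a) ** 2
    return total / trials

random.seed(1)
tasks = [random.uniform(-3, 3) for _ in range(20)]

# "Outer loop": pick the inner learning rate that adapts fastest on the
# task distribution (grid search instead of meta-gradients, for brevity).
candidates = [0.01, 0.1, 0.5, 1.0, 2.0]
best_lr = min(candidates, key=lambda lr: meta_loss(lr, tasks))
print(best_lr)
```

The meta-learned rate is specialized to this task family; on a very different family it could be a poor choice, mirroring the trade-off described above between EEA-tuned and math-tuned learning.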
one would still expect the parameters and properties of those algorithms to be at least partially genetically tuned towards the kinds of learning tasks that were most useful in the EEA
Compare jacob_cannell’s earlier point that
obviously for any set of optimization criteria, constraints (including computational), and dataset there naturally can only ever be a single optimal solution (emphasis added)
Do we know or can we reasonably infer what those optimization criteria were like, so that we can implement them into our AI? If not, how likely and by how much would we expect the optimal solution to change?