In the brain, the same circuitry that is used to solve vision is used to solve most of the rest of cognition.
And in a laptop the same circuitry that is used to run a spreadsheet is used to play a video game.
Systems that are Turing-complete (in the limit of infinite resources) tend to have an independence between hardware and possibly many layers of software (program running on VM running on VM running on VM and so on). Things that look similar at some levels may differ greatly at other levels, and thus things that look simple at some levels can hide a lot of complexity at other levels.
Going from superhuman vision
Human-level (perhaps weakly superhuman) vision is achieved only in very specific tasks where large supervised datasets are available. This is not very surprising, since even traditional “hand-coded” computer vision could achieve superhuman performance in some narrow and clearly specified tasks.
Yes, but only because “ANN” is enormously broad (tensor/linear algebra program space), and basically includes all possible routes to AGI (all possible approximations of Bayesian inference).
Again, ANNs are Turing-complete, so in principle they include literally everything, but so does brute-force search over C programs.
In practice, if you try to generate C programs by brute-force search you will get stuck pretty fast, while ANNs trained with gradient descent empirically work well on various kinds of practical problems, but not on all the kinds of practical problems that humans are good at. How to make them work on those problems, if it is even efficiently possible, is a whole open research field.
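To make "stuck pretty fast" concrete, here is a rough sketch that counts how many candidate source strings a naive enumeration would have to consider. The alphabet size of 95 (printable ASCII) and the helper name `count_strings` are my own assumptions for illustration; real C source has more structure, but the exponential growth is the point.

```python
import math

def count_strings(alphabet_size, max_length):
    """Number of distinct strings of length 1..max_length over the alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))

# Assumption: candidate programs are drawn from the 95 printable ASCII characters.
ALPHABET_SIZE = 95

for max_length in (10, 50, 100):
    total = count_strings(ALPHABET_SIZE, max_length)
    # Report only the order of magnitude; the exact integers are enormous.
    print(f"length <= {max_length}: about 10^{int(math.log10(total))} candidates")
```

Almost all of these strings do not even compile, which is why unguided enumeration is not a practical search strategy even for very short programs.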
Bayesian methods excel at one shot learning
With lots of task-specific engineering.
Generalized DL + MCTS is—rather obviously—a practical approximation of universal intelligence like AIXI.
So are things like AIXI-tl, Hutter-search, Gödel machine, and so on. Yet I would not consider any of them as the “foundational aspect” of intelligence.
And in a laptop the same circuitry that is used to run a spreadsheet is used to play a video game.
Exactly, and this is a good analogy to illustrate my point. Discovering that the cortical circuitry is universal rather than task-specific (like an ASIC) was a key discovery.
Human-level (perhaps weakly superhuman) vision is achieved only in very specific tasks where large supervised datasets are available.
Note that I didn’t say we have solved vision to a superhuman level, but this claim is simply not true. Current SOTA nets can achieve human-level performance in at least some domains using modest amounts of unsupervised data combined with small amounts of supervised data.
Human vision builds on enormous amounts of unsupervised data—much larger than ImageNet. Learning in the brain is complex and multi-objective, but perhaps best described as self-supervised (unsupervised meta-learning of sub-objective functions which then can be used for supervised learning).
A five-year-old will have experienced perhaps 50 million seconds’ worth of video data. ImageNet consists of 1 million images, which is vaguely equivalent to 1 million seconds of video if we include 30x amplification for small translations/rotations.
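The back-of-envelope comparison above can be written out as a short sketch. The 50 million seconds, 1 million images, and 30x amplification are the figures from the comment; the 30 frames-per-second rate is my assumption, chosen because it is what makes 30 million augmented images come out to roughly 1 million seconds of video.

```python
# Figures taken from the comment above.
child_seconds = 50_000_000      # visual experience of a five-year-old, in seconds
imagenet_images = 1_000_000     # labeled images in ImageNet
augmentation_factor = 30        # amplification from small translations/rotations

# Assumption (not stated in the comment): video is counted at 30 frames per second.
FPS = 30

def equivalent_video_seconds(images, augmentation, fps=FPS):
    """Convert an augmented image count into seconds of video at the given frame rate."""
    return images * augmentation / fps

imagenet_seconds = equivalent_video_seconds(imagenet_images, augmentation_factor)
ratio = child_seconds / imagenet_seconds

print(f"ImageNet is roughly {imagenet_seconds:,.0f} seconds of video")
print(f"A five-year-old has seen about {ratio:.0f}x more")
```

Under these assumptions the child’s visual experience is about 50 times larger than augmented ImageNet, which is the gap the argument rests on.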
The brain’s vision system is about 100x larger than current ‘large’ vision ANNs. But if DeepMind decided to spend the cash on that and make it a huge one-off research priority, do you really doubt that they could build a superhuman general vision system that learns with a similar dataset and training duration?
So are things like AIXI-tl, Hutter-search, Gödel machine, and so on. Yet I would not consider any of them as the “foundational aspect” of intelligence.
The foundation of intelligence is just inference, simply because universal inference is sufficient to solve any other problem. AIXI is already simple, but you can make it even simpler by replacing the planning component with inference over high-expected-value actions, or even just inference over program space to learn approximate planning.
So it all boils down to efficient inference. The exciting new progress in DL, for me at least, is in understanding how successful empirical optimization techniques can be derived as approximate inference update schemes with various types of priors. This is what I referred to as new and upcoming “Bayesian methods”: Bayesian-grounded DL.