Right. So Boolean circuits are a better analogy than Turing machines.
They are of course equivalent in theory, but in practice directly searching through a Boolean circuit space is much wiser than searching through a program space. Searching through analog/algebraic circuit space is even better, because you can take advantage of fused multiply-adds (fmads) directly instead of having to spend enormous circuit complexity emulating them with Boolean gates. Neural nets are even better than that, because they enforce a mostly continuous/differentiable energy landscape, which helps inference/optimization.
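To make that last point concrete, here is a toy sketch (the architecture, sizes, and numbers are arbitrary, purely for illustration): a tiny two-layer net fit to XOR by plain gradient descent. Because the loss is a mostly smooth function of the real-valued weights, local gradients guide the search; the corresponding Boolean circuit space offers no such signal and has to be searched combinatorially.

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([[0.], [1.], [1.], [0.]])        # XOR targets

    W1 = rng.normal(scale=1.0, size=(2, 4))       # hidden-layer weights
    b1 = np.zeros(4)
    W2 = rng.normal(scale=1.0, size=(4, 1))       # output-layer weights
    b2 = np.zeros(1)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    lr = 0.5
    for step in range(20000):
        # forward pass: evaluate the continuous "circuit"
        h = sigmoid(X @ W1 + b1)
        p = sigmoid(h @ W2 + b2)
        # backward pass: gradients of the squared error w.r.t. every weight
        dp = (p - y) * p * (1 - p)
        dW2, db2 = h.T @ dp, dp.sum(axis=0)
        dh = (dp @ W2.T) * h * (1 - h)
        dW1, db1 = X.T @ dh, dh.sum(axis=0)
        # follow the local gradient, the guidance a discrete circuit search lacks
        for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
            param -= lr * grad

    h = sigmoid(X @ W1 + b1)
    print(np.round(sigmoid(h @ W2 + b2), 2))      # typically close to [[0],[1],[1],[0]]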
I’m sorry, what is deep factoring? A reference perhaps?
It’s the general idea that you can reuse subcomputations amongst models and layers. Solomonoff induction is hopelessly impractical for a number of reasons, but one is this: it treats every function/model as entirely distinct. So if you have, say, one high-level model which has developed a good cat detector, that detector isn’t shared with the other models. Deep nets (of various forms) automatically share submodel components AND subcomputations/subexpressions amongst those submodels. That massively speeds up the search. That is deep factoring.
All the successful multi-layer models use deep factoring to some degree. The paper on Sum-Product Networks explains the general idea pretty well; the toy sketch below illustrates the kind of sharing involved.
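Here is that sharing as a toy sketch (the names and sizes are made up for illustration): two task heads hang off one shared trunk, so the trunk’s features are computed once and reused, and anything the trunk learns benefits both tasks.

    import numpy as np

    rng = np.random.default_rng(0)

    def relu(z):
        return np.maximum(z, 0.0)

    x = rng.normal(size=(1, 64))                      # one input vector

    W_trunk = rng.normal(scale=0.1, size=(64, 32))    # shared feature extractor
    W_cat   = rng.normal(scale=0.1, size=(32, 1))     # head 1: "cat?" score
    W_dog   = rng.normal(scale=0.1, size=(32, 1))     # head 2: "dog?" score

    features  = relu(x @ W_trunk)    # computed once, shared by both heads
    cat_score = features @ W_cat     # head 1 reuses the shared sub-computation
    dog_score = features @ W_dog     # head 2 reuses the same sub-computation

    # Two fully separate models would each maintain their own 64x32 feature
    # map, recomputing and re-learning everything from scratch: no sharing of
    # submodels or subexpressions between them.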
Good point! Nevertheless, it seems very dubious to me that the human brain can learn to do anything within the limits of its computing power. For example, why can’t I learn to look at a page full of arithmetic exercises and solve all of them in parallel?
There are a lot of reasons. First, due to nonlinear foveation your visual system can only read/parse a couple of words/symbols per fixation (between saccades)—only those right in the narrow, high-resolution center of the visual field, the fovea. So it takes a number of clock cycles or steps to scan the entire page, and your brain has only limited working memory to hold the results in.
Second, the bigger problem is that even if you already know how to solve a math problem, just parsing many math problems requires a number of steps, and so does actually solving them: even if you know the ideal algorithm that uses the minimal number of steps, that minimal number can still be quite large.
Many interesting problems still require a number of serial steps to solve, even with an infinite parallel machine. Sorting is one simple example.
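For instance (a toy sketch, not a formal argument): odd-even transposition sort. Every compare-and-swap inside one round is independent and could run on its own processor, yet the rounds themselves must run one after another, about n of them for n items. Cleverer comparison networks reduce the depth to O(log^2 n) or even O(log n), but no comparison network has constant depth.

    import numpy as np

    rng = np.random.default_rng(0)
    a = rng.permutation(16)          # 16 items in scrambled order
    n = len(a)

    for r in range(n):               # n serial rounds; they cannot be collapsed
        start = r % 2                # alternate odd/even phases
        for i in range(start, n - 1, 2):
            # every compare-and-swap in this inner loop is independent of the
            # others in the same round, so all of them could run in parallel
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]

    print(a)                         # sorted after n rounds, parallel or not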
...Neural nets are even better than that, because they enforce a mostly continuous/differentiable energy landscape, which helps inference/optimization.
I wonder whether this is a general property, or whether the success of continuous methods is limited to problems with natural continuous models, like vision.
Deep nets (of various forms) automatically share submodel components AND subcomputations/subexpressions amongst those submodels.
Yes, this is probably important.
First, due to nonlinear foveation your visual system can only read/parse a couple of words/symbols per fixation (between saccades)—only those right in the narrow, high-resolution center of the visual field, the fovea. So it takes a number of clock cycles or steps to scan the entire page, and your brain has only limited working memory to hold the results in.
Scanning the page is clearly not the bottleneck: I can read the page much faster than I can solve the exercises. “Limited working memory” sounds like a claim that higher cognition has far fewer computing resources than low-level tasks. Clearly visual processing requires much more “working memory” than solving a couple dozen arithmetic exercises. But if we accept this constraint, does the brain still qualify as a ULM? It seems to me that if there is a deficiency in the brain’s architecture that prevents higher cognition from enjoying the brain’s full power, fixing this deficiency definitely counts as an “architectural innovation”.