Thanks for commenting!
Regarding asymptotic vs. practical complexity: you are absolutely right, it is entirely possible that AGI is of exponential complexity and yet human-level intelligence is still feasible given practically available resources. A definite answer would require much more work than the handwaving I’m doing here, but I still think my argument is important for assessing the likelihood of different scenarios.
None of my arguments regarding AGI and NP apply to narrow AI. Narrow AIs don’t compute approximate Solomonoff induction in any sense, don’t evaluate logical uncertainties and are not “magic wands”: this is why they’re narrow! On the other hand, humans seem to be very good at finding patterns (which is very close to Solomonoff induction), seem to reason easily about logical propositions as uncertain, and solve problems in an amazingly large variety of domains (compared to narrow AIs and animals).
The thing is, it’s not clear to me that any part of the human brain carries out anything theoretically similar to Solomonoff induction. I know that AIXI does, but the human brain is clearly not AIXI, otherwise it would be much smarter than it actually is (AIXIt is provably smarter than the human brain for general problem solving, so it’s clear that the human brain must be a narrow intelligence).
Not sure this logic works. AIXI with unlimited resources would be smarter, sure. But a version of AIXI that only computes some degree of approximation?
Yup, Marcus Hutter proves in ‘Universal Artificial Intelligence’ that AIXIt performs “at least as well as the provably best time t and space l limited agent” for general problems. He even offers a few concrete examples of this.
...up to some additive and multiplicative constants, which are large enough to make any implementation impractical.
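For reference, the shape of the bound as it is usually quoted (a rough sketch, not an exact quote; the precise statement and conditions are in Hutter’s book): for every agent of length at most l and per-cycle computation time at most t, AIXIt does at least as well, while its own per-cycle computation time is of order

\[
t \cdot 2^{l},
\]

on top of a very large additive setup constant.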
If it’s that good, why isn’t it used? (Serious question.)
edit: Based on your other comment I guess that it’s only better than other programs when you can’t include assumptions about the domain.
Yes, and also when the domain-specific information doesn’t exist. We don’t yet precisely know how language works, for instance. If we did, we could code it, but we don’t.
There’s also the matter of the large additive constant. That’s not as big an issue as some people make it out to be (it can be remedied by a clever choice of programming language, provided you know which programming language to use).
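For reference (the standard invariance theorem, not something stated in the parent comment): switching the universal machine, i.e. the programming language, changes description lengths by at most an additive constant that depends only on the pair of machines,

\[
K_U(x) \;\le\; K_V(x) + c_{U,V} \quad \text{for all } x,
\]

where c_{U,V} is essentially the length of an interpreter for V written for U; a well-chosen language keeps that constant small for the programs you care about.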
Thanks for pointing that out. I’ve skimmed some of the AIXI stuff before and obviously didn’t read it carefully enough.
No problem; this is a common misconception. People often think general intelligence is harder than narrow intelligence. Actually, the best general intelligent agent can be coded in a few hundred lines of C code. It’s non-general intelligence that’s hard. In particular, the kind of non-general intelligence that the human brain has (identifying objects visually, language, humor, status games) is extremely hard to code. The G in AGI means ‘more general than a chess-playing robot, but less general than AIXI’.
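To make the “few hundred lines” claim concrete, here is a sketch from memory of AIXI’s defining expectimax expression, which is essentially all the model is (see Hutter’s writings for the exact formulation; m is the horizon and U a universal monotone Turing machine):

\[
a_k \;:=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \bigl(r_k + \cdots + r_m\bigr) \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}.
\]

Writing this down is short; evaluating it is what is incomputable.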
Ah, but that proof is trivial, uninteresting, and utterly useless.
AIXI(tl) is basically brute-force search. The proof you reference shows that no program can do better than AIXI(tl) at general problem solving. Let’s taboo “AIXI(tl)”: no program can do better than brute-force search at general problem solving.
If you have any computer science training at all, alarm bells should be going off. There are exceedingly few real-world problem domains where brute force is really the best algorithm. They are so rare and esoteric, in fact, that we collect and categorize them, with immediate application to cryptographic methods. Even then, nearly all such identified problems have better-than-brute-force solutions, with a square-root reduction of the search space being a very common lower bound. Even in crypto, a true no-better-than-brute-force-search problem is a rare beast.
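(A standard example of that square-root figure, not taken from the parent comment: by the birthday argument, finding some collision in an ideal n-bit hash takes on the order of \(\sqrt{2^{n}} = 2^{n/2}\) evaluations, whereas hitting one specific output takes on the order of \(2^{n}\).)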
So what is going on here? As expected, the problem is in the assumed underlying definitions: no program can do better than AIXI(tl)/brute-force-search at general problem solving. What is general problem solving? Why, exactly the mathematically appealing definition of course: solving problems for which the agent has no a priori knowledge about the problem domain or solution space.
Looked at in this light, the proof very nearly becomes a tautology. Assuming the probability distribution of the solution space to be entirely flat and unstructured – that is to say, every candidate solution has equal probability of being a real solution, and finding bad solutions teaches you nothing about where the good solution might lie – means that no other search except brute-force enumeration of solution space is possible. Every form of search would either be a different enumeration of solutions (which has the same expected performance), or repeat solutions and therefore be less efficient. It’s not the best technique – it’s the only technique… assuming that there really is no information available about the problem domain and that the solution space is entirely unstructured.
However, it is those assumptions that are ridiculous. The definition adopted by many AGI people is the ability of an agent to solve problems it was not specifically programmed to deal with. That in no way precludes using information about these new problems that can be extracted from the environment. And in real-world use cases, there’s a lot of information available. Just as a trivial example, if nothing else a real embodied problem exists in a universe with the 2nd law of thermodynamics, which means that questions about efficiency and resource competition matter. Agents that are programmed to understand and give preference to such considerations will do better than agents which are not programmed as such (e.g. AIXI(tl)).
Or, to summarize this whole post: no agent can do better than AIXI(tl) at problems which meet some mathematical platonic ideal of randomness, but it is trivial to create a better agent than AIXI(tl) at problems encountered in the real world.
Your premise is incorrect. AIXIt is not brute-force search of solutions. It is a true self-optimizing search; see arxiv.org/abs/cs/0004001. AIXI itself isn’t a brute force search either. The basic AIXI model actually specifies no algorithmic implementation, so saying ‘brute force search’ is incorrect.
All your other arguments follow from this premise so they are also incorrect.
Solomonoff induction involves enumerating all possible program strings and testing for validity. There’s another term for the class of algorithms which behave this way in the literature: it’s called brute-force search.
(AIXI is not Solomonoff induction, it is reinforcement learning using Solomonoff induction to evaluate possible strategies. If I’m playing loose with terminology, it is here. But that technicality does not affect my argument.)
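For reference, the quantity being approximated by that enumeration is the universal prior (standard definition, not quoted from the parent):

\[
M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)},
\]

where the sum runs over the (minimal) programs p whose output on a universal monotone machine U starts with x; computing it directly means considering every program string and weighting it by its length.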
It’s only brute-force search in the trivial sense that all search is brute-force search, at some level. See: No free lunch theorem.
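For reference, the Wolpert-Macready form of that theorem (sketched, not quoted from the parent comment): for any two search algorithms a_1 and a_2, summed uniformly over all objective functions f,

\[
\sum_{f} P\!\left(d^{y}_{m} \mid f, m, a_1\right) \;=\; \sum_{f} P\!\left(d^{y}_{m} \mid f, m, a_2\right),
\]

where d^y_m is the sequence of objective values observed after m distinct evaluations; with no structure assumed on f, no algorithm beats any other on average.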
AIXIt is provably smarter than the human brain for general problem solving, so it’s clear that the human brain must be a narrow intelligence
Um, no? Maybe AIXIt is smarter (I’d quibble over how you define ‘smart’), but that doesn’t mean the human mind is incapable of solving general problems...
I never said the human mind is incapable of solving general problems. However it cannot be better than AIXIt at doing so. It’s either equal to AIXIt (which is very unlikely), or a narrow intelligence.
Solomonoff induction is essentially pattern matching, and humans seem to be quite good at pattern matching. It is arguable that this is an important part of the human problem-solving ability.
AIXIt is only “smarter than the human brain” in the ridiculous sense that it performs as well as the brain up to a gargantuan slow-down factor of 2^{length of program emulating human brain}. To seriously show it is smarter than the brain for general problem solving one would have to use an intelligence metric that takes resource bounds into account like a resource-bounded version of Legg-Hutter or my updateless metric.
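For reference, the (unbounded) Legg-Hutter measure referred to here scores a policy π by its simplicity-weighted expected reward across all computable environments (standard definition; a resource-bounded variant would additionally charge for computation):

\[
\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)}\, V^{\pi}_{\mu},
\]

where E is the class of computable reward-bounded environments, K is Kolmogorov complexity and V^π_μ is the expected total reward of π in environment μ.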
I’m somewhat disappointed in this answer. Sure, it’s pattern matching, but there are many, many kinds of pattern matching, of which Solomonoff induction is just one kind.
it performs as well as the brain up to a gargantuan slow-down factor of 2^{length of program emulating human brain}.
That is not true. There is a large multiplicative factor but it’s not what you’re describing.
I’m not sure what kinds of pattern matching you have in mind. Possibly you mean the sort of “pattern matching” typical of narrow AI, e.g. a neural network firing in response to a certain “pattern”. Usually it is just clustering by dividing spaces with hypersurfaces constructed from some (very restricted) set of mathematical primitives. This sort of “pattern matching” is probably very important to the low-level workings of the brain and the operation of unconscious systems such as visual processing; however, it is not what I mean by “pattern matching” in the current context. I am referring to patterns expressible in language or logic. For example, if you see that all numbers in a sequence are prime, that is a pattern. If you see that all shapes in a collection are convex, that is a pattern. This sort of pattern is of fundamental importance in conscious thought and IMO can only be modeled mathematically by constructions akin to Solomonoff induction and Kolmogorov complexity.
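For reference (standard definition, not from the comment above): the formal notion behind this kind of pattern is that the data admit a description much shorter than themselves,

\[
K(x) \;=\; \min\{\, \ell(p) \;:\; U(p) = x \,\},
\]

so, e.g., “the first hundred primes” is a far shorter description than listing a hundred numbers, which is exactly what makes “they are all prime” a pattern in this sense.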
There is a large multiplicative factor but it’s not what you’re describing.
On each step, AIXIt loops over all programs of given length and selects the one with the best performance in the current scenario. This is exponential in the length of the programs.
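A toy sketch of that cost structure (not AIXIt itself; `score` here is a hypothetical stand-in for whatever per-candidate evaluation the agent performs):

```python
from itertools import product

def best_program(l, score):
    """Examine every bitstring of length l as a candidate 'program' and keep
    the best-scoring one. The loop body runs 2**l times, so the cost is
    exponential in l no matter how cheap each evaluation is."""
    best, best_value = None, float("-inf")
    for bits in product("01", repeat=l):      # 2**l candidates
        candidate = "".join(bits)
        value = score(candidate)              # hypothetical evaluation step
        if value > best_value:
            best, best_value = candidate, value
    return best
```

Even l = 40 already means about 10^12 iterations; for programs long enough to emulate a brain the factor is astronomical.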
About the first point on pattern matching, I suggest you look at statistical inference techniques like GMM, DPGMM, LDA, LSA, or BM/RBMs. None of these have to do with Solomonoff induction, and they are more than capable of learning ‘patterns expressible by language or logic,’ yet they seem to more closely correspond to what the brain does than Solomonoff induction.
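For instance, a minimal sketch of the first of those, assuming scikit-learn and NumPy (synthetic data, made up purely for illustration):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Two synthetic clusters standing in for some regularity in observed data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)),
               rng.normal(5.0, 1.0, size=(200, 2))])

# Fit a two-component Gaussian mixture: it recovers the cluster structure
# by maximizing likelihood, with nothing resembling Solomonoff induction.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)   # cluster assignment for each point
print(gmm.means_)         # learned centers, near (0, 0) and (5, 5)
```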
On each step, AIXIt loops over all programs of given length and selects the one with the best performance in the current scenario
Not precisely, but anyway that doesn’t lead to 2^{length of program emulating human brain} for human-level intelligence.