I imagine the relationship differently: as a relationship between how well a system can perform a task and the number of tasks the same system can accomplish.
Does a chicken have general intelligence? A chicken can attempt a wide range of tasks, but it performs them poorly, and on most of them it does worse than random. For example, could a chicken solve a Rubik’s Cube? I think it would perform worse than random.
Generality to me seems like an aggregation of many specialised processes working together seamlessly to achieve a wide variety of tasks. Where do humans sit on my scale? I think we are pretty far along the x axis (number of tasks), but not too far up the y axis (performance).
For your orthogonality thesis to be right, you have to demonstrate the existence of a system that’s very far along the x axis but exactly zero on the y axis. I argue that such a system is equivalent to one that sits at zero on both axes, since zero performance on every task is indistinguishable from not being able to do any task at all, and hence we have a counterexample.
Imagine a general intelligence (e.g. a human) with damage to a certain specialised part of their brain, e.g. short-term memory. They will still be able to do a very wide variety of tasks, but they will struggle to play chess.
Jeff Dean has proposed a new approach to ML he calls Pathways, in which we connect many ML models together so that when one model learns something, it can share what it has learned with the other models, and the aggregate system can be used to achieve a wide variety of tasks.
This would reduce duplicate learning. Sometimes two specialised models end up learning the same thing, but when those models are connected together, we only need to do the learning once and then share it.
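Pathways itself is just a proposal, but here’s a minimal sketch of that shared-learning idea in Python with NumPy. The model names, shapes, and numbers are made up purely for illustration: two task heads reuse one shared feature extractor, so whatever the shared part learns only has to be learned once.

```python
import numpy as np

# Minimal sketch of the shared-learning idea (not Pathways itself; the
# names and shapes are made up). Two tasks reuse one shared feature
# extractor, so whatever the shared layer learns is learned only once
# and benefits every task connected to it.

rng = np.random.default_rng(0)

shared_weights = rng.normal(size=(16, 8))        # learned once, reused by all tasks
task_heads = {
    "classify_images": rng.normal(size=(8, 3)),  # task-specific output layers
    "tag_sounds":      rng.normal(size=(8, 5)),
}

def forward(x, task):
    features = np.tanh(x @ shared_weights)       # shared representation
    return features @ task_heads[task]           # task-specific prediction

x = rng.normal(size=(1, 16))
print(forward(x, "classify_images").shape)       # (1, 3)
print(forward(x, "tag_sounds").shape)            # (1, 5)
```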
If Turing Completeness turns out to be all we need to create a general intelligence, then I argue that any entity capable of creating computers is generally intelligent. The only living organisms we know of that have succeeded in creating a computer are humans. Creating computers seems to be some kind of intelligence escape velocity. Once you create computers, you can create more intelligence (and maybe destroy yourself and those around you in the process).
Regarding the counterexample: I think it is fair to say that perfect orthogonality is not plausible, especially not if we allow cases with one axis being zero, whatever that might mean. But intelligence and generality could still be largely orthogonal. What do you think of the case of insects, as an example of low intelligence and high generality?
(I think not even the original orthogonality thesis holds perfectly. An example is the oft-cited fact that animals don’t optimize for the goal of inclusive genetic fitness because they are not intelligent enough to grasp such a goal. So they instead optimize for proxies of it, like sex.)
Have you come across the work of Yann LeCun on world models? LeCun is very interested in generality; he has called common sense the “dark matter of intelligence”. He thinks that to achieve a high degree of generality, an agent needs to construct world models.
Insects have highly simplified world models, and that could be part of the explanation for their high degree of generality. For example, the fact that male jewel beetles fall in love with beer bottles, mistaking them for females, is strong evidence that beetles have highly simplified world models.
I see what you mean now. I like the example of insects. They certainly do have an extremely high degree of generality despite their very low level of intelligence.
How?
We can demonstrate this with a test.
Get a Rubik’s-Cube-playing robotic arm, and ask it to make random moves on a shuffled Rubik’s Cube until it’s solved. How many years will it take until it’s solved? It’s some finite time, right? Millions of years? Billions of years?
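For a rough sense of scale, here’s a toy back-of-the-envelope estimate in Python. The assumptions are mine and very loose: each random move is treated as landing the cube in an effectively uniform random state (which isn’t strictly true), and the arm makes one move per second.

```python
# Toy back-of-the-envelope estimate (assumed numbers, not a simulation):
# if each random move left the cube in an effectively uniform random state,
# solving by chance would be roughly a geometric process with success
# probability 1/N per move, where N is the number of reachable states.

N_STATES = 43_252_003_274_489_856_000   # ~4.3e19 reachable cube states
MOVES_PER_SECOND = 1                    # assumed speed of the robotic arm

expected_moves = N_STATES               # mean of the geometric distribution
expected_years = expected_moves / MOVES_PER_SECOND / (60 * 60 * 24 * 365)

print(f"expected time: {expected_years:.1e} years")   # ~1.4e12 years
```

On those assumptions the expected wait is on the order of a trillion years, so “billions of years” is, if anything, optimistic.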
Get a chicken and give it a Rubik’s Cube and ask it to solve it. I don’t think it will perform better than our random robot above.
I just think that randomness is a useful benchmark for performance on accomplishing tasks.
In Bogosort, the randomised version has no upper bound on its running time: in the worst case it never terminates. You can even turn the Rubik’s Cube problem into a similar problem of finding a path from the starting position to the solved position via a series of moves. I’m not sure whether each scrambled position has more than one shortest solution.
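To make that unbounded worst case concrete, here’s a toy randomised Bogosort in Python; each attempt succeeds only by chance, so no fixed number of shuffles is guaranteed to be enough.

```python
import random

# Toy randomised Bogosort: shuffle until sorted. The expected number of
# shuffles is finite, but no fixed bound holds for every run, so in the
# worst case the loop below is not guaranteed to terminate.

def bogosort(items):
    items = list(items)
    shuffles = 0
    while items != sorted(items):
        random.shuffle(items)
        shuffles += 1
    return items, shuffles

print(bogosort([3, 1, 2]))   # small inputs usually finish after a handful of shuffles
```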
Oh, I’m not making the argument that randomly permuting the Rubik’s Cube will always solve it in finite time, only that it might. I think it has a better chance of solving the cube than the chicken does. The chicken might get lucky and knock the cube off the edge of a cliff, and it might rotate by accident, but beyond that the chicken isn’t going to do much work towards solving it in the first place. Meanwhile, random permutation might solve it (or, in the worst case, never solve it). I just think random permutations have a higher chance of solving it than the chicken, though I can’t formally prove that.