Great post, thank you for writing this! Your list of natural-seeming ideas is very thought-provoking.
> The idea that there is a simple yet powerful theoretical framework which describes human intelligence and/or intelligence in general.
I used to think that way, but now I agree with your position more. Something like Bayesian rationality is a small piece that many problems have in common, but any given problem will have lots of other structure to be exploited as well. In many AI problems, like recognizing handwriting or playing board games, exploiting that problem-specific structure lets you progress faster than if you’d tried to start from the Bayesian angle.
We could still hope that the best algorithm for any given problem will turn out to be simple. But that seems unlikely, judging both from AI tasks like MNIST, where neural nets beat anything hand-coded, and from non-AI tasks like matrix multiplication, where the asymptotically fastest known algorithms have been getting more and more complex. As a rule, algorithms don’t get simpler as they get better.
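To make the matrix multiplication point concrete, here is a minimal sketch (Python with numpy, assuming square matrices whose size is a power of two). The textbook O(n³) method is a transparent triple loop; Strassen’s asymptotically faster O(n^2.81) scheme is already much less obvious, and its successors (Coppersmith–Winograd and beyond) are far more complex still.

```python
import numpy as np

def naive_matmul(A, B):
    """Textbook O(n^3) multiplication: a transparent triple loop."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C

def strassen(A, B):
    """Strassen's O(n^2.81) scheme: 7 recursive products instead of 8,
    at the cost of a much less legible algorithm. Assumes n is a power of 2."""
    n = A.shape[0]
    if n == 1:
        return A * B
    m = n // 2
    A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.vstack([np.hstack([C11, C12]), np.hstack([C21, C22])])

A = np.random.rand(4, 4)
B = np.random.rand(4, 4)
assert np.allclose(naive_matmul(A, B), strassen(A, B))
```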
I’m not sure what you changed your mind about. Some of the examples you give are unconvincing, since they do have simple meta-algorithms that both discover the more complicated, better solutions and analyse their behaviour. My guess is that your point is this: looking into the nuances of things like decision theory is an endless pursuit, with more and more complicated solutions accounting for more and more unusual aspects of situations (solutions that can no longer be judged as clearly superior), and with no simple meta-algorithm that could have found these more complicated solutions, because it wouldn’t know what to look for. But that is the content of values, the thing you look for in human behaviour, and we need at least a poor solution to the problem of making use of it. Perhaps you mean that even this poor solution is too complicated for humans to discover?
There’s a difference between discovering something and being able to formalise it. We use the simple meta-algorithm of gradient descent to train neural networks, but that doesn’t allow us to understand their behaviour.
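To illustrate the gap (a toy sketch, not anyone’s real training setup; the architecture and hyperparameters are arbitrary): the meta-algorithm below is a dozen lines of plain gradient descent, yet after it learns XOR, the resulting weight matrices carry no human-legible account of how they implement it.

```python
import numpy as np

# A tiny 2-layer network learning XOR. The *meta-algorithm* (the training
# loop) is trivially simple; the *result* (the learned weights) is not
# something we can read an explanation off of.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 8))
W2 = rng.normal(size=(8, 1))
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)
    # Backward pass: plain gradient descent on squared error.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out
    W1 -= lr * X.T @ d_h

print(out.round(2))  # typically close to [[0], [1], [1], [0]]
print(W1)            # ...but nothing here "explains" XOR to a human
```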
Also, meta-algorithms which seem simple to us may not in fact be simple, if our own minds are complicated to describe.
My impression is that an overarching algorithm would allow the agent to develop solutions for the specialized tasks, not that it would directly constitute a perfect solution. I don’t quite understand your position here – would you mind elaborating?
My position goes something like this.
There are many problems to be solved. Each problem may or may not have regularities to be exploited. Some regularities are shared among many problems, like Bayes structure, but others are unique. Solving a problem in reasonable time might require exploiting multiple regularities in it, so Bayes structure alone isn’t enough. There’s no algorithm for exploiting all regularities in all problems in reasonable time (this is similar to P≠NP). You can combine algorithms for exploiting a bunch of regularities, ending up with a longer algorithm that can’t be compressed very much and doesn’t have any simple core. Human intelligence could be like that: a laundry list of algorithms that exploit specific regularities in our environment.
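A toy analogy for the laundry-list picture (purely illustrative code, not from the post): a membership test with one branch per regularity it knows how to exploit. Each branch encodes independent knowledge about an independent structure, and the whole routine is just their concatenation; no simpler rule generates all the branches.

```python
from bisect import bisect_left

def contains(xs, target, is_sorted=False):
    """Membership test as a 'laundry list': one branch per regularity we
    know how to exploit, plus a generic fallback when none applies."""
    if is_sorted:
        # Exploit the sortedness regularity: O(log n) binary search.
        i = bisect_left(xs, target)
        return i < len(xs) and xs[i] == target
    # No exploitable structure known: O(n) linear scan.
    return target in xs

print(contains([1, 3, 5, 7], 5, is_sorted=True))  # True, via binary search
print(contains([7, 1, 5, 3], 5))                  # True, via linear scan
```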
> algorithms don’t get simpler as they get better.
Or: as you minimize cost along one dimension, costs get pushed into other dimensions. Aether variables apply at the level of representation too.