That’s not even true. In practice, it’s the best we’ve got, but it’s still terrible in most interesting settings (or else you could solve NP-hard problems in practice, which you can’t).
SGD seems to be sufficient to train neural networks on real-world datasets. I would honestly argue that it's better than whatever learning algorithm humans use, in that it generally learns much faster and performs far better on many problems.
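For concreteness, here is a minimal sketch of the SGD loop I mean (a toy example; the model, data, and names are all made up):

```python
import random

# Minimal SGD: repeatedly sample one example and step the parameter
# against the gradient of that example's loss.
def sgd(grad_loss, w0, data, lr=0.01, steps=1000):
    w = w0
    for _ in range(steps):
        x, y = random.choice(data)        # one training example
        w -= lr * grad_loss(w, x, y)      # noisy gradient step
    return w

# Fit y = 3x with squared loss L = (w*x - y)^2, so dL/dw = 2*(w*x - y)*x.
data = [(x, 3.0 * x + random.gauss(0, 0.1)) for x in (0.5, 1.0, 2.0, 4.0)]
print(sgd(lambda w, x, y: 2 * (w * x - y) * x, 0.0, data))  # ~3.0
```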
That’s not sufficient for general intelligence. But I think the missing piece is not some super-powerful optimization algorithm; a better optimizer is not going to change the field much or give us AGI.
And I also don’t think the magic of the human brain is really good optimization. Most neuroscientists doubt that the brain can do anything as demanding as backpropagation, which requires accurately estimating gradients and propagating them backwards through many layers and time steps. Most biologically plausible learning algorithms I’ve seen are severely restricted or primitive.
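To make concrete what backpropagation through time demands, here is the textbook gradient for a toy RNN (a sketch of the algorithm itself, not a claim about the brain). Note the two biologically awkward ingredients: the forward pass must store every activation, and the backward pass must replay them in exact reverse order using the transposed forward weights:

```python
import numpy as np

def bptt_grad(W, xs, h0):
    """Gradient of 0.5*||h_T||^2 for the RNN h_t = tanh(W @ h_{t-1} + x_t)."""
    hs = [h0]
    for x in xs:                          # forward: store every activation
        hs.append(np.tanh(W @ hs[-1] + x))
    g_h = hs[-1]                          # dLoss/dh_T
    g_W = np.zeros_like(W)
    for t in reversed(range(len(xs))):    # backward: exact reverse replay
        g_pre = g_h * (1 - hs[t + 1] ** 2)   # through tanh'
        g_W += np.outer(g_pre, hs[t])
        g_h = W.T @ g_pre                 # needs the transposed weights
    return g_W

W = 0.1 * np.random.randn(4, 4)
xs = [np.random.randn(4) for _ in range(50)]  # 50 time steps to unroll
print(bptt_grad(W, xs, np.zeros(4)).shape)    # (4, 4)
```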
It’s because neural-net training algorithms don’t come anywhere close to finding the optimal neural net in complex situations.
I disagree. You could find the optimal NN and it still might not be very good. For example, imagine feeding all the pixels of an image into a big fully connected NN. No matter how good the optimization, it will do far worse than an architecture that exploits the structure of images, like a convolutional NN, which has massive regularity: it repeats the same pattern many times across the image (an edge detector on one part of an image is the same as on another part).
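The weight sharing is easy to see in numbers (toy figures, hypothetical setup): a dense map on a 32x32 image learns a separate weight for every (pixel, output) pair, while a conv layer reuses one small filter at every position:

```python
import numpy as np

H = W = 32
print((H * W) ** 2)   # dense 1024 -> 1024 map: 1,048,576 separate weights
print(3 * 3)          # one shared 3x3 filter: 9 weights

img = np.random.rand(H, W)
edge = np.array([[-1.0, 0.0, 1.0]] * 3)   # the same edge detector everywhere

def response(i, j):
    """Identical filter applied at any location (i, j)."""
    return float(np.sum(img[i:i+3, j:j+3] * edge))
```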
That’s trivial to do. It’s not the problem here.
It’s really not. Typical reinforcement learning is much more primitive than AIXI. AIXI, as best I understand it, actually simulates every hypothesis forward and picks the sequence of actions that leads to the best expected reward.
Even if you could learn a set of really good models, simulating them forward thousands of steps for every possible sequence of actions is computationally intractable.
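A toy version of that planner shows exactly where the cost explodes (names made up; real AIXI mixes over all computable environments, not a finite list of models):

```python
from itertools import product

def plan(models, weights, state, actions, horizon):
    """Pick the action sequence with the best model-averaged total reward."""
    def value(seq):
        total = 0.0
        for model, w in zip(models, weights):   # expectation over hypotheses
            s, ret = state, 0.0
            for a in seq:                       # roll this model forward
                s, r = model(s, a)
                ret += r
            total += w * ret
        return total
    # Brute force: |actions| ** horizon sequences, each simulated in full.
    return max(product(actions, repeat=horizon), key=value)

# Two rival models of a counter; action 1 increments it under the first model.
m1 = lambda s, a: (s + a, s)     # (next state, reward)
m2 = lambda s, a: (s, -s)
print(plan([m1, m2], [0.9, 0.1], 0, actions=(0, 1), horizon=3))  # (1, 1, 1)
```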
If you could find the optimal NN, that would basically let you solve circuit minimization, an NP-hard task (a neural net is just a circuit, so finding the best small net that fits the data is the same kind of problem as finding the minimal circuit computing a function). That would let you find the best computationally tractable hypothesis for any problem, which is similar to Solomonoff induction for practical purposes. It would certainly be a huge improvement over current NN approaches, and it might indeed lead to AGI. Unfortunately, it’s probably impossible.
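As a toy version of why that search is hopeless (hypothetical setup): even finding the smallest boolean circuit matching a truth table means enumerating circuits, and the space grows exponentially with circuit size:

```python
from itertools import product

GATES = [("AND", lambda a, b: a & b), ("OR", lambda a, b: a | b),
         ("XOR", lambda a, b: a ^ b), ("NAND", lambda a, b: 1 - (a & b))]

def fits(gate, table):
    return all(gate(a, b) == out for (a, b), out in table)

def smallest(table):
    for name, g in GATES:                                 # depth-1 circuits
        if fits(g, table):
            return name
    for (n1, g1), (n2, g2) in product(GATES, repeat=2):   # then depth 2...
        if fits(lambda a, b: g2(g1(a, b), b), table):
            return f"{n2}({n1}(a,b), b)"
    return None   # ...and so on: the space explodes with circuit size

print(smallest([((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]))  # XOR
```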
I was only trying to say that if you can find the best NN, then simulating it forward is easy. I agree that this is not the full AIXI. I guess I misunderstood you; I thought you were saying that the reason NNs don’t give us AGI is that they’re hard to simulate.