EY was influenced by E.T. Jaynes, who was strongly against neural networks and in favor of Bayesian networks. He thought NNs were unprincipled and not mathematically elegant, while Bayes nets were. I see the same opinions in some of EY's writings, like the one you link. And the general attitude that "non-elegant = bad" is basically MIRI's mission statement.
I don't agree with this at all. I wrote a thing here about how NNs can be elegant and derived from first principles. But more generally, AI should use whatever works. If that happens to be "scruffy" methods, then so be it.
But more generally, AI should use whatever works. If that happens to be “scruffy” methods, then so be it.
This seems like a bizarre statement if we care about knowable AI safety. Near as I can tell, you just called for the rapid creation of AGI that we can’t prove non-genocidal.
I don't believe Houshalter was referring to proving Friendliness (or something along those lines); my impression is that he was talking about implementing an AI, in which case neural networks, while "scruffy", should be considered a legitimate approach. (Of course, the "scruffiness" of NNs could very well affect certain aspects of Friendliness research; my relatively uninformed impression is that it's very difficult to prove results about NNs.)
If you can prove anything interesting about a system, that system is too simple to be interesting. Logic can’t handle uncertainty, and doesn’t scale at all to describing/modelling systems as complex as societies, brains, AIs, etc.
AIXI is simple, and if our universe happened to allow Turing machines to calculate endlessly behind Cartesian barriers, it could be interesting in the sense of actually working.

We have wildly different definitions of interesting, at least in the context of my original statement. :)

Yes, we need to find a way to make existing AIs safe.
I don't agree with this at all. I wrote a thing here about how NNs can be elegant and derived from first principles.
Nice post.
Anyway, according to some recent works (ref, ref), it seems to be possible to directly learn digital circuits from examples using some variant of backpropagation. In principle, if you add a circuit size penalty (which may well be the tricky part), this becomes time-bounded maximum a posteriori Solomonoff induction.
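To make the idea concrete, here is a toy sketch of my own (not the construction from the papers above, and assuming PyTorch): each gate is relaxed to a differentiable mixture over a few candidate boolean operations, the mixture is fit to a truth table by backpropagation, and a small entropy penalty on the mixture weights stands in, very loosely, for the circuit-size / description-length term.

```python
import torch
import torch.nn.functional as F

# Candidate two-input boolean ops, written so they are exact on {0,1} inputs
# but differentiable everywhere.
CANDIDATE_OPS = [
    lambda a, b: a * b,              # AND
    lambda a, b: a + b - a * b,      # OR
    lambda a, b: a + b - 2 * a * b,  # XOR
    lambda a, b: 1 - a * b,          # NAND
]

def soft_gate(a, b, logits):
    """A 'gate' as a convex mixture over the candidate ops, learnable via backprop."""
    w = F.softmax(logits, dim=0)                             # mixture weights, shape (4,)
    outs = torch.stack([op(a, b) for op in CANDIDATE_OPS])   # shape (4, batch)
    return (w.unsqueeze(1) * outs).sum(dim=0)

# Toy task: recover XOR from its full truth table.
x = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([0., 1., 1., 0.])

logits = torch.zeros(len(CANDIDATE_OPS), requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.1)

for _ in range(300):
    pred = soft_gate(x[:, 0], x[:, 1], logits)
    w = F.softmax(logits, dim=0)
    # Very loose stand-in for a circuit-size penalty: push the mixture
    # toward a single crisp gate rather than a blend of several.
    penalty = -(w * w.clamp_min(1e-9).log()).sum()
    loss = F.binary_cross_entropy(pred.clamp(1e-6, 1 - 1e-6), y) + 1e-2 * penalty
    opt.zero_grad()
    loss.backward()
    opt.step()

print(F.softmax(logits, dim=0))  # mass should concentrate on the XOR entry
```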
Yes, binary neural networks are super interesting because they can be made much more compact in hardware than floating point ops. However, there isn't much (theoretical) advantage otherwise. Anything a circuit can do, an NN can do, and vice versa.
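As a tiny illustration of one direction of that equivalence (my own toy example): a single threshold neuron with weights (-2, -2) and bias 3 computes NAND on {0,1} inputs, and since NAND is universal, a network of such neurons can wire up any boolean circuit.

```python
# A single threshold "neuron" computing NAND; NAND alone suffices to build any
# boolean circuit, which is the "anything a circuit can do, an NN can do" direction.
def nand_neuron(a: int, b: int) -> int:
    return 1 if (-2 * a - 2 * b + 3) > 0 else 0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, nand_neuron(a, b))  # prints the NAND truth table: 1 1 1 0
```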
A circuit size penalty is already a very common technique. It's called weight decay, where the synapses are encouraged to be as close to zero as possible. A synapse of 0 is the same as it not being there, which means the neural net's parameters require less information to specify.

Agreed on all points.
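For concreteness, a minimal sketch of the weight-decay idea (assuming PyTorch; the architecture and lambda value are arbitrary): the optimizer adds an L2 penalty on the weights to the objective, pushing synapses toward zero, and weights near zero can be pruned with little effect on the function.

```python
import torch
import torch.nn as nn

# Tiny regression model; weight_decay adds lambda * ||w||^2 to the objective,
# which is the "keep synapses near zero" pressure described above.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-3)

x, y = torch.randn(256, 10), torch.randn(256, 1)
for _ in range(200):
    loss = nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Weights driven close to zero can be pruned with little change to the output,
# i.e. the effective "circuit" needs fewer bits to describe.
small = sum((p.abs() < 1e-2).sum().item() for p in model.parameters())
total = sum(p.numel() for p in model.parameters())
print(f"{small}/{total} parameters within 0.01 of zero")
```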
I suppose the main lesson for us can be summarized by the famous verse:
A little learning is a dangerous thing;
Drink deep, or taste not the Pierian spring:
There shallow draughts intoxicate the brain,
And drinking largely sobers us again.
The sequences definitely qualify as shallow draughts that intoxicate the brain :-(