I can’t comment usefully on everything you wrote, so I’ll just say a couple of things.
First, don’t be too credulous: the field of AI has been surrounded and plagued by hype since its inception, and the current era isn’t much different. Researchers have every incentive to encourage the hype.
Second, it’s interesting that you bring up the Kalman filter, because it makes a nice contrast to DNNs. The Kalman filter is actually kind of nice aesthetically; it has a pleasing mathematical elegance to it. People who use the KF know, more or less, the limits of its applicability. When I’m reading DNN papers, I feel like the whole field has given up on the notion of aesthetics and wholeheartedly embraced architecture hacking as a methodology.
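To make that contrast concrete, here is a minimal one-dimensional Kalman filter: the whole algorithm is two short steps, each with a clear probabilistic meaning. The noise variances and measurements below are made up purely for illustration.

```python
def kalman_step(x, P, z, Q=1e-4, R=0.1):
    """One predict/update cycle for a constant-state model.

    x, P : prior state estimate and its variance
    z    : new measurement
    Q, R : process and measurement noise variances (assumed values)
    """
    # Predict: the state model is x_k = x_{k-1} + noise with variance Q,
    # so the estimate carries over and our uncertainty grows slightly.
    x_pred = x
    P_pred = P + Q
    # Update: blend prediction and measurement, weighted by certainty.
    K = P_pred / (P_pred + R)          # Kalman gain
    x_new = x_pred + K * (z - x_pred)
    P_new = (1 - K) * P_pred
    return x_new, P_new

# Estimate a constant value (true value 5.0) from noisy readings.
x, P = 0.0, 1.0
for z in [5.2, 4.9, 5.1, 4.8, 5.0]:
    x, P = kalman_step(x, P, z)
```

After a handful of measurements the estimate converges near 5.0 and the variance shrinks; crucially, you can say exactly when and why this works (linear dynamics, Gaussian noise), which is the kind of applicability limit a DNN paper rarely offers.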
Third, I think you’ll find that DNNs are much, much harder to use than you imagine or expect. The problem is that all DNN research relies on architecture hacking: write down a network, train it up, look at the result, then tweak the architecture and repeat. There is very little, embarrassingly little, theory behind it all. The phrase “we have found” is prominent in DNN papers; it means “we tweaked the network a bunch of times in various ways and found that this trick worked best.” Furthermore, each code/test/tweak cycle takes a really long time, since DNN training is, almost by definition, very time-consuming.
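The cycle described above can be caricatured as a search loop. Everything in the stand-in scoring function below is invented for illustration (no real training happens); the shape of the loop is the point:

```python
import random

def train_and_score(arch):
    # Stand-in for hours of GPU training. We pretend some depth/width
    # combination is best, plus noise. Purely illustrative, not a real
    # model of how architectures perform.
    depth, width = arch
    return -((depth - 4) ** 2 + (width - 64) ** 2 / 100) + random.gauss(0, 1)

random.seed(0)
best_arch, best_score = None, float("-inf")
for _ in range(50):                          # tweak and repeat
    arch = (random.randint(1, 8),            # depth
            random.choice([16, 32, 64, 128]))  # width
    score = train_and_score(arch)            # the expensive step
    if score > best_score:                   # "we have found..."
        best_arch, best_score = arch, score
```

Nothing in the loop explains *why* the winning architecture wins, and each iteration of the real version costs hours or days, which is exactly the methodological complaint.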
To address your third point first, I’m sure you are right. I have only played around with simple NNs, and I shouldn’t have spoken so freely about how easy it would be to estimate a more complex one when I don’t know much about it.
As a follow-up question to your second point: the Kalman filter is a very aesthetically pleasing model, I agree. Something I wonder about, though I have no real basis to judge, is whether there are mathematical concepts similar to the Kalman filter (in terms of aesthetics and usefulness) that are entirely outside the understanding of the human brain. So, hypothetically, if we engineered humans with IQ 200+ (or whatever), they would uncover things like the Kalman filter that normal humans couldn’t grasp.
If that’s true, does it stand to reason that we could still use those models with a sufficiently well optimized/built DNN? We would just never understand what’s going on inside the network?
I often think of self-driving cars as learning the dynamic interactions of a set of nonlinear equations that are beyond the scope of a human to ever derive.
I’ll note I realize some of my questions might be too vague or pseudo-philosophical to be answered.
PS: I did a little internet sleuthing and have read the first ~12 pages of your book so far, which is very interesting and similar to how I think about the world (yours is much better developed). I am also incredibly interested in empirical philosophy of science and read/write/think about it a ton.