I would say that, given the Minsky story and how common a problem overfitting is, I believe something at least very similar to the tank story did happen, and if it didn’t, there were nevertheless real problems with neural nets overfitting.
(That said, I think modern deep nets may get too much of a bad rap on this issue. Yes, they might do weird things like focusing on textures or whatever is going on in the adversarial examples, but they still recognize very well out-of-sample, so they are not simply overfitting to the training set as in these old anecdotes. Their problems are different.)
This isn’t an example of overfitting, but of the training set not being iid. You wanted a random sample of pictures of tanks, but you instead got a highly biased sample that is drawn from a different distribution than the test set.
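To make the distinction concrete, here is a minimal sketch (not from the original discussion; the feature names, noise levels, and numbers are all made up) of how a classifier trained on such a biased sample can score perfectly in training and still fail on data drawn the way you actually intended, because it learns the lighting confound rather than the tanks:

```python
# Toy sketch of the confounded-training-set problem described above:
# tank / not-tank is correlated with daytime / night in the training
# sample, so the classifier latches onto brightness instead of tanks.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_photos(n, tank, daytime):
    """Two hypothetical features per photo: overall brightness and a weak,
    noisy 'tank-shaped object' signal. Purely illustrative."""
    brightness = rng.normal(1.0 if daytime else -1.0, 0.3, n)
    tank_signal = rng.normal(0.3 if tank else -0.3, 1.0, n)  # weak, noisy
    return np.column_stack([brightness, tank_signal])

# Biased training sample: every tank photo is daytime, every non-tank is night.
Xtr = np.vstack([make_photos(200, tank=True,  daytime=True),
                 make_photos(200, tank=False, daytime=False)])
ytr = np.concatenate([np.ones(200), np.zeros(200)])

# Test sample drawn the way we intended: lighting independent of the label.
Xte, yte = [], []
for tank in (True, False):
    for daytime in (True, False):
        Xte.append(make_photos(100, tank, daytime))
        yte.append(np.full(100, int(tank)))
Xte, yte = np.vstack(Xte), np.concatenate(yte)

clf = LogisticRegression().fit(Xtr, ytr)
print("train acc:", clf.score(Xtr, ytr))  # ~1.0 -- brightness separates classes
print("test acc: ", clf.score(Xte, yte))  # much worse -- the confound is gone
print("weights (brightness, tank_signal):", clf.coef_)
```

The training fit is not "too flexible" in the usual overfitting sense; the model generalizes fine to more data from the biased distribution. It fails only because the training sample was drawn from a different distribution than the one you cared about.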
“This isn’t an example of overfitting, but of the training set not being iid.”
Upvote for the first half of that sentence, but I’m not sure how the second applies. The set of tanks is iid; the issue is that the creators of the training set allowed tank/not-tank to be correlated with an extraneous variable. It’s like having a drug trial where the placebos are one color and the real drug is another.
I guess I meant it’s not sampled iid from the distribution you really wanted. The intended training distribution is all possible pictures of tanks, but you only sampled the ones taken during daytime.
I’m not sure you understand what “iid” means. It means that each sample is drawn from the same distribution, and each sample is independent of the others. The term “iid” isn’t doing any work in your statement; you could just say “It’s not from the distribution you really want to sample,” and it would be just as informative.
Previous discussion: http://lesswrong.com/lw/td/magical_categories/4v4a