Getting a validation accuracy of 50% in a binary classification task isn’t “surprisingly well”. It means your model is as good as random guessing: if you flipped a coin, you would get the right answer half the time, too. Getting 0% validation accuracy would mean that you are always guessing wrong, and that you could get 100% accuracy just by reversing your model’s predictions. So, yes, just like the article says, the discriminator does not generalize.
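A quick numerical sketch of that point (purely illustrative, with made-up random labels, not anything from the post):

```python
# Illustration: 50% accuracy is chance level, while 0% is as informative as 100%,
# because a predictor that is always wrong becomes always right when inverted.
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)        # hypothetical binary labels

coin_flip = rng.integers(0, 2, size=1000)     # random guessing
always_wrong = 1 - y_true                     # a predictor that is always wrong

print((coin_flip == y_true).mean())           # ~0.5: chance level
print((always_wrong == y_true).mean())        # 0.0: always wrong
print(((1 - always_wrong) == y_true).mean())  # 1.0: just reverse the predictions
```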
Yes, I understand this point. I was saying that we’d expect it to get 0% if its algorithm is “guess yes for anything in the training set and no for anything outside of it”.
It continues to be surprising (to me) even though we expect that it’s trying to follow that algorithm but can’t do so exactly. Presumably the generator is able to emulate the features that the discriminator is using to inexactly match the training set. In that case, if those features were “looks like something from the training/test distribution”, we’d expect it to guess closer to 100% on the test set. If those features were highly specific to the training set, we’d expect it to get closer to 0% on the test set (since the model should reject anything without those features). Instead it gets ~50%, which means whatever it’s looking for is completely uncorrelated with what the test data looks like and present in about half of the examples—that seems surprising to me.
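Something like the following check is what I have in mind (a minimal sketch; `discriminator` and `test_loader` are hypothetical stand-ins, and I’m assuming the discriminator outputs a single logit per image):

```python
# Sketch: score held-out real images with a trained GAN discriminator and see
# which of the three outcomes described above actually shows up.
import torch

@torch.no_grad()
def real_accuracy(discriminator, test_loader, device="cpu"):
    """Fraction of held-out real images the discriminator labels 'real' (score > 0.5)."""
    correct, total = 0, 0
    for images, _ in test_loader:
        scores = torch.sigmoid(discriminator(images.to(device))).flatten()
        correct += (scores > 0.5).sum().item()
        total += scores.numel()
    return correct / total

# ~1.0 -> it keys on distribution-level features (generalizes to unseen real data)
# ~0.0 -> it keys on training-set-specific features (rejects anything unseen)
# ~0.5 -> its scores on unseen real data are uncorrelated with realness (the surprising case)
```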
I’d currently interpret this as “the discriminator network acts nonsensically outside the training set + generator distribution, so it gets close to chance just because that’s what nonsensical networks do.”
Oh, I see, sorry.
No worries, was worth clarifying. I edited the post to link this comment thread.