Yes, I understand this point. I was saying that we’d expect it to get 0% if its algorithm is “guess yes for anything in the training set and no for anything outside of it”.
It continues to be surprising (to me) even though we expect that it’s trying to follow that algorithm but can’t do so exactly. Presumably the generator is able to emulate the features that it’s using for inexactly matching the training set. In this case, if those features were “looks like something from the training/test distribution”, we’d expect it to guess closer to 100% on the test set. If those features were highly specific to the training set, we’d expect it to get closer to 0% on the test set (since the model should reject anything without those features). Instead it gets ~50%, which means whatever it’s looking for is uncorrelated with what the test data looks like and present in about half of the examples; that seems surprising to me.
I’d currently interpret this as “the discriminator network acts nonsensically outside the training set + generator distribution, so it gets close to chance just because that’s what nonsensical networks do.”
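To make the contrast concrete, here is a toy simulation of the three hypotheses (purely hypothetical: the data, the 0/1 feature vectors, and the feature choices are made up for illustration, not taken from the actual experiment). Each "discriminator" is just a rule for saying yes/no, and we look at how often it says yes on held-out test examples drawn from the same distribution as the training set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for real data: binary feature vectors, with the
# train and test sets drawn from the same distribution.
train = rng.integers(0, 2, size=(1000, 32))
test = rng.integers(0, 2, size=(1000, 32))

# Hypothesis 1: pure memorization -- say "yes" only for exact members of
# the training set. Expected yes-rate on test: ~0%.
train_set = {tuple(x) for x in train}
memorize_rate = np.mean([tuple(x) in train_set for x in test])

# Hypothesis 2: a feature shared by the whole train/test distribution
# (here, trivially, "all entries are 0 or 1"). Expected yes-rate: ~100%.
dist_feature_rate = np.mean([np.isin(x, (0, 1)).all() for x in test])

# Hypothesis 3: a feature uncorrelated with the distribution, e.g. the
# parity of the first coordinate -- present in ~half of test examples.
uncorrelated_rate = np.mean(test[:, 0] == 1)

print(f"memorizer yes-rate on test:              {memorize_rate:.2f}")     # ~0.00
print(f"distribution-feature yes-rate on test:   {dist_feature_rate:.2f}")  # ~1.00
print(f"uncorrelated-feature yes-rate on test:   {uncorrelated_rate:.2f}")  # ~0.50
```

The observed ~50% only falls out of the third rule, which is what makes it surprising that the discriminator’s learned features would land there.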
Oh, I see, sorry.
No worries, was worth clarifying. I edited the post to link this comment thread.