You mean, by realising that there are online forums that are referenced elsewhere in the training corpus, yet themselves are conspicuously absent from the training data (which can be detected, say, as relative isolation of the features corresponding to these concepts, which means that the data around these concepts is purposefully removed from the training data)? And then these connections are added during fine-tuning when this forum data is finally added to the fine-tuning dataset? I still don’t see how this will let the network know it’s in training vs. deployment.
You mean, by realising that there are online forums that are referenced elsewhere in the training corpus, yet themselves are conspicuously absent from the training data (which can be detected, say, as relative isolation of the features corresponding to these concepts, which means that the data around these concepts is purposefully removed from the training data)? And then these connections are added during fine-tuning when this forum data is finally added to the fine-tuning dataset? I still don’t see how this will let the network know it’s in training vs. deployment.