I think this misses a significant factor: the size of the corpus required to establish a sufficiently distinct signature is not constant, but grows substantially with the number of individuals you want to tell apart. I do not have a rigorous argument for this, but I am guessing the growth could be as steep as linear. The number of bits you need grows logarithmically with the number of people, but the number of distinguishing bits you can extract from a corpus might also grow only logarithmically with the size of the corpus, since a marginal increase in corpus size would likely mostly reinforce what you already extracted from the smaller corpus and provide relatively little new signature data. Add to that the likelihood of the signature drifting over time, being affected by the particular medium and audience, etc., and it might not be so easy to identify people...
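To make the guess concrete, here is a rough sketch of the arithmetic, not a rigorous model. It assumes that telling apart N people takes about log2(N) bits, and that a corpus yields some fixed number of distinguishing bits each time its size doubles. The function name, `bits_per_doubling`, and `base_size` are all made up for illustration; nothing here is measured.

```python
import math

def required_corpus_size(n_people, bits_per_doubling=1.0, base_size=1.0):
    """Corpus size needed to tell n_people apart under a purely logarithmic model.

    Assumptions (hypothetical, for illustration only):
      bits needed    = log2(n_people)
      bits extracted = bits_per_doubling * log2(corpus_size / base_size)
    Setting the two equal and solving for corpus_size gives
      corpus_size = base_size * n_people ** (1 / bits_per_doubling)
    """
    if n_people < 1:
        raise ValueError("n_people must be at least 1")
    if bits_per_doubling <= 0:
        raise ValueError("bits_per_doubling must be positive")
    bits_needed = math.log2(n_people)
    return base_size * 2 ** (bits_needed / bits_per_doubling)

if __name__ == "__main__":
    # Compare one bit per doubling (linear growth) with two bits per doubling.
    for n in (1_000, 1_000_000):
        print(n, required_corpus_size(n), required_corpus_size(n, bits_per_doubling=2.0))
```

With one bit per doubling the required corpus grows linearly in the number of people; with two bits per doubling it grows like the square root. So whether the growth is really "as significant as linear" depends entirely on that unknown constant.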