The authorship of the Bible is the central and originating example of the field of author identification, so every technique gets applied to it. The program probably implements the standard techniques that people already claim lead to this same conclusion. I’m surprised that it agrees only 90% with the standard dividing lines on the binary authorship classification.
It is good that people implement algorithms in computers, even if they don’t reach new conclusions, because it pins them down to definite, checkable claims. Of course, there is a danger that the programmers overfit the parameters of the program to get the answer they desired, but if they do, it is probably much clearer from the source code than from a verbal argument. Anyhow, once the program is published, it can be applied to new corpora without opportunity for further tuning (though changing languages probably leaves lots of room for fudging things).