Yep. It’s not the Bible. I suspect that there are already good stats compiled on the Q-source, etc.
In a way it’s not only futile but limiting to play the guessing game. There are lots of possible applications of Bayesian methods to the humanities. Maybe this discussion will help more projects than my own.
That was my first thought too; there’s a huge textual analysis tradition relating to the Bible and what I know of it maps pretty closely to the summary, although it’s also mature enough that there wouldn’t be much reason to obfuscate it like this. But it’s not implausible that it applies to some other body of literature. I understand there are some similar things going on in classics, for example.
The specifics shouldn’t matter too much, though. That said, some types of mark are going to be a lot more machine-distinguishable than others, and that’s going to affect the kinds of analysis you can do: differences in spelling and grammar, for example, are far machine-friendlier than differences in letterforms in a manuscript.
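To make the "machine-friendly" point concrete, here is a rough sketch of how spelling differences between two transcriptions could be pulled out automatically. It assumes the two witnesses are already transcribed as plain text and are mostly parallel; the function name `spelling_variants` and the sample sentences are invented for illustration and have nothing to do with the actual corpus under discussion.

```python
import difflib

def spelling_variants(witness_a, witness_b):
    """Return (a_word, b_word) pairs where two parallel texts differ
    by one-for-one word substitutions."""
    a, b = witness_a.split(), witness_b.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    variants = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Keep only substitutions of equal length, so words pair up cleanly;
        # insertions and deletions are ignored in this sketch.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            variants.extend(zip(a[i1:i2], b[j1:j2]))
    return variants

# Hypothetical sample transcriptions, not real data.
a = "the olde kinge rode to the towne"
b = "the old king rode to the town"
print(spelling_variants(a, b))
```

Nothing comparable exists for letterforms without an image-processing step first, which is the asymmetry I mean.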
I think Gwern’s right on this.
But Humanities has rejected that!
Ah, OK. They hadn’t when I wrote it.