You can ask me things if you like. On Reddit, some of the most successful AMAs are the ones where people are asked about their occupation. I have a PhD in linguistics/philology and currently work in academia. We could talk about academic culture in the humanities if anyone is interested in that.
Can you talk about your specific field in linguistics/philology? What is it, and what are the main challenges?
Do you have a stake/an opinion in the debates about the Chomskian strain in syntax/linguistics in general?
I’ve mucked about here and there including in language classification (did those two extinct tribes speak related languages?), stemmatics (what is the relationship between all those manuscripts containing the same text?), non-traditional authorship attribution (who wrote this crap anyway?) and phonology (how and why do the sounds of a word “change” when it is inflected?). To preserve some anonymity (though I am not famous) I’d rather not get too specific.
There are lots of little problems I’m interested in for their own sake but perhaps the meta-problems are of more interest here. Those would include getting people to accept that we can actually solve problems and that we should try our best to do so. Many scholars seem to have this fatalistic view of the humanities as doomed to walk in circles and never really settle anything. And for good reason: if someone manages to establish “p” then all the nice speculation based on assuming “not p” is worthless. But many would prefer to be as free as possible to speculate about as much as possible.
Yes. I think the Chomskyan approach is based on a fundamentally mistaken view of cognition, akin to “good old fashioned artificial intelligence”. I hope to write a top-level post on this at some point. But I’ll say this for Chomsky: He’s not a walk-around-in-circles obscurantist. He’s a resolutely-march-ahead kind of guy. A lot of the marching was in the wrong direction, but still, I respect that.
Is that really the standard term? You know that the LW party line is that it’s a bad term, like selling non-apples. Google suggests to me that it is not the most popular term. The link below replaces “non-traditional” with “modern,” which isn’t an improvement on this dimension.
Also, my first parsing was that “non-traditional” modified “authorship.” This is actually a reasonable use of the prefix “non,” since having a strong prior on the author makes a big difference (sociologically, if not technically). How ’bout that Marlowe?
You’re right, it’s a horrible term. For one thing, the methods involved are pretty well-established by now. I just use it by habit. As for that old Marlowe/Shakespeare hubbub, here’s a recent study which finds their style similar but definitely not identical.
Does anyone use a better term? “Statistical author attribution” seems like an obvious term, but Google tells me that no one has ever used it.
Have you read the study you link? People who have read it tell me that the conclusions drawn do not match the body of the paper.
I skimmed it and nothing seemed obviously wrong. If you’re interested, you could try for yourself. If you download Marlowe’s corpus, Shakespeare’s corpus and stylo you can get a feel for how this works in a couple of hours.
Would love to read your post on the Chomskian approach, please do write it!
I would be extremely interested in your post on Chomsky. I almost but not quite majored in linguistics in America, which meant that I got the basic Chomskyan introduction but never got to the arguments against it. I am vaguely familiar with the probabilistic-learning models (enough to get why Chomsky’s proof that they can’t work fails), but not enough to get what predictions they make etc.
That’s quite a broad field to plow! I’ll keep asking questions; feel free to ignore those that are too specific/boring.
I’ve always wanted to know more about how authorship attribution is done; is this, found with a quick search, a reasonable survey of the current state of the art, or perhaps you’d recommend something else to read?
Are your fields, and humanities in general, trying to move towards open publishing of academic papers, the way STEM fields have been trying to? As someone w/o a university affiliation, I’m intensely frustrated every time I follow an interesting citation to a JSTOR/Muse page.
Do you plan to stay in academia or leave, and if the latter, for what kind of job?
I think you should write that post about the Chomskyan approach.
The Stamatatos survey you linked to will do fine. The basic story is “back in the day this stuff was really hard but some people tried anyway, then in 1964 Mosteller and Wallace published a landmark paper showing that you really could do impressive stuff, then along came computers and now we have a boatload of different algorithms, most of which work just great”. The funny thing about stylometry is that it is hard to get wrong. Count up anything you like (frequent words, infrequent words, character n-grams, whatever) and use any distance measurement you like and odds are you’ll get usable results. If you want to play around with this for yourself you can install stylo and turn it loose on a corpus of your choice. Gwern’s little experiment is also a good read.
My involvement with stylometry has not been to tweak the algorithms (they work just fine) but to apply them in some particular cases and to try to convince my fellow scholars that technological wizardry really can tell them things worth knowing.
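To make the "count up anything and use any distance" recipe concrete, here is a minimal sketch in Python. It is not how stylo works internally, just one simple instance of the idea: it counts the relative frequencies of the most frequent words and compares texts by Manhattan distance. The function names, the toy texts and the author labels are all made up for illustration, and a real test would need far longer texts.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def most_frequent_words(texts, n):
    """Return the n most frequent words across all texts combined."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return [word for word, _ in counts.most_common(n)]


def profile(text, vocabulary):
    """Relative frequency of each vocabulary word in the text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in vocabulary]


def manhattan(p, q):
    """Sum of absolute differences between two frequency profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))


def attribute(disputed, candidates, n_words=50):
    """Rank candidate authors by distance to the disputed text.

    candidates maps a (hypothetical) author label to a known text.
    Returns the closest author and the full distance table.
    """
    texts = list(candidates.values()) + [disputed]
    vocabulary = most_frequent_words(texts, n_words)
    target = profile(disputed, vocabulary)
    distances = {
        author: manhattan(target, profile(text, vocabulary))
        for author, text in candidates.items()
    }
    return min(distances, key=distances.get), distances


# Toy data, far too short for real work; purely illustrative.
candidates = {
    "author_a": "the cat sat on the mat and the dog lay on the rug "
                "by the door of the house",
    "author_b": "but i said that i would go and i went but i did not "
                "stay for i was tired",
}
disputed = "the bird sat on the branch of the tree by the edge of the field"

best, distances = attribute(disputed, candidates)
print(best, distances)
```

Swapping the word counts for character n-grams, or the Manhattan distance for some other measure, only means changing `tokenize` or `manhattan`; the rest of the pipeline stays the same, which is part of why the approach is hard to get wrong.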
Yes. Essentially every scholar I know is in favor of this. As far as I can see, it will happen and is happening.
I worked as an engineer for a few years but found I wasn’t that into it and really missed school. So I went back and I’d like to stay.