Something that’s always bugged me about being an academic is that we’re terrible at communicating with people outside our field. As a result, whenever I see a post that uses an NLP tool, it’s using a crap tool.
Why is that, do you think? This doesn’t seem to be the case in the ML community as far as I can judge (though I’m not an expert). What’s special about NLP? What prevents the NLTK people from doing what you did?
In ML, everyone is engaging with the academics, and the academics are doing a great job of making the material accessible, e.g. through MOOCs. ML is one of the most popular targets of “ongoing education”, because it’s popped up and it’s a useful feather to have in your cap. It greatly extends the range of programs you can write. Many people realise that, and are doing what it takes to learn. So even if there are some rough spots in the curriculum, the learners are motivated, and the job gets done.
The cousin of language processing is computer vision. The problem we have as academics is that there is a need to communicate current best-of-breed solutions to software engineers, while we also communicate underlying principles to our students and to each other.
If you look at NLTK, it’s really a tool for teaching our grad students. And yet it’s become a software engineering tool of choice, when it should never have been billed as industrial strength at all. Check out the results in my blog post:
NLTK POS tagger: 94% accuracy, 236s
My tagger: 96.8% accuracy, 12s
Both are pure Python implementations. I do no special tricks; I just keep things tight and simple, and don’t pay costs from integrating into a large framework.
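For anyone who wants to run this kind of comparison themselves, here is a minimal sketch of a harness that reports token-level accuracy and wall-clock time for any tagger. It is not the code behind the numbers above; the `evaluate` function, the `always_noun` toy tagger, and the two-sentence gold data are all hypothetical names made up for illustration. The only assumption about a tagger is that it is a callable taking a list of words and returning one tag per word.

```python
import time
from typing import Callable, List, Sequence, Tuple

# A gold-standard sentence is a list of (word, tag) pairs.
Sentence = List[Tuple[str, str]]
# Assumed tagger interface: list of words in, list of tags out.
Tagger = Callable[[Sequence[str]], List[str]]


def evaluate(tagger: Tagger, gold: Sequence[Sentence]) -> Tuple[float, float]:
    """Return (token accuracy, elapsed seconds) for a tagger on gold data."""
    correct = 0
    total = 0
    start = time.perf_counter()
    for sentence in gold:
        words = [word for word, _ in sentence]
        predicted = tagger(words)
        for (_, gold_tag), guess in zip(sentence, predicted):
            correct += int(guess == gold_tag)
            total += 1
    elapsed = time.perf_counter() - start
    accuracy = correct / total if total else 0.0
    return accuracy, elapsed


def always_noun(words: Sequence[str]) -> List[str]:
    """Toy stand-in tagger (hypothetical): labels every word as NN."""
    return ["NN" for _ in words]


# Tiny hand-made gold set, purely for illustration.
toy_gold: List[Sentence] = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("cats", "NNS"), ("sleep", "VBP")],
]

accuracy, seconds = evaluate(always_noun, toy_gold)
print(f"accuracy={accuracy:.1%}, time={seconds:.4f}s")
```

To compare two real taggers, you would wrap each one so it matches the assumed interface and pass both through `evaluate` with the same held-out data, so that accuracy and timing are measured under identical conditions.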
The problem is that the NLTK tagger is part of a complicated class hierarchy that includes a dictionary-lookup tagger, etc. These are useful systems to explain the problem to a grad student, but shouldn’t be given to a software engineer who wants to get something done.
There’s no reason why we can’t have a software package that just gets it done. Which is why I’m writing one :). The key difference is that I’ll be shipping one POS tagger, one parser, etc. The best one! If another algorithm comes out on top, I’ll rip out the old one and put the current best one in.
That’s the real difference between ML and NLP or computer vision. In NLP, we really really should be telling people, “just use this one”. In ML, we need to describe a toolbox.