OpenAI recently announced progress in NLP, using a large transformer-based language model to tackle a variety of tasks and breaking performance records in many of them. It also generates synthetic short stories, which are surprisingly good.
How surprising are these results, given past models of how difficult language learning was and how far AI had progressed? Should we be significantly updating our estimates of AI timelines?
It doesn’t move much probability mass to the very near term (i.e. 1 year or less), because neither this nor AlphaStar is really doing consequentialist reasoning; both just get surprisingly good performance from simpler tricks (the very Markovian nature of human writing, a good position evaluation function) given a whole lot of compute.
However, it does shift my probabilities toward earlier dates, in the sense that one new weird trick to do deductive or consequentialist reasoning, plus a lot of compute, might get you there really quickly.
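To make "Markovian" concrete: a text model is Markovian when the next word depends only on a short window of preceding words. Below is a minimal sketch of the simplest case, a bigram model where the window is a single word. The toy corpus and the function names `build_bigram_model` and `generate` are made up for illustration. This is not how GPT-2 works (it is a transformer conditioning on a much longer context); it only shows how far plain "predict the next word from recent words" can go in producing locally plausible text.

```python
import random
from collections import defaultdict


def build_bigram_model(text):
    """Map each word to the list of words observed directly after it."""
    words = text.split()
    model = defaultdict(list)
    for current, following in zip(words, words[1:]):
        model[current].append(following)
    return model


def generate(model, start, length, seed=0):
    """Sample up to `length` words, each chosen only from the previous word's followers."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:
            # Dead end: this word was never followed by anything in the corpus.
            break
        out.append(rng.choice(followers))
    return " ".join(out)


# Toy corpus, purely illustrative.
corpus = "the cat sat on the mat and the dog sat on the rug"
model = build_bigram_model(corpus)
sample = generate(model, "the", 8)
print(sample)
```

Every adjacent word pair in the output already occurs somewhere in the corpus, so the text looks locally fluent even though the model has no notion of goals or consequences.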
Something you learn pretty quickly in academia: don’t trust the demos. Systems never work as well when you select the inputs freely (and, if they do, expect thorough proof). So I wouldn’t read too much into this yet; we don’t know how good it actually is.
Vis-à-vis selecting inputs freely: OpenAI also included a large dump of unconditioned text generation in their GitHub repo.
They claim to have beaten records on a range of standard tests (such as the Winograd schema), which is not something you can cheat on by cherry-picking, assuming they are honest about the results.
https://transformer.huggingface.co/ is a nice demonstration of GPT-2 that allows you to select the inputs freely.