My layperson’s understanding is that this is the first time human accuracy has been exceeded on the ImageNet benchmark challenge, and that it represents an advance on Chinese giant Baidu’s progress reported last month, which I understood to be significant in its own right. http://arxiv.org/abs/1501.02876
One thing to note about the human accuracy number for ImageNet that’s been going around a lot recently is that it came from a relatively informal experiment done by a couple of members of the Stanford vision lab (see section 6.4 of the paper for details). In particular, the number everyone cites comes from just one person, who, while he trained himself for quite a while to recognize the ImageNet categories, was nonetheless prone to silly mistakes from time to time. A more optimistic estimate of human error is probably closer to 3-4%, but even with that in mind the recent results people have been posting are still extremely impressive.
It’s also worth pointing out another paper from Microsoft Research that beat the 5.1% human performance and actually came out a few days before Google’s. It’s a decent read, and I wouldn’t be surprised if people start incorporating elements from both MSR’s and Google’s papers in the near future.
Getting 5.1% error was really hard; it takes a lot of time to get familiar with the classes and to sort through reference images. The 3% error was an entirely hypothetical, optimistic estimate of a group of humans who make no mistakes.
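For context, the percentages being thrown around here are top-5 classification error on the ILSVRC validation set: a guess counts as correct if the true label appears anywhere in the five highest-ranked predictions. Here's a minimal sketch of that metric; the function name and the toy scores/labels are purely illustrative, not from any of the papers:

    import numpy as np

    def top5_error(scores, labels):
        """Fraction of examples whose true label is NOT among the
        five highest-scoring classes.

        scores: (n_examples, n_classes) array of class scores
        labels: (n_examples,) array of ground-truth class indices
        """
        # indices of the five highest-scoring classes per example
        top5 = np.argsort(scores, axis=1)[:, -5:]
        hits = np.any(top5 == labels[:, None], axis=1)
        return 1.0 - hits.mean()

    # toy example: 3 images, 10 classes (real ILSVRC uses 1000 classes
    # and ~50k validation images)
    rng = np.random.default_rng(0)
    scores = rng.random((3, 10))
    labels = np.array([2, 7, 5])
    print(f"top-5 error: {top5_error(scores, labels):.3f}")

So "5.1% human error" means the annotator's top-5 guesses missed the true label on roughly 1 in 20 validation images.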
Here is the guy who tried to get his own accuracy on imagenet: https://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/
If you want to appreciate it, you can try the task yourself here: http://cs.stanford.edu/people/karpathy/ilsvrc/