Yesterday I spent some time looking at the recent changes in the size and effectiveness of neural nets.
The size (number of parameters, or connections, roughly analogous to synapses) of Google's 2012 cat recogniser was 1 billion.
Later, most private work was done on graphics cards, and the number of parameters was limited by the memory of the graphics card, which recently reached 12 GB. For example, Karpathy's RNN has only 3 million parameters but was able to generate grammatically correct text. http://karpathy.github.io/2015/05/21/rnn-effectiveness/
However, in 2016 Google created a neural net with 130 billion parameters, which they now use in Google Translate. They showed that quality grows with the size of the net, though with some diminishing returns. https://arxiv.org/pdf/1701.06538.pdf
So the number of parameters in Google's best neural nets grew 100-fold in 5 years, and they are planning a trillion-parameter net soon.
The human brain has around 100 trillion synapses. If the growth rate of the best neural nets continues, a 100-trillion-parameter net is about 5 years away, somewhere around 2022. (By saying "best nets" I exclude some useless very large simulations which have already been done.)
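As a sanity check, the extrapolation above can be redone from the two data points in the text (1 billion parameters in 2012, 130 billion in 2016); the arithmetic here is my own, not from the linked sources:

```python
# Back-of-envelope extrapolation of parameter counts, using only the
# two data points stated in the text.
import math

p_2012, p_2016 = 1e9, 130e9          # 2012 cat recogniser; 2016 translation net
rate = (p_2016 / p_2012) ** (1 / 4)  # implied annual growth factor, ~3.4x per year

brain_synapses = 100e12              # rough human-brain synapse count
years_needed = math.log(brain_synapses / p_2016, rate)
print(round(2016 + years_needed))    # -> 2021, i.e. "around 2022"
```

This assumes the 2012-2016 growth rate simply continues, which is exactly the assumption being questioned in the discussion below.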
However, there is a problem with training such big nets, as the difficulty grows as the square of the number of parameters, and there is the memory limitation of graphics cards. However, OpenAI found an easily scalable solution by changing the way the net is trained: not backpropagation, but evolution strategies, which estimate the gradient from random perturbations in a very large parameter space. https://blog.openai.com/evolution-strategies/
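The linked OpenAI post describes evolution strategies; a minimal toy sketch of the idea (my own illustrative code on a made-up objective, not their implementation) looks like this:

```python
# Toy evolution strategies: no backprop, just fitness-weighted noise.
import numpy as np

def fitness(theta):
    # made-up objective: maximised when every coordinate equals 3.0
    return -np.sum((theta - 3.0) ** 2)

rng = np.random.default_rng(0)
theta = np.zeros(5)                  # parameters we are optimising
sigma, alpha, npop = 0.1, 0.02, 50   # noise scale, step size, population

for step in range(300):
    noise = rng.standard_normal((npop, theta.size))
    rewards = np.array([fitness(theta + sigma * n) for n in noise])
    rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # gradient estimate: average of the noise, weighted by fitness
    theta += alpha / (npop * sigma) * noise.T @ rewards

print(np.round(theta, 1))  # each coordinate should end up close to 3.0
```

The appeal for very large nets is that each population member can be evaluated on a separate machine, with only the random seeds and scalar rewards communicated.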
If we look at the performance side, neural nets are doubling in performance every year, and in 2016-2017 they started to demonstrate superhuman performance on many recognition tasks. However, there are always some other tasks where humans are better, which is why it is not easy to say how far away human-level performance is. https://srconstantin.wordpress.com/2017/01/28/performance-trends-in-ai/
Hence infrahuman AI is possible around 2022. By saying "infrahuman" I mean that it will do most things that humans are able to do, but it will still not be a genius, not conscious, not Einstein, etc., though probably a good robot brain. From that point, it could start helping researchers do research, which could be the next discontinuity. One of the main features of such an AI will be the ability to understand most human language.
Are you sure that Google's 2016 net uses graphics cards? I would think that they use their TensorFlow ASICs. The switch from graphics cards to ASICs is part of what allowed them huge performance improvements in a short time frame. I don't think that they will continue to improve much faster than Moore's law.
“We trained our models using TensorFlow (Abadi et al., 2016) on clusters containing 16-32 Tesla K40 GPUs” https://arxiv.org/pdf/1701.06538.pdf
So they did it before they had implemented their TensorFlow hardware, or they didn't use it.
The current price of such a Tesla cluster is around 50-100K USD.
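A rough capacity check (my own arithmetic, not from the paper) shows why such a cluster is plausible for a net of that size, keeping in mind that real training also needs memory for activations and optimizer state, and the paper's net is a sparsely-gated mixture of experts spread across devices rather than one dense model on one card:

```python
# How many float32 parameters fit in the combined memory of the quoted cluster?
gpus = 32                 # upper end of "16-32 Tesla K40 GPUs"
mem_per_gpu = 12e9        # the K40 has 12 GB of memory
bytes_per_param = 4       # float32

capacity = gpus * mem_per_gpu / bytes_per_param
print(f"{capacity / 1e9:.0f} billion parameters")  # prints "96 billion parameters"
```

That is the same order of magnitude as the 130-billion-parameter figure above, consistent with the parameters being partitioned across the cluster.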