Faster computers almost certainly enable AI research. The current wave of deep learning is only possible because compute suddenly jumped 10-50x over a few years (that is, fast, cheap, general-purpose GPUs enabling training of huge networks on a single machine).
What’s weird about this is that it isn’t just being able to run bigger NNs. Before, it was believed to be practically impossible to train really deep NNs because of vanishing gradients. Then suddenly people could experiment with deep nets on much faster computers, even though training them was still slow and impractical. By experimenting with them, they figured out how to initialize weights properly, and now deep nets can be trained much faster even on slow computers.
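To make the initialization point concrete, here’s a minimal NumPy sketch (the depth, width, and tanh activations are invented for illustration): with a naive fixed-scale init the signal collapses to nothing after a couple dozen layers, while Glorot/Xavier-style scaling keeps it at a usable magnitude.

    import numpy as np

    rng = np.random.default_rng(0)

    def naive_init(fan_in, fan_out):
        # Fixed small-scale Gaussian: with many layers the signal (and the
        # gradient) shrinks layer after layer -- the vanishing-gradient problem.
        return rng.normal(0.0, 0.01, size=(fan_in, fan_out))

    def xavier_init(fan_in, fan_out):
        # Glorot/Xavier scaling keeps the variance of activations roughly
        # constant from layer to layer, so deep stacks stay trainable.
        limit = np.sqrt(6.0 / (fan_in + fan_out))
        return rng.uniform(-limit, limit, size=(fan_in, fan_out))

    def forward(x, init, depth=20, width=256):
        # Toy forward pass through a stack of random tanh layers.
        for _ in range(depth):
            x = np.tanh(x @ init(width, width))
        return x

    x = rng.normal(size=(32, 256))
    print(np.std(forward(x, naive_init)))   # collapses toward zero
    print(np.std(forward(x, xavier_init)))  # stays at a usable scale

Nothing in that fix requires a fast computer; it requires enough cheap experiments to notice that the scale of the initial weights was the thing going wrong.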
Nothing stopped anyone from making that discovery in the ’90s. But it took renewed interest, and faster computers to experiment with, for it to happen. The same is true for many other methods that have since been invented. Things like dropout could have been invented a decade or two earlier, but for some reason they just weren’t.
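For what it’s worth, dropout itself is only a few lines. Here’s a toy NumPy sketch of the common “inverted dropout” formulation (the drop rate and array shapes are just illustrative):

    import numpy as np

    rng = np.random.default_rng(1)

    def dropout(activations, p=0.5, training=True):
        # During training, zero each unit with probability p and rescale the
        # survivors so the expected activation matches what the network sees
        # at test time (the "inverted dropout" convention).
        if not training:
            return activations
        mask = rng.random(activations.shape) >= p
        return activations * mask / (1.0 - p)

    h = rng.normal(size=(4, 8))   # a made-up hidden-layer activation
    print(dropout(h, p=0.5))

There’s nothing in it that couldn’t have run on a 1990s workstation; what was missing was someone trying it and having the spare cycles to see whether it helped.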
And there were supercomputers back then that could have run really big nets. If someone had had an algorithm ready, it could have been tested. But no one had code just sitting around waiting for computers to get fast enough. Instead, computers got fast first, and then the innovation happened.
The same is true for old AI research. The early AI researchers were working with computers less powerful than my graphing calculator. That’s why a lot of early AI research seems silly, and why promising ideas like NNs were initially abandoned.
I heard an anecdote about one researcher who went from university to university carrying a stack of punch cards, running them on each computer whenever there was spare time. It was something like a simple genetic algorithm that could easily complete in a few seconds on a modern computer, but it took him months or years to get results from it.
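The anecdote doesn’t say what was actually on the cards, but a genetic algorithm of roughly that flavor is tiny. A toy version like this (bit-string objective, population size, and mutation rate all made up for illustration) finishes in well under a second on a modern laptop:

    import random

    random.seed(0)
    TARGET = [1] * 64  # toy objective: evolve a bit string of all 1s

    def fitness(genome):
        return sum(g == t for g, t in zip(genome, TARGET))

    def mutate(genome, rate=0.02):
        # Flip each bit independently with a small probability.
        return [1 - g if random.random() < rate else g for g in genome]

    population = [[random.randint(0, 1) for _ in TARGET] for _ in range(100)]
    for generation in range(300):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == len(TARGET):
            break
        parents = population[:20]  # truncation selection, parents kept (elitism)
        population = parents + [mutate(random.choice(parents)) for _ in range(80)]

    print("generation", generation, "best fitness", fitness(population[0]))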
The pattern is the same across the entire software industry, not just AI research.
Only a small portion of real progress comes from professors and PhDs. Per person they tend to do pretty well in terms of innovation, but it’s hard to beat a million obsessed geeks willing and able to spend every hour of their free time experimenting with something.
The people working in the olden days weren’t just working with slower computers; a lot of the time they were also working with buggy, crappier languages, feature-poor debuggers, and no IDEs.
A comp sci undergrad working with a modern language in a modern IDE with modern debuggers can whip up in hours what it would have taken PhDs weeks to do back in the early days, and it’s not all just hardware.
Don’t get me wrong: hardware helps. Having cycles to burn and so much memory that you don’t have to care about wasting it saves you time too. But you also get a massive feedback loop: the more people there are in your environment doing similar things, the more you can focus on the novel, important parts of your work rather than fucking around trying to find where you set a pointer incorrectly or screwed up a JUMP.
Very few people have access to supercomputers, and those who do aren’t going to spend their supercomputer time going “well, that didn’t work, but what if I tried this slight variation…” a hundred times over.
Everyone has access to desktops, so as soon as something can run on consumer electronics, thousands of people can suddenly spend all night experimenting.
Even if the home experimentation doesn’t yield results itself, you now have a generation of teenagers who’ve spent time thinking about the problem, who have experience thinking in the right terms at a young age, and who are primed to gain a far deeper understanding once they hit college.