My sense is that AGZ is a high-profile example of how fast neural nets (which, mathematically, have existed in essentially modern form since the 60s) can make progress. The same techniques have had a huge impact throughout AI research, and I think counting AGZ as a single data point substantially undercounts the evidence. For example, the systems that lead image recognition benchmarks use the same technology, as do Atari-playing AIs.
That could represent one step in a general trend of subsuming many detailed systems into fewer simpler systems. Or, it could represent a technology being newly viable, and the simplest applications of it being explored first.
For the former to be the case, this simplification process would need to keep happening at higher and higher abstraction levels. We’d explore a few variations on an AI architecture, then get a new insight that eclipses all these variations, taking the part we were tweaking and turning it into just another parameter for the system to learn by itself. Then we’d try some variations of this new simpler architecture, until we discover an insight that eclipses all these variations, etc. In this way, our AI systems would become increasingly general without any increase in complexity.
Without this kind of continuing trend, I’d expect increasing capability in NN-based software will have to be achieved in the same way as in regular old software: integrating more subsystems, covering more edge cases, generally increasing complexity and detail.
I think there are some strong points supporting the latter possibility (a newly viable technology whose simplest applications are being explored first), like the lack of similarly high-profile success in unsupervised learning and the reliance on massive amounts of hardware and data that were unavailable in the past.
That said, I think someone five years ago might have said “well, we’ve had success with supervised learning but less with unsupervised and reinforcement learning.” (I’m not certain about this, though.)
I guess in my model AGZ is more like a third or fourth data point than a first data point: still not conclusive, and with plenty of room to fizzle out, but starting to make me feel like it’s actually part of a pattern.