To be clear you mean you think that progress will slow down, speed up, or just be anywhere but constant?
Anywhere but constant, i.e. currently it seems to be slowing down with a focus on productionizing what exists.
Current avenues of research seem bogged down by an inability to actually establish “what works”, a problem to which I see no clear solution: interesting problems can no longer be scored with a simple number representing “accuracy” or “performance”.
The one solution I’ve seen to this is hand-waving around RL-like systems that are left to train online, or on an internet-like simulation coupled with controlled interactions with reality, ideally simulated whenever possible. E.g. see Gwern’s recent clippy post for a tl;dr on that. In practice that seems to have severe limitations to me, but who knows, I’m not smart enough to rule it out entirely.
The one thing I’m sure won’t happen is the last decade again, which seems to be mainly an artifact of the mutual realization that ML algorithms can run on GPUs and GPUs can be designed with ML in mind, culminating in things like the transformer, which at the end of the day can be thought of as an “unrolled” recurrent/Hopfield NN that lacks some abilities, but is able to take advantage of massive parallelization and memory capacity via the transpose + multiplication. But my read is that this success has led to the bottlenecks now appearing in areas that are not compute.
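For concreteness, here’s a toy sketch of what I mean by “transpose + multiplication” (the names and shapes are just illustrative, not anything from a real codebase): scaled dot-product attention compares every position against every other in a single Q·Kᵀ matmul, so the whole sequence is processed at once rather than step by step as in a recurrent net.

```python
# Minimal sketch of scaled dot-product attention with plain numpy.
# The point: one transpose + matmul scores all positions at once,
# instead of iterating through the sequence like an RNN would.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: arrays of shape (seq_len, d). Returns (seq_len, d)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)            # the transpose + multiplication
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V                        # weighted sum, again one matmul

# Toy usage: 5 positions, 8-dimensional embeddings, all handled in parallel.
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (5, 8)
```

Both heavy steps are dense matrix products, which is exactly the workload GPUs are built for, hence the mutual GPU/ML feedback loop I’m describing.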
That makes sense, thanks for the clarification!
My (totally outside-view) sense is that there is still a tremendous amount that can be done in terms of designing specialty hardware optimized for specific architectures, but I don’t know if that’s even doable with modern fab technology.