I think this is a key question. I think the answer for transformer LLMs without an external memory store is currently ‘No, not arbitrarily long computations’. I have looked into this in my research and found some labs doing work that would enable this. So it’s more a question of when the big labs will decide to integrate these ideas developed by academic groups into SoTA models, not a question of whether it’s possible. There are many novel capabilities like this that have been demonstrated to be possible but not yet integrated, even when trying to rule out those that might get blocked by poor scaling/parallelizability.
So the remaining hurdles seem to be more about engineering solutions to integration challenges than about innovation and proof-of-concept.
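To make the distinction concrete, here is a minimal, hypothetical sketch of the kind of external-memory loop such work points at: the model writes intermediate results to a store outside its context window, so the length of the computation is no longer bounded by context size. The `call_llm` function and the dict-backed store are placeholders I’ve introduced purely for illustration, not any particular lab’s method or API.

```python
# Hypothetical sketch of an external-memory loop around an LLM.
# `call_llm` is a placeholder for any chat-completion API; the dict-backed
# store stands in for whatever external memory (file, database, vector store)
# a real system would use. Illustration only, not a specific lab's approach.

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to a model and return its text reply."""
    raise NotImplementedError

def run_with_external_memory(task: str, max_steps: int = 100) -> str:
    memory: dict[str, str] = {}  # external store, not bounded by the context window
    state = f"TASK: {task}"
    for _ in range(max_steps):
        # Only the current state (not the full history) goes into the prompt,
        # so the context stays small no matter how long the computation runs.
        reply = call_llm(
            "You may issue one command per turn:\n"
            "  WRITE <key> <value>   store an intermediate result\n"
            "  READ <key>            retrieve a stored result\n"
            "  DONE <answer>         finish\n"
            f"Current state:\n{state}"
        )
        cmd, _, rest = reply.partition(" ")
        if cmd == "WRITE":
            key, _, value = rest.partition(" ")
            memory[key] = value
            state = f"Wrote {key}."
        elif cmd == "READ":
            state = f"{rest} = {memory.get(rest, '<missing>')}"
        elif cmd == "DONE":
            return rest
        else:
            state = f"Unrecognized command: {reply}"
    return "<step limit reached>"
```

Nothing in the loop is sophisticated; the point is that the intermediate state lives outside the model, which is exactly the piece a plain transformer without an external memory store lacks.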