I’m wondering what “doom” is supposed to mean here. It seems a bit odd to think that longer context windows will make things worse. More likely, LeCun meant that things won’t improve enough? (Problems we see now don’t get fixed with longer context windows.)
So then, “doom” is a hyperbolic way of saying that other kinds of machine learning will eventually win, because LLMs don’t improve enough.
Also, there’s an assumption that longer sequences are exponentially more complicated and I don’t think that’s true for human-generated text? As documents grow longer, they do get more complex, but they tend to become more modular, where each section depends less on what comes before it. If long-range dependencies grew exponentially then we wouldn’t understand them or be able to write them.
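For reference, the compounding-error argument usually attributed to LeCun can be sketched as follows. The assumption (which is exactly what the modularity point above disputes) is that each generated token has some independent chance of being wrong, so the probability of an entirely correct n-token output decays exponentially. The error rates used here are purely illustrative, not measured values:

```python
# Sketch of the compounding-error argument often attributed to LeCun:
# if each token has an independent per-token error probability eps,
# the chance that an n-token output is entirely correct is (1 - eps)^n,
# which shrinks exponentially in n. The eps values below are illustrative.

def p_all_correct(eps: float, n: int) -> float:
    """Probability all n tokens are correct, assuming independent errors."""
    return (1.0 - eps) ** n

for eps in (0.001, 0.01):
    for n in (100, 1_000, 10_000):
        print(f"eps={eps}, n={n}: P(all correct) = {p_all_correct(eps, n):.6f}")
```

The independence assumption is doing all the work here: if, as argued above, human-generated documents become more modular as they grow, then an error in one section doesn’t necessarily corrupt the next, and correctness doesn’t decay like this simple product suggests.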