On the contrary, you mainly seem to be claiming that thinking of LLMs as working one token at a time is misleading, but I’m not sure I understand any examples of misleading conclusions that you think people draw from it. Where do you think people go wrong?
Over there in another part of the universe there are people who are yelling that LLMs are “stochastic parrots.” Their intention is to discredit LLMs as dangerous, evil devices. Not too far away from those folks are those saying it’s “autocomplete on steroids.” That’s only marginally better.
Saying LLMs are “next word predictors” feeds into that. Now, I’m talking about rhetoric here, not intellectual substance. But rhetoric matters. There needs to be a better way of talking about these devices for a general audience.
I gave one example of the “work” this does: that GPT performs better when prompted to reason first rather than state the answer first. Another example is: https://www.lesswrong.com/posts/bwyKCQD7PFWKhELMr/by-default-gpts-think-in-plain-sight
Oh, thanks for the link. It looks interesting.