Bill Benzon comments on The idea that ChatGPT is simply “predicting” the next word is, at best, misleading

Bill Benzon 21 Feb 2023 4:14 UTC
2 points
However, this future structure is not explicitly modelled anywhere, as far as I know. It’s possible that some model might have a “writing a fairy tale” neuron in there somewhere, linked to others that represent describable aspects of the story so far and others yet to come, and which increases the weighting of the token “ time” after “Once upon a”. I doubt there’s anything so directly interpretable as that, but I think it’s pretty certain that there are some structures in activations representing clusters of continuations past the current generation token.
More like a fairy tale region than a neuron. And once the system enters that region it stays there until the story is done.
Should we call those structures “plans” or not?
In the context of this discussion, I can live with that.