That’s basically what I was alluding to by “brute-forced tried enough possibilities to come up with the answer.” Even if that were the case, the implication is that it is actually constructing a complete multi-token answer in order to “test” that answer against the grammatical and semantic requirements. If it truly were re-computing the “correct” next token on each successive iteration, I don’t see how it could seamlessly merge its individually-generated tokens with the given sentence-end text.