Good point: the fact that the model treats tokens it previously produced exactly the same as tokens that were part of the input, and the fact that it doesn't learn across usages, are both very important too.
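To make those two properties concrete, here's a rough toy sketch. It is not a real model: `WEIGHTS`, `next_token`, and `generate` are hypothetical names made up for illustration, and the "model" is just a fixed arithmetic rule over integer tokens. The point is only that each generated token gets appended to the same flat sequence as the prompt, and that nothing persists or changes between calls.

```python
from typing import List

# Hypothetical stand-in for a trained model's parameters.
# They are frozen: nothing in this file ever modifies them.
WEIGHTS = (3, 7)


def next_token(context: List[int]) -> int:
    """Toy next-token rule (an assumption, not a real model).

    It sees only a flat list of tokens. There is no flag saying
    whether a token came from the prompt or from earlier output.
    """
    a, b = WEIGHTS
    return (a * sum(context) + b * len(context)) % 10


def generate(prompt: List[int], n: int) -> List[int]:
    """Autoregressively produce n tokens after the prompt."""
    context = list(prompt)  # copy so the caller's list is untouched
    for _ in range(n):
        # The new token joins the same sequence as the input tokens.
        context.append(next_token(context))
    return context[len(prompt):]


if __name__ == "__main__":
    prompt = [1, 2, 3]

    # Property 1: generating two tokens in one go matches generating one,
    # then pasting it into the prompt as if it were input and continuing.
    first = generate(prompt, 1)
    print(generate(prompt, 2) == first + generate(prompt + first, 1))

    # Property 2: a second call with the same prompt behaves identically,
    # because no state carries over from the first call.
    print(generate(prompt, 5) == generate(prompt, 5))
```

In a real system sampling randomness would muddy the second check, but the underlying idea is the same: the weights don't change between usages unless someone retrains the model.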