I suspect I’ve made a mistake, because LLaMA outputs similar tokens in sequence, e.g. the Cyrillic tokens in succession, or the repeated “partiellement”. Overall the text looks too coherent: not enough weird Unicode symbols and encoding errors.
A trajectory produced by sampling the least likely token at each step is almost certainly not the least likely trajectory, and your experiment may suggest it’s not even among the least likely trajectories.
Yeah, it’s definitely not the least likely trajectory; it’s just greedily picking the next token with the smallest probability at each step. I was thinking of doing beam search that minimizes the logits, but that looked difficult to implement. Still surprised that it produces things like prü|stor|oire|, which are pretty pronounceable.
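For what it’s worth, the two approaches discussed above can both be sketched compactly. Below is a minimal illustration, using a toy bigram logit table as a hypothetical stand-in for the model (a real experiment would query LLaMA’s next-token logits instead; all names here are made up): `greedy_least_likely` picks the single lowest-probability next token at each step, while `beam_search_min` is an ordinary beam search with the score ordering flipped, so it approximately minimizes the cumulative log-probability of the whole sequence rather than maximizing it.

```python
import numpy as np

# Hypothetical stand-in for a language model: a fixed table of
# next-token logits conditioned only on the previous token (a bigram
# model). A real experiment would call the actual model here.
VOCAB_SIZE = 5
rng = np.random.default_rng(0)
LOGITS = rng.normal(size=(VOCAB_SIZE, VOCAB_SIZE))


def log_softmax(x):
    # Numerically stable log-softmax over a 1-D array of logits.
    x = x - x.max()
    return x - np.log(np.exp(x).sum())


def greedy_least_likely(start, steps):
    """At each step take the token with the smallest logit (equivalently,
    the smallest probability, since softmax is monotonic). This is the
    per-token scheme from the experiment; it does NOT minimize the
    probability of the sequence as a whole."""
    seq = [start]
    for _ in range(steps):
        seq.append(int(np.argmin(LOGITS[seq[-1]])))
    return seq


def beam_search_min(start, steps, beam_width=3):
    """Beam search that minimizes cumulative log-probability, keeping the
    beam_width candidates with the *lowest* total score at each step.
    Returns the (sequence, total_log_prob) pair with the smallest score."""
    beams = [([start], 0.0)]
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            logp = log_softmax(LOGITS[seq[-1]])
            for tok in range(VOCAB_SIZE):
                candidates.append((seq + [tok], score + logp[tok]))
        candidates.sort(key=lambda c: c[1])  # smallest total log-prob first
        beams = candidates[:beam_width]
    return beams[0]
```

The only change relative to a standard beam search is the sort direction, which is why swapping a model call in for the toy table should make the "least likely trajectory" variant no harder to implement than the greedy one.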