Yeah, definitely not the least likely trajectories, instead it’s just the next token with the smallest probability. I was thinking of doing beam search with minimizing logits, but that looked difficult to implement. Still surprised that it produces things like prü|stor|oire| which are pretty pronounceable.
Yeah, definitely not the least likely trajectories, instead it’s just the next token with the smallest probability. I was thinking of doing beam search with minimizing logits, but that looked difficult to implement. Still surprised that it produces things like
prü|stor|oire|
which are pretty pronounceable.