Note that there are glitch tokens in GPT3.5 and GPT4! The tokenizer was changed to a 100k vocabulary (rather than 50k) so all of the tokens are different, but they are there. Try ” ForCanBeConverted” as an example.
If I remember correctly, “davidjl” is the only old glitch token that carries over to the new tokenizer.
Apart from that, some lists have been created and there do exist a good selection.
Good find. What I find fascinating is the fairly consistent responses using certain tokens, and the lack of consistent response using other tokens. I observe that in a Bayesian network, the lack of consistent response would suggest that the network was uncertain, but consistency would indicate certainty. It makes me very curious how such ideas apply to the concept of Glitch tokens and the cause of the variability in response consistency.
Note that there are glitch tokens in GPT3.5 and GPT4! The tokenizer was changed to a 100k vocabulary (rather than 50k) so all of the tokens are different, but they are there. Try ” ForCanBeConverted” as an example.
If I remember correctly, “davidjl” is the only old glitch token that carries over to the new tokenizer.
Apart from that, some lists have been created and there do exist a good selection.
Good find. What I find fascinating is the fairly consistent responses using certain tokens, and the lack of consistent response using other tokens. I observe that in a Bayesian network, the lack of consistent response would suggest that the network was uncertain, but consistency would indicate certainty. It makes me very curious how such ideas apply to the concept of Glitch tokens and the cause of the variability in response consistency.