anon3242 comments on SmartyHeaderCode: anomalous tokens for GPT3.5 and GPT-4

anon3242 26 Apr 2023 17:28 UTC
1 point
0
GPT-3.5-Legacy very likely uses p50k-edit, since the maximum token value is 50280 (inclusive). During my tests, the responses were sometimes not very “glitchy”, but the generated title was. Probably worth further investigation. I have been thinking that the abrupt termination of generation when the model tries to say an “unspeakable” token may result from the probabilities of the glitch token and its neighbors being too low, which causes something like <|im_end|> or <|endoftext|> to eventually be emitted instead. If we can suppress its tendency to end the generation, maybe we won’t have “unspeakable” tokens anymore.
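
To make the hypothesis concrete, here is a toy sketch in Python. It does not call any real model: the logit values are invented, and the helper names (`softmax`, `suppress`, `greedy`) are hypothetical. It only shows that if the glitch token and its neighbors score low, an end-of-sequence token can win by default, and that a large negative bias on the end tokens lets the next candidate through. With a real API, something like the OpenAI `logit_bias` parameter might be one way to try this, though I have not checked whether it applies to these special tokens.

```python
import math

# End-of-generation markers mentioned in the comment above.
END_TOKENS = {"<|im_end|>", "<|endoftext|>"}


def softmax(logits):
    """Turn a {token: logit} dict into a {token: probability} dict."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def suppress(logits, banned, bias=-100.0):
    """Return a copy of the logits with a large negative bias on banned tokens."""
    return {tok: (v + bias if tok in banned else v) for tok, v in logits.items()}


def greedy(logits):
    """Pick the single most probable token."""
    probs = softmax(logits)
    return max(probs, key=probs.get)


# Invented numbers: the glitch token and its neighbor both score low,
# so an end token ends up on top.
toy_logits = {
    "<|endoftext|>": 2.0,
    "<|im_end|>": 1.5,
    " glitch": 0.5,
    " neighbor": 0.3,
}

print(greedy(toy_logits))                        # end token wins
print(greedy(suppress(toy_logits, END_TOKENS)))  # a non-end token wins
```

Whether the token that surfaces after suppression would be the glitch token itself or just one of its neighbors is exactly the open question; the sketch only shows the mechanism.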