There are also some new glitch tokens for GPT-3.5 / GPT-4; my favourite is ” ForCanBeConverted”, although I don’t think the behaviour they produce is as interesting and varied as that of the GPT-3 glitch tokens. The model generally seems to process the token as if it were a specific word that varies depending on the context. For example, with ” ForCanBeConverted”, if you ask for stories, you tend to get a fairly formulaic story with the randomized word inserted into it (e.g. “impossible”, “innovate”, “imaginate”, etc.). I think that might be due to RLHF harming the model’s creativity, though, biasing it towards “inoffensive” stories, which would make access to the base model more appealing.
Another thought that comes to mind: is it possible that the unexplained changes to the GPT-3 model’s output could be due to changes in the underlying hardware or implementation, rather than further training? I’m only thinking this because of the nondeterministic behaviour you get at temperature 0 (especially in the case of glitch tokens, where floating-point rounding could make a big difference in the top logits).
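To make the rounding point concrete, here's a toy illustration (hypothetical numbers, nothing to do with OpenAI's actual stack): floating-point addition is not associative, so the same logit accumulated in a different order (say, a different GPU reduction tree) can differ in its last bits, and when two top logits are nearly tied, that difference alone can flip the token chosen at temperature 0.

```python
# Toy sketch: accumulate the same "logit contributions" in two orders.
contribs = [0.1, 0.2, 0.3]  # pretend contributions to a glitch token's logit

logit_forward = 0.0
for c in contribs:
    logit_forward += c            # accumulates to 0.6000000000000001

logit_reversed = 0.0
for c in reversed(contribs):
    logit_reversed += c           # accumulates to exactly the double 0.6

print(logit_forward == logit_reversed)  # False: same maths, different floats

rival_logit = 0.6                 # a competing token sitting right at the tie


def greedy_pick(logits):
    """Temperature-0 (greedy) decoding: index of the largest logit.

    Python's max returns the first maximal element, so exact ties
    go to the lower index."""
    return max(range(len(logits)), key=lambda i: logits[i])


pick_forward = greedy_pick([rival_logit, logit_forward])    # glitch token (index 1)
pick_reversed = greedy_pick([rival_logit, logit_reversed])  # exact tie, rival (index 0)
print(pick_forward, pick_reversed)  # 1 0
```

So even with deterministic weights and greedy sampling, a change in how the hardware or kernels order the arithmetic could surface as a different top token, which is exactly the kind of case a near-tied glitch-token logit would expose.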
Thanks for sharing ” ForCanBeConverted”. I tested it, and it also produces random output.