Thanks, I appreciate the suggestion. There’s definitely a lot of room to go into more depth, and I’ll check that out.
Thanks, I’ll rephrase that part for clarity
A Search for More ChatGPT / GPT-3.5 / GPT-4 “Unspeakable” Glitch Tokens
In case anyone is interested or finds them useful, I did a bit more of a search for current ChatGPT glitch tokens, from tokens 86000 to 96000, and found quite a few more; the ones listed below were the most extreme. I excluded tokens that just appeared to be “word completions”, as those are quite common. Note the three in a row:
Token 89473: “useRalativeImagePath”
Token 89472: “useRalative”
Token 89471: “useRal”
Token 87914: “ YYSTACK”
Token 87551: “CppGuid”
Token 86415: “BundleOrNil”
Token 86393: “ PropelException”
Token 93905: “ QtAws”
Token 93304: “VertexUvs”
Token 92103: “NavigatorMove”
Token 94823: “textTheme”
Token 94652: “BracketAccess”
Token 95812: “ RTCK” (initial character is a tab)
Token 97736: “ RTCT” (initial character is a tab)
Token 97784: “ JSBracketAccess”

Some of the more interesting responses I got during the search:
And I even got some spontaneous humour from ChatGPT:
Also worth noting that after testing several of these, they do seem to work on Bing too, which makes sense given that Bing Chat reportedly runs on the same underlying GPT-4 model and tokenizer.
The tokens themselves are public, but not the actual embedding matrix/vectors (as far as I know).
Just out of curiosity I searched manually through tokens 96000–97999. I did find quite a few “word suffix” tokens, e.g. “oralType”, which ChatGPT 3.5 always completes to “TemporalType”. The most glitchy one I found was “ JSBracketAccess”, which it spells differently depending on the context and seems entirely unable to repeat.
(The method I used to find them was to generate a “Repeat after me:” prompt containing ~20 candidate tokens; if a glitch token is present you may get a blank or otherwise unusual response from ChatGPT.)
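For anyone who wants to try this themselves, here is a minimal sketch of the prompt-generation step using tiktoken’s cl100k_base encoding (the ID range, batch size, and function name are my own illustrative choices, not part of the original method):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by ChatGPT / GPT-4

def repeat_prompts(start_id: int, end_id: int, batch_size: int = 20):
    """Yield 'Repeat after me:' prompts covering token IDs [start_id, end_id)."""
    ids = list(range(start_id, end_id))
    for i in range(0, len(ids), batch_size):
        batch = ids[i:i + batch_size]
        # Decode each ID on its own so adjacent tokens can't merge into one string.
        pieces = [repr(enc.decode([t])) for t in batch]
        yield batch, "Repeat after me: " + " ".join(pieces)

for batch, prompt in repeat_prompts(86000, 86100):
    print(prompt)  # paste into ChatGPT; a blank or garbled reply flags the batch
```

Once a batch produces an unusual response, you can bisect it (split into two ~10-token prompts and repeat) to isolate the individual glitch token.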
I’ve also found generating exercises from text to be particularly useful, even just to make you think more about what you’re reading. It has also been useful when learning new tools, e.g. generating a load of einsum / einops exercises, which didn’t even require pasting in any additional text (a sketch of what one such exercise looks like is below). Using it to summarize code sounds interesting and not something I’ve tried before.
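For a concrete picture, here is the kind of small, self-checking exercise I have in mind; the specific task and shapes are just an illustrative example, not one of the actual generated exercises:

```python
import numpy as np
from einops import rearrange

# Exercise: convert a batch of images from channels-last (NHWC)
# to channels-first (NCHW) layout with a single rearrange call.
images = np.zeros((8, 32, 32, 3))  # (batch, height, width, channels)

# Solution:
out = rearrange(images, "b h w c -> b c h w")
assert out.shape == (8, 3, 32, 32)  # the assert makes the exercise self-checking
```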
I wonder if something like this could somehow be combined with Anki to generate randomized questions? One of the issues I’ve had when using spaced repetition for learning coding is that I often end up remembering the exact answer to questions, when really what I want to do is learn when and where to use tools to solve varied problems. I wonder if using LLMs to randomize the questions could mitigate that a bit?
For what it’s worth, most modern fusion bombs actually generate most (e.g. 80%+) of their “yield” from fission—the fusion stage is surrounded by a layer of uranium which is bombarded by neutrons produced in the fusion reaction, causing fission in the uranium and magnifying the yield. So they are pretty dirty weapons. They are at least smaller than the weapons from the 50s and 60s though.
Since glitch tokens seem to be caused by certain sequences of text appearing much more often in the tokenizer’s training corpus than in the LLM’s training data, something like that might work. But there also seem to be “glitch phrases” or “unspeakable phrases”: sequences of tokens with extremely low probability under the model, which could produce some strange behaviour too. It seems at least plausible to me that these kinds of phrases could still be generated even if countermeasures were taken to prevent glitch tokens from being created. Glitch phrases are a bit more difficult to find without access to the model, though.
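You can’t do this for ChatGPT directly, but with an open model the search is straightforward to sketch: score candidate phrases by the total log-probability the model assigns them, and look at the extreme low end. A minimal version using Hugging Face transformers, with GPT-2 as a stand-in model (the model choice and example phrases are assumptions on my part):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_logprob(text: str) -> float:
    """Total log-probability the model assigns to `text`."""
    ids = tok(text, return_tensors="pt").input_ids   # (1, seq_len)
    with torch.no_grad():
        logits = model(ids).logits                   # (1, seq_len, vocab)
    # log P(token_i | tokens_<i): logits at position i-1 predict token i
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
    picked = logprobs.gather(2, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return picked.sum().item()

# The lower the score, the closer the phrase is to "unspeakable" for the model.
print(sequence_logprob("the cat sat on the mat"))
print(sequence_logprob("mat the sat cat on the"))
```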