I wanted to find out if there were other clusters of tokens which generated similarly anomalous behavior so I wrote a script that took a list of tokens, sent them one at a time to text-curie-001 via a standardized prompt, and recorded everyrything that GPT failed to repeat on its first try. Here is an anomalous token that is not in the authors’ original cluster: “herical”.
There’s also “oreAnd” which, while similar, is not technically in the original cluster.
I wanted to find out if there were other clusters of tokens which generated similarly anomalous behavior so I wrote a script that took a list of tokens, sent them one at a time to
text-curie-001
via a standardized prompt, and recorded everyrything that GPT failed to repeat on its first try. Here is an anomalous token that is not in the authors’ original cluster: “herical”.There’s also “oreAnd” which, while similar, is not technically in the original cluster.
Did you do it with all tokens? If so, do you have a complete list of anomalous results posted anywhere?
I did it with a set of 50,000+ tokens. It was not the same set used by Jessica Rumbelow and mwatkins.
I haven’t posted it yet. It’s a noisy dataset. I don’t want to post the list without caveats and context.
Fair enough!