To check this, you’d want to look at a model trained with untied embeddings. Sadly, all the ones I’m aware of (Eleuther’s Pythia, and my interpretability-friendly models) were trained on the GPT-NeoX tokenizer or variants, which doesn’t seem to have stupid tokens in the same way.
GPT-J uses the GPT-2 tokenizer and has untied embeddings.