In GPT2-small and GPT-J they’re actually smaller than average, as they tend to cluster close to the centroid (which isn’t too far from the origin). In GPT2-xl they do tend to be larger than average. But in all of these models, they’re found distributed across the full range of distances-from-centroid.
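For GPT2-small these distances are easy to compute directly: take the mean of the token embedding matrix as the centroid and measure each token’s Euclidean distance from it. A minimal sketch, assuming the Hugging Face `transformers` library (where `"gpt2"` names the small model):

```python
# Sketch: distance-from-centroid for GPT2-small token embeddings.
# Assumes the Hugging Face `transformers` library is installed.
import torch
from transformers import GPT2Model

model = GPT2Model.from_pretrained("gpt2")
emb = model.wte.weight.detach()          # token embedding matrix, shape (50257, 768)

centroid = emb.mean(dim=0)               # centroid of all token embeddings
dists = (emb - centroid).norm(dim=1)     # Euclidean distance of each token from centroid

print(dists.min().item(), dists.mean().item(), dists.max().item())
```

Printing the minimum, mean and maximum shows the spread of distances-from-centroid across the vocabulary; individual tokens can be looked up by their token id as `dists[token_id]`.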
At this point we don’t know where the token embeddings lie relative to the centroid in GPT-3 embedding spaces, as that data is not yet publicly available. And all the bizarre behaviour we’ve been documenting has been in GPT-3 models (despite the fact that the “triggering” tokens were discovered in GPT-2/J embedding spaces).
OpenAI is still claiming online that all of their token embeddings are normalised to norm 1, but this is simply untrue, as can be easily demonstrated with a few lines of PyTorch.
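Those few lines might look like this, again assuming the `transformers` library and the GPT2-small weights:

```python
# Sketch: checking whether GPT-2 token embeddings all have norm 1.
# Assumes the Hugging Face `transformers` library is installed.
import torch
from transformers import GPT2Model

model = GPT2Model.from_pretrained("gpt2")
emb = model.wte.weight.detach()   # token embedding matrix, shape (50257, 768)
norms = emb.norm(dim=1)           # L2 norm of each token embedding

print(norms.min().item(), norms.max().item())
print(torch.allclose(norms, torch.ones_like(norms)))
```

The norms vary considerably from token to token, so the final line prints `False` rather than confirming any norm-1 normalisation.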