Others have suggested that the vagueness of the definitions at small and large distances from the centroid is a side effect of layernorm (although you’ve given the most detailed account of how that might work). This seemed plausible at the time, but less so now that I’ve found this:
The prompt “A typical definition of '' would be '”, where no customised embedding is involved (we’re just eliciting a definition of the null string), gives “A person who is a member of a group.” at temperature 0. I’ve also had confirmation from someone with GPT-4 base model access that it does exactly the same thing, so I’d expect this to hold across all GPT models (a shame GPT-3 is no longer available to test this).
Base GPT-4 is also apparently returning (at slightly higher temperatures) a lot of the other common outputs, about people who aren’t members of the clergy or of particular religious groups, or about small round flat things, suggesting that this phenomenon is far weirder and more universal than I’d initially imagined.
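For anyone who wants to try the null-string prompt themselves, here is a rough sketch of how it could be run with greedy decoding (the equivalent of temp 0) using Hugging Face transformers. The function names, the straight single quotes in the prompt, the `EleutherAI/gpt-j-6B` checkpoint name and the 30-token limit are my assumptions for illustration, not a record of the exact setup used for the results above.

```python
def build_definition_prompt(term: str = "") -> str:
    """Build the definition-eliciting prompt.

    An empty `term` gives the null-string case. The exact quote
    characters are an assumption (straight single quotes).
    """
    return f"A typical definition of '{term}' would be '"


def greedy_definition(term: str = "",
                      model_name: str = "EleutherAI/gpt-j-6B",
                      max_new_tokens: int = 30) -> str:
    """Return the greedy (temperature-0 equivalent) continuation of the prompt.

    Hypothetical helper: model name and token limit are illustrative choices.
    Loading GPT-J needs a lot of memory; any causal LM checkpoint can be
    substituted via `model_name`.
    """
    # Imported here so the prompt builder works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    prompt = build_definition_prompt(term)
    inputs = tokenizer(prompt, return_tensors="pt")

    # do_sample=False means greedy decoding: always take the most probable token.
    output_ids = model.generate(
        **inputs,
        do_sample=False,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,
    )

    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(repr(build_definition_prompt()))
    # Uncomment to actually query the model (downloads the checkpoint):
    # print(greedy_definition())
```

Calling `greedy_definition()` with the default empty `term` is the null-string case; passing `do_sample=True` with a small `temperature` to `generate` would be one way to look at the slightly-higher-temperature outputs.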
Here’s the upper section (most probable branches) of GPT-J’s definition tree for the null string: