LawrenceC comments on LLM Basics: Embedding Spaces—Transformer Token Vectors Are Not Points in Space

LawrenceC 14 Feb 2023 2:09 UTC
2 points
Yeah, I agree! You 100% should not think of the unembed as looking for “the closest token”. It looks for the token with the largest dot product, which rewards both high cosine similarity and a large norm.
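To make the distinction concrete, here is a minimal toy sketch. The 2-D unembedding rows `W_U` and the residual vector `x` are made-up values chosen for illustration, not taken from any real model. It shows that the token with the largest dot product need not be the nearest token in Euclidean distance:

```python
import numpy as np

# Hypothetical unembedding rows for three tokens (toy values, an assumption).
W_U = np.array([
    [1.0, 0.0],  # token 0: small norm, exactly aligned with x
    [3.0, 0.5],  # token 1: large norm, almost aligned with x
    [0.0, 1.0],  # token 2: orthogonal to x
])
x = np.array([1.1, 0.0])  # hypothetical residual-stream vector

# What the unembed computes: one dot product (logit) per token.
logits = W_U @ x

# What "closest token" would mean: Euclidean distance to each row.
dists = np.linalg.norm(W_U - x, axis=1)

# Decompose each dot product as |w| * |x| * cos(angle between w and x).
row_norms = np.linalg.norm(W_U, axis=1)
cos = logits / (row_norms * np.linalg.norm(x))

by_dot = int(np.argmax(logits))   # token picked by largest dot product
by_dist = int(np.argmin(dists))   # token picked by smallest distance

print("logits:", logits)
print("distances:", dists)
print("cosines:", cos)
print("argmax dot product:", by_dot, "| argmin distance:", by_dist)
```

In this toy setup token 0 is the nearest neighbour of `x` and has the highest cosine similarity, but token 1 gets the largest logit because its norm is about three times larger.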
I suspect the piece would be helpful for people with similar confusions, though I think by default most people already think of features as directions (an incredibly common tacit assumption, made practically everywhere in mech interp work), especially since the embed/unembed are linear functions.