This post reminds me of the Word2vec algebra.
E.g. “kitten” − “cat” + “dog” ≈ “puppy”
I expect that this will be true for LLM token embeddings too. Has anyone checked this?
I expect something similar to hold for internal LLM representations as well, though that might be harder to verify. Then again, maybe not, if you have interpretable SAE vectors?
Case in point: this is a five-year-old t-SNE plot of word vectors on my laptop.
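For the token-embedding version, here's a minimal sketch of how one might check it, assuming the HuggingFace transformers library and the GPT-2 checkpoint (the word choices and the subword-averaging trick are just illustrative, not anything from the post):

```python
# Rough check of "kitten" - "cat" + "dog" ≈ "puppy" on GPT-2's token embedding matrix.
import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
emb = model.wte.weight.detach()  # (vocab_size, d_model) token embedding matrix

def vec(word: str) -> torch.Tensor:
    # GPT-2 uses BPE, so a word may span several tokens; average their embeddings.
    # The leading space matters for GPT-2's tokenizer.
    ids = tokenizer.encode(" " + word)
    return emb[ids].mean(dim=0)

# Form the analogy vector and rank the whole vocabulary by cosine similarity.
query = vec("kitten") - vec("cat") + vec("dog")
sims = torch.nn.functional.cosine_similarity(query.unsqueeze(0), emb)
for idx in sims.topk(10).indices.tolist():
    print(repr(tokenizer.decode([idx])), round(sims[idx].item(), 3))
```

One known caveat from the word2vec literature: the nearest neighbours of the analogy vector are often the input words themselves, so you'd want to exclude those before declaring success.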