Speculative guess about the semantic richness: embeddings at distances of roughly 5-10 are typical of concepts that are usually represented by multi-token strings. E.g. “spotted salamander” is 5 tokens.

If one manufactured extreme embeddings at a distance of 5-10, would they decode to completions of tokens that implied a long phrase beforehand?
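The experiment hinted at above could be prototyped roughly as follows. This is a toy sketch with a random matrix standing in for a real model's embedding/unembedding weights; the helper names (`manufacture_at_distance`, `decode`) and the choice of dot-product readout are assumptions, not anything from the original note. With a real model one would substitute the LM head weights for `E` and inspect which tokens the manufactured vector prefers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a model's unembedding matrix: 1000 "tokens" in 64 dims.
# In a real run, E would be the language model's embedding / LM-head weights.
vocab, dim = 1000, 64
E = rng.normal(size=(vocab, dim))

centroid = E.mean(axis=0)

def manufacture_at_distance(direction, distance):
    """Place a vector at an exact Euclidean distance from the centroid,
    along a chosen direction -- a 'manufactured extreme embedding'."""
    unit = direction / np.linalg.norm(direction)
    return centroid + distance * unit

def decode(vec, k=5):
    """Read out the k nearest tokens by dot product (logit-lens style)."""
    logits = E @ vec
    return np.argsort(logits)[::-1][:k]

# Push the same direction out to distances 1, 5, and 10 and see
# whether the decoded token set shifts as the vector becomes extreme.
direction = rng.normal(size=dim)
for d in (1.0, 5.0, 10.0):
    v = manufacture_at_distance(direction, d)
    print(d, decode(v))
```

The interesting question would then be whether, in a real model, the tokens preferred at distance 5-10 are mid-phrase continuations (implying a long preceding string) rather than phrase-initial tokens.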