Binary encoding as a simple explicit construction for superposition
Superposition is the possibility of storing more than $n$ features in an $n$-dimensional vector space, by letting the features be slightly correlated with each other. It turns out that one can store exponentially many features in a given vector space. The ability to store that many features is sometimes explained using the Johnson–Lindenstrauss lemma, but the lemma seems counterintuitive, so I came up with an alternative approach that I found simpler:
Suppose you have a set $F$ with $2^d$ elements and you want to embed it in a $d$-dimensional vector space. Label the elements with integers $i$ so that $F = \{x_0, x_1, \ldots, x_{2^d - 1}\}$. Each integer $i$ can be written as a string of $d$ bits, $b_{d-1} \ldots b_2 b_1 b_0$. To improve symmetry, we translate each bit $b$ from being $0$ or $1$ to being $-1$ or $1$ by taking $2b - 1$. Join all the digits into a vector and normalize to get the embedding:
$$e(x_i) = \frac{1}{\sqrt{d}} \left[\, 2b_{d-1} - 1,\ \ldots,\ 2b_2 - 1,\ 2b_1 - 1,\ 2b_0 - 1 \,\right]^\top$$
That is, we map each bit to a separate dimension, with a $b = 1$ bit mapping to a positive value and a $b = 0$ bit mapping to a negative value, and scale by $\frac{1}{\sqrt{d}}$ so that the embedding vector has unit length.
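A minimal NumPy sketch of this construction (the function name `embed` and the choice of reading the bits from least to most significant are mine, purely for illustration):

```python
import numpy as np

def embed(i: int, d: int) -> np.ndarray:
    """Embed element x_i of F (|F| = 2**d) as a unit vector in R^d."""
    # Extract the bits b_0, b_1, ..., b_{d-1} of the label i.
    bits = np.array([(i >> k) & 1 for k in range(d)])
    # Map each bit b to 2b - 1 (0 -> -1, 1 -> +1) and scale by 1/sqrt(d).
    return (2 * bits - 1) / np.sqrt(d)

# Example: d = 8 dimensions give room for 2**8 = 256 labelled features.
v = embed(5, 8)
print(np.isclose(np.linalg.norm(v), 1.0))  # True: every entry is +-1/sqrt(8)
```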
If we pick two random elements $x_a$ and $x_b$ of $F$, then an elementary argument shows that their dot product is well-approximated by a normal distribution $\mathcal{N}(0, \tfrac{1}{\sqrt{d}})$, where the second parameter is the standard deviation: each of the $d$ coordinates contributes $+\tfrac{1}{d}$ or $-\tfrac{1}{d}$ with equal probability, depending on whether the corresponding bits of $a$ and $b$ agree, so the sum has variance $\tfrac{1}{d}$ and the central limit theorem gives the normal approximation.
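A quick empirical check of this claim, reusing the `embed` sketch above (the dimension and sample count below are arbitrary choices):

```python
import numpy as np

d = 32
rng = np.random.default_rng(0)

# Dot products between embeddings of pairs of uniformly random elements of F.
n_pairs = 50_000
a = rng.integers(0, 2**d, size=n_pairs)
b = rng.integers(0, 2**d, size=n_pairs)
dots = np.array([embed(int(i), d) @ embed(int(j), d) for i, j in zip(a, b)])

print(dots.mean())                 # close to 0
print(dots.std(), 1 / np.sqrt(d))  # both close to 1/sqrt(32) ~= 0.177
```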
In some ways this isn't quite as good as the Johnson–Lindenstrauss lemma, since you could in principle be unlucky and get two elements that accidentally have a high similarity. After all, for a given element $x_i$, there are $d$ elements $x_{i \oplus 2^k}$ whose labels differ from $i$ by a single bit flip, and each of them has a dot product of $\tfrac{d-2}{d}$ with $e(x_i)$. However, it is straightforward to reduce the noise: just concatenate multiple embeddings based on different labelings. If instead of using $d$ dimensions you use $Rd$ dimensions, then you can pump the noise down to $\mathcal{N}(0, \tfrac{1}{\sqrt{Rd}})$.
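One way to sketch the concatenation trick is to draw $R$ independent random relabelings (permutations) of $F$ and stack the corresponding embeddings; the helper `embed_r` below is my own illustration of this, reusing `embed` from above and keeping $d$ small so the explicit permutations stay cheap:

```python
import numpy as np

def embed_r(i, d, labelings):
    """Concatenate one binary embedding of x_i per relabeling of F.

    `labelings` is a list of R permutations of range(2**d); the result has
    R*d entries of magnitude 1/sqrt(R*d) and unit length overall.
    """
    R = len(labelings)
    blocks = [embed(int(perm[i]), d) for perm in labelings]
    return np.concatenate(blocks) / np.sqrt(R)

d, R = 16, 4
rng = np.random.default_rng(1)
labelings = [rng.permutation(2**d) for _ in range(R)]

# Dot products between concatenated embeddings of random pairs of elements.
pairs = rng.integers(0, 2**d, size=(20_000, 2))
dots = np.array([embed_r(i, d, labelings) @ embed_r(j, d, labelings)
                 for i, j in pairs])
print(dots.std(), 1 / np.sqrt(R * d))  # both close to 1/sqrt(64) = 0.125
```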