Here’s a fun thing I noticed:
There are 16 boolean functions of two variables. Now consider an embedding that maps each of the four input pairs {(A=true, B=true), (A=true, B=false), …} to a point in 2d space. For any such embedding, at most 14 of the 16 functions will be representable with a linear decision boundary: four points in general position in the plane admit exactly 14 linearly separable labelings, and degenerate (collinear) placements admit fewer.
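To make the 14-of-16 claim concrete, here's a minimal sketch (my own setup, not from the post) that brute-forces all 16 labelings of the four embedded points and tests each one for linear separability exactly, using a small feasibility LP via scipy's linprog:

```python
from itertools import product
from scipy.optimize import linprog

def separable(points, labels):
    """True iff some line puts the +1-labeled points strictly on one side."""
    # Variables w1, w2, b; the constraint -s*(w1*x + w2*y + b) <= -1
    # enforces s*(w.p + b) >= 1, i.e. strict separation with unit margin.
    A_ub = [[-s * x, -s * y, -s] for (x, y), s in zip(points, labels)]
    res = linprog(c=[0, 0, 0], A_ub=A_ub, b_ub=[-1.0] * len(points),
                  bounds=[(None, None)] * 3)
    return res.status == 0  # 0 = a feasible separator was found

default = [(0, 0), (0, 1), (1, 0), (1, 1)]  # x=A, y=B, in order FF FT TF TT
lost = [ls for ls in product([-1, 1], repeat=4) if not separable(default, ls)]
print(16 - len(lost), "of 16 representable; excluded:", lost)
# -> 14 of 16; the excluded labelings (-1, 1, 1, -1) and (1, -1, -1, 1)
#    are exactly xor and xnor.
```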
For the “default” embedding (x=A, y=B), xor and its complement are the two excluded functions. If we rearrange the points so that xor is linearly represented, we always lose some other function (and its complement). In fact, there are 7 meaningfully distinct collinearity-free embeddings, each of which excludes a different pair of functions: 3 convex-position arrangements, distinguished by which inputs sit on a diagonal (each excludes a “two vs. two” pair), and 4 arrangements with one point inside the triangle of the other three (each excludes that point’s “one vs. rest” pair).[1]
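Here's a follow-up sketch, reusing separable() from above (the coordinates are my own illustrative picks): one representative of each of the seven order types, and the single complementary pair each one excludes:

```python
from itertools import product  # separable() as defined in the sketch above

inputs = ["FF", "FT", "TF", "TT"]            # (A, B) value pairs, in order
square = [(0, 0), (1, 0), (1, 1), (0, 1)]    # corners in cyclic order
tri, inside = [(0, 0), (4, 0), (2, 3)], (2, 1)

embeddings = []
for p in (1, 2, 3):                          # which input is diagonal to FF
    rest = [i for i in (1, 2, 3) if i != p]
    spot = {0: square[0], p: square[2], rest[0]: square[1], rest[1]: square[3]}
    embeddings.append((f"{inputs[p]} diagonal to FF", [spot[i] for i in range(4)]))
for inner in range(4):                       # which input sits inside the hull
    outer = [i for i in range(4) if i != inner]
    spot = dict(zip(outer, tri))
    spot[inner] = inside
    embeddings.append((f"{inputs[inner]} inside", [spot[i] for i in range(4)]))

for name, pts in embeddings:                 # 7 embeddings, 7 distinct losses
    lost = [ls for ls in product([-1, 1], repeat=4) if not separable(pts, ls)]
    print(f"{name}: excludes {lost}")
```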
I wonder how this situation scales to higher dimensions and variable counts. It would also make sense to consider sparse features (which allow superposition to get good average performance).
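As a starting point for the scaling question (my own aside, not the author's analysis): Cover's function-counting theorem says that for N points in general position in d dimensions, exactly 2 · Σ_{k=0..d} C(N−1, k) of the 2^N labelings are realizable by an affine separator. This upper-bounds what any embedding of the N = 2^n value combinations of n variables can represent:

```python
from math import comb

def max_representable(n_vars, dim):
    """Upper bound on linearly representable boolean functions of n_vars
    variables under any embedding of the 2**n_vars inputs into dim dims."""
    N = 2 ** n_vars
    return 2 * sum(comb(N - 1, k) for k in range(dim + 1))

for n in (2, 3):
    for d in (2, 3, 4):
        print(f"n={n}, d={d}: at most {max_representable(n, d)}"
              f" of {2 ** (2 ** n)}")
# n=2, d=2 recovers the 14-of-16 count above;
# n=3 does not reach all 256 functions until d=7.
```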
[1] The one unexcludable pair is (“always true”, “always false”).
These are the seven embeddings: