My apologies if this is dumb, but if a model that linearly represents features a and b automatically linearly represents a∧b and a∨b, then why wouldn't it also automatically (i.e., without using up more model capacity) linearly represent a⊕b? After all, a⊕b is equivalent to (a∨b)∧¬(a∧b), which is equivalent to (a∨b)∧(¬a∨¬b).
In general, since {¬, ∧} is truth-functionally complete, if a and b are represented linearly, won't the model have a linear representation of every quantifier-free expression of first-order logic that involves only a and b?
(If I’m missing something dumb, feel free to ignore this post).
I think the key thing is that a∧b and a∨b aren't separately linearly represented (or really even linearly represented in some strong sense). The model "represents" both with the same linear function a+b, just read out at different thresholds: a∨b fires when a+b ≥ 1, while a∧b fires when a+b ≥ 2.
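Here's a minimal sketch of that point (my own illustration, not from the original comment): both gates share the single linear direction a+b and differ only in their threshold.

```python
import itertools

def read_out(a: int, b: int, threshold: int) -> bool:
    # Both gates read out the SAME linear function a + b;
    # only the decision threshold differs.
    return (a + b) >= threshold

for a, b in itertools.product([0, 1], repeat=2):
    or_ab = read_out(a, b, threshold=1)   # a OR b:  fires when a + b >= 1
    and_ab = read_out(a, b, threshold=2)  # a AND b: fires when a + b >= 2
    print(f"a={a} b={b}  OR={or_ab}  AND={and_ab}")
```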
So, if we try to compute (a∨b)∧¬(a∧b) linearly, then a∨b corresponds to (a+b), a∧b also corresponds to (a+b), and ¬ corresponds to negation, so we get (a+b)−(a+b) = 0, which clearly doesn't do what we want! The thresholds, which are the only place the two gates differ, are lost in a purely linear combination.
If we could apply a ReLU on one side, then we could get this: (a+b)−2·relu((a+b)−1). And this does work (assuming a and b are booleans which are either 0 or 1). See also Fabien's comment, which shows that you naturally get XOR just using random MLPs, assuming some duplication.
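As a quick sanity check (my own sketch, not part of the original comment), the ReLU construction can be verified exhaustively on all four boolean inputs:

```python
import itertools

def relu(x):
    return max(x, 0)

def xor_via_relu(a: int, b: int) -> int:
    # XOR from one linear readout plus a single ReLU correction:
    # (a + b) - 2 * relu((a + b) - 1)
    s = a + b
    return s - 2 * relu(s - 1)

# Exhaustive check over all boolean inputs.
for a, b in itertools.product([0, 1], repeat=2):
    assert xor_via_relu(a, b) == (a ^ b)
    print(f"a={a} b={b} -> xor={xor_via_relu(a, b)}")
```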
More generally, just try to draw the XOR decision boundary for the four points (a, b) ∈ {0, 1}²: no single line separates {(0, 1), (1, 0)} from {(0, 0), (1, 1)}!
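If a picture isn't handy, a quick random search makes the same point (an illustration, not a proof; the search ranges and iteration count are arbitrary choices of mine):

```python
import itertools
import random

# Randomly search for a linear separator w1*a + w2*b + c >= 0 that
# reproduces XOR on {0,1}^2; none exists, so nothing should be found.
random.seed(0)
found = False
for _ in range(100_000):
    w1, w2, c = (random.uniform(-5, 5) for _ in range(3))
    if all(((w1 * a + w2 * b + c) >= 0) == bool(a ^ b)
           for a, b in itertools.product([0, 1], repeat=2)):
        found = True
        break
print("linear separator for XOR found:", found)  # expected: False
```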
Ah, I see now! Thanks for the clarifications!