Tom Lieberum comments on Tracr: Compiled Transformers as a Laboratory for Interpretability | DeepMind

Tom Lieberum 15 Jan 2023 10:05 UTC
(The quote refers to the usage of binary attention patterns in general, so I’m not sure why you’re quoting it)
I obv agree that if you take the softmax over {0, 1000, 2000}, you will get 0 and 1 entries.
iiuc, the statement in the Tracr paper is not that you can’t have attention patterns which implement this logical operation, but that you can’t have a single head implementing this attention pattern (without exponential blowup).
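To make the softmax point concrete, here is a minimal sketch in plain Python. The helper name `softmax` and the example scores are illustrative assumptions and are not taken from the Tracr codebase. It shows that scores spaced 1000 apart saturate to exact 0 and 1 entries in floating point, and that tied maxima share the weight equally.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats (illustrative helper)."""
    m = max(scores)
    # Subtracting the max avoids overflow; very negative exponents underflow to 0.0.
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Scores {0, 1000, 2000}: all attention goes to the largest score.
print(softmax([0.0, 1000.0, 2000.0]))  # [0.0, 0.0, 1.0]

# Two keys tied at the maximum split the attention evenly.
print(softmax([0.0, 2000.0, 2000.0]))  # [0.0, 0.5, 0.5]
```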
- Gurkenglas 15 Jan 2023 10:09 UTC
  (To show that my idea is compatible with Boolean attention.)
  
  I use a single head, and the ranks add up linearly.
  - Vlad Mikulik 16 Jan 2023 15:58 UTC
    You’re right that accounting for the softmax could be used to get around the argument in the appendix. We’ll mention this when we update the paper.
    The scheme as you described it relies on an always-on dummy token, which would conflict with our implementation of default values in aggregates: we fix the BOS token attention to 0.5, so at low softmax temperature it’s attended to iff nothing else is. With this scheme, however, we’d want 0.5 to always round off to zero after softmax. This is plausibly surmountable, but we put it out of scope for the pilot release since we didn’t seem to need this feature, whereas default values come up pretty often.
    Also, while this construction will work for just “and” (with arbitrarily many conjuncts), I don’t think it works for arbitrary compositions using “and” and “or”. (Of course it would be better than no composition at all.)
    In general, I expect there to be a number of potentially low-hanging improvements to be made to the compiler. Many of them we deliberately omitted and are mentioned in the limitations section, and many we haven’t yet come up with. There are tons of features one could add, each of which takes time to think about and increases overall complexity, so we had to be judicious about which lines to pursue, and even then, we barely had time to get into superposition experiments. We currently aren’t prioritising further Tracr development until we see it used for research in practice, but I’d be excited to help anyone who’s interested in working with it or contributing.
    - Gurkenglas 16 Jan 2023 17:35 UTC
      
      “we fix the BOS token attention to 0.5”
      
      Fixing it to 1.5 instead does conjunction without my dummy.
      
      When at most one clause is true, 0.5 does disjunction instead.
      - Vlad Mikulik 16 Jan 2023 18:53 UTC
        Ah, yep, I think you’re right. It should be pretty easy to add support for “and” in selectors then.
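
The BOS-threshold idea discussed in this thread can be sketched as follows. This is a toy illustration under stated assumptions, not Tracr's implementation: the names `attention_weights`, `select` and `rounded`, the temperature value, and the convention that each key's score is the sum of its two clause truth values are all hypothetical. Position 0 stands for the BOS token, whose score is fixed to a constant.

```python
import math


def attention_weights(scores, temperature=0.001):
    """Low-temperature softmax over pre-softmax scores (illustrative helper)."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def select(clause_a, clause_b, bos_score):
    """Attention over [BOS, key_1, ..., key_n] for a two-clause selector.

    Assumption: each key's score is the number of clauses it satisfies
    (0, 1 or 2), so scores add up linearly within a single head.
    """
    key_scores = [float(a) + float(b) for a, b in zip(clause_a, clause_b)]
    return attention_weights([bos_score] + key_scores)


def rounded(weights):
    """Round away negligible residue left by the finite temperature."""
    return [round(w, 6) for w in weights]


# BOS at 1.5: only keys satisfying both clauses (score 2) beat BOS.
conj_hit = rounded(select([True, True, False], [True, False, False], 1.5))

# BOS at 1.5, no key satisfies both clauses: BOS takes all attention (default).
conj_miss = rounded(select([True, False], [False, True], 1.5))

# BOS at 0.5, at most one clause true per key: keys with score 1 share attention.
disj = rounded(select([True, False, False], [False, True, False], 0.5))

# BOS at 0.5 when one key satisfies both clauses: that key (score 2) crowds out
# the key with score 1, which is why "at most one clause is true" is needed.
disj_broken = rounded(select([True, True], [True, False], 0.5))
```

In this sketch `conj_hit` puts all weight on the first key, `conj_miss` falls back to BOS, and `disj` splits weight evenly between the two keys that each satisfy one clause. `disj_broken` shows the failure mode of the 0.5 setting when a key can satisfy both clauses at once.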