Really nice summarisation of the confusion. Re: your point 3, this point makes “induction heads” as a class of things feel a lot less coherent :( I had also not considered that the behaviour on random sequences might show induction only as a fallback—do you think there may be induction-y heads that simply don’t activate on random sequences because such sequences are out-of-distribution?
I don’t have a confident answer to this question. Nonetheless, I can share related evidence we found during REMIX (that should be public in the near future).
We defined a new measure of context sensitivity based on causal intervention. We measure how much the model’s in-context loss increases when we replace the input of a given head with a modified input sequence in which the far-away context is scrubbed (replaced by text from a random sequence in the dataset). We found heads in GPT2-small that are context-sensitive according to this new metric but score low on the metric used to define induction heads. This means there exist heads that depend heavily on the context yet are not behavioral induction heads.
It’s unclear what those heads are doing (if that’s induction-y behavior on natural text or some other type of in-context processing that cannot be described as “induction-y”).
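To make the scrubbing measurement concrete, here is a minimal toy sketch of the idea. Everything here is hypothetical (the stub `head_loss` read-out, the `WINDOW` size, the activation shapes are invented for illustration): in the real experiment the intervention is done on an actual head’s input inside GPT2-small, not on a toy vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a single head's input: a (seq_len, d) activation tensor.
# "Far-away context" = all positions more than WINDOW tokens before the end.
WINDOW = 8
seq_len, d = 64, 16

def head_loss(head_input, target):
    # Hypothetical read-out: the head averages its input, and we score the
    # result against a target vector; lower is better.
    pred = head_input.mean(axis=0)
    return float(((pred - target) ** 2).sum())

def scrub_far_context(head_input, random_input, window=WINDOW):
    # Replace everything outside the recent window with activations taken
    # from a random sequence drawn from the same dataset.
    scrubbed = head_input.copy()
    scrubbed[:-window] = random_input[:-window]
    return scrubbed

head_input = rng.normal(size=(seq_len, d))
random_input = rng.normal(size=(seq_len, d))
# In this toy, the "correct" answer depends on the full context by construction.
target = head_input.mean(axis=0)

clean_loss = head_loss(head_input, target)
scrubbed_loss = head_loss(scrub_far_context(head_input, random_input), target)
# Context sensitivity of the head = loss increase under scrubbing.
context_sensitivity = scrubbed_loss - clean_loss
```

A head whose behavior ignores the far-away context would show `context_sensitivity` near zero; the heads described above score high on this quantity while scoring low on the behavioral induction metric.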