It’s not surprising to me, since language models have done similar things in the past, e.g. learning to translate mainly from unsupervised monolingual data.
That said, I am not sure about explaining it with “natural abstractions”. At least, I cannot immediately derive the connection to the natural abstraction arguments. I would not be surprised if there was a connection, but I would also not be surprised if there wasn’t a connection. It feels a bit like a Mysterious Answer if I cannot directly derive the connection. But I haven’t thought much about it, so it may be obvious if I think harder.
The natural abstraction here is the goal/task, and when the context/environment changes, I’m claiming that the system is likely to generalise to a natural abstraction of the goal/task in the new context/environment.
The natural generalisation of “follow instructions in English” is “follow instructions” and this translates to other language contexts.
John Wentworth has a number of specific arguments about natural abstractions, e.g. resampling, telephone theorem, etc.
When you use the term “natural abstraction” to say that it is predictable/expected, are you saying that those arguments predict this outcome? If so, do you have a specific argument in mind?
I feel like this answer glosses over the fact that the encoding changes. Surely you can find some encodings of instructions such that LLMs cannot follow instructions in that encoding. So the question is why learning the English encoding also allows the model to learn (say) German encodings.
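To make the "encoding" point concrete, here is a minimal sketch (my own illustrative example, not from the original discussion; the instruction text, the German rendering, and the choice of ROT13/Base64 as stand-ins for "opaque" encodings are all assumptions):

```python
# Illustrative sketch only: the same instruction under several "encodings".
# The German rendering is the kind of change the model does generalise across;
# ROT13 and Base64 stand in for encodings a model plausibly would not follow
# instructions in without further training.
import base64
import codecs

instruction_en = "Summarise the following paragraph in one sentence."
instruction_de = "Fasse den folgenden Absatz in einem Satz zusammen."  # German rendering
instruction_rot13 = codecs.encode(instruction_en, "rot13")             # opaque re-encoding
instruction_b64 = base64.b64encode(instruction_en.encode()).decode()   # another opaque re-encoding

for label, text in [
    ("English", instruction_en),
    ("German", instruction_de),
    ("ROT13", instruction_rot13),
    ("Base64", instruction_b64),
]:
    print(f"{label:>7}: {text}")
```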
So the question is why learning the English encoding also allows the model to learn (say) German encodings.
No? We already know that the model can competently respond in German. Once you condition on the model competently responding in other languages (e.g. for translation tasks) there is no special question about why it follows instructions in other languages as well.
Like, “why are LLMs capable of translation” might be an interesting question, but if you’re not asking that question, then I don’t understand why you’re asking this.
My position is that this isn’t a special capability that warrants any explanation that isn’t covered in an explanation of why/how LLMs are competent translators.
Ah, I misunderstood the content of the original tweet—I didn’t register that the model indeed had access to lots of data in other languages as well. In retrospect I should have been way more shocked if this wasn’t the case. Thanks.
I then agree that it’s not too surprising that the instruction-following behavior is not dependent on language, though it’s certainly interesting. (I agree with Habryka’s response below.)