So the question lies in why learning the English encoding also allows the model to learn (say) German encodings.
No? We already know that the model can competently respond in German. Once you condition on the model competently responding in other languages (e.g. for translation tasks) there is no special question about why it follows instructions in other languages as well.
Like, “why are LLMs capable of translation?” might be an interesting question, but if you’re not asking that question, then I don’t understand why you’re asking this.
My position is that this isn’t a special capability that warrants any explanation that isn’t covered in an explanation of why/how LLMs are competent translators.
Ah, I misunderstood the content of the original tweet: I didn’t register that the model indeed had access to lots of data in other languages as well. In retrospect, I would have been way more shocked if this wasn’t the case. Thanks.
I then agree that it’s not too surprising that the instruction-following behavior is not dependent on language, though it’s certainly interesting. (I agree with Habryka’s response below.)