As I understand that last point, you’re saying that it’s not a good point because it is false (hence my ‘if it turns out to be true’).
I’m not exactly sure what “it” is here. It is true that our results can be validly reinterpreted as being about data ordering. My claim is just that this reinterpretation is not that interesting, because all fine-tuning can be reinterpreted in the same way. We have ample evidence from such fine-tuning that data ordering generally does matter quite a lot, so the fact that it does not matter in this case is quite significant.