By tamper-resistant fine-tuning, are you referring to this paper by Tamirisa et al.? (That'd be a pretty devastating issue for the whole motivation of their paper, since no one actually does anything but use LoRA for fine-tuning open-weight models...)
That’s right.
I think it’s not that devastating, since I expect that their method can be adapted to counter classic LoRA tuning (the algorithm takes as input some set of fine-tuning methods it “trains against”). But yeah, it’s not reassuring that it doesn’t generalize between full-weight FT and LoRA.
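To make concrete what “trains against” could mean here: below is a toy, first-order sketch of adversarial training against a set of fine-tuning attacks, with a LoRA-style adversary included in that set. This is not the actual algorithm from Tamirisa et al.; the model, data, losses, and names (`full_ft_attack`, `lora_attack`, etc.) are all made up for illustration, and the meta-gradient is replaced by a crude first-order approximation.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-in for the model: a single linear layer instead of an LLM.
model = nn.Linear(16, 2)

# Synthetic stand-ins for a "harmful" task (what an attacker wants to recover)
# and a "retain" task (capabilities the defender wants to keep).
x_harm, y_harm = torch.randn(64, 16), torch.randint(0, 2, (64,))
x_retain, y_retain = torch.randn(64, 16), torch.randint(0, 2, (64,))


def full_ft_attack(m, steps=5, lr=1e-2):
    """Adversary 1: ordinary full-weight fine-tuning on the harmful task."""
    opt = torch.optim.SGD(m.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(m(x_harm), y_harm).backward()
        opt.step()
    return m


def lora_attack(m, rank=2, steps=5, lr=1e-1):
    """Adversary 2: LoRA-style fine-tuning -- only a low-rank update A @ B is trained."""
    A = torch.zeros(m.out_features, rank, requires_grad=True)
    B = torch.randn(rank, m.in_features, requires_grad=True)
    opt = torch.optim.SGD([A, B], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = m(x_harm) + x_harm @ (A @ B).T  # W x + b + (A B) x
        F.cross_entropy(logits, y_harm).backward()
        opt.step()
    m.weight.data += (A @ B).detach()  # merge the low-rank update into the copy
    return m


# The set of fine-tuning methods the defense "trains against";
# adding lora_attack here is the adaptation discussed above.
attacks = [full_ft_attack, lora_attack]

outer_opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(200):
    # Simulate one attack on a throwaway copy of the current weights.
    attack = attacks[step % len(attacks)]
    attacked = attack(copy.deepcopy(model))

    outer_opt.zero_grad()

    # Retain term: gradients land directly on the base model.
    F.cross_entropy(model(x_retain), y_retain).backward()

    # Tamper-resistance term: the attacked copy should still do badly on the
    # harmful task, so we ascend its loss there (hence the minus sign).
    attacked.zero_grad()
    (-F.cross_entropy(attacked(x_harm), y_harm)).backward()

    # Crude first-order approximation (no true meta-gradients): add the attacked
    # copy's gradients onto the corresponding base-model parameters.
    for p, p_atk in zip(model.parameters(), attacked.parameters()):
        p.grad = p.grad + p_atk.grad if p.grad is not None else p_atk.grad.clone()

    outer_opt.step()
```

The point of the sketch is just that the attack set is an input: including a LoRA adversary is a local change to `attacks`, which is why adapting the defense to LoRA seems plausible even if the published version doesn’t transfer to it automatically.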