Nice work! Can the fact that these concepts can be ‘removed’ be traced back to the LoRA finetune?
As far as I’m aware, major open-source chat-tuned models like LLaMA are fully fine-tuned (all parameters updated), not fine-tuned via a LoRA