An ML paper on data stealing provides a construction for “gradient hacking”
The paper “Privacy Backdoors: Stealing Data with Corrupted Pretrained Models” introduces “data traps” as a way of making a neural network remember a chosen training example, even after further training. The idea is to store the chosen example in the weights and then ensure those weights are not subsequently updated.
I have not read the paper, but it seems like it might be relevant to gradient hacking: https://www.lesswrong.com/posts/uXH4r6MmKPedk8rMA/gradient-hacking
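
To make the general idea concrete, here is a toy PyTorch sketch of a single "trap" unit. This is my own illustration, not the paper's construction (which I haven't read): a ReLU neuron with a large negative bias stays dormant on ordinary inputs, so its incoming weights get zero gradient and never move; an input that does activate it gets written into those weights by the next SGD step, and the attacker can later read the example back from the weight delta. The detector (input mean), bias value, learning rate, and dimensions are all arbitrary choices for the sketch.

```python
import torch

torch.manual_seed(0)
d = 8
lr = 1.0

# Toy "trap" unit (my illustration, not the paper's construction):
# a ReLU neuron whose large negative bias keeps it dormant on ordinary inputs,
# so its incoming weights receive zero gradient and never change.
w = (torch.ones(d) / d).requires_grad_(True)  # detector weights (here: input mean)
b = torch.tensor(-5.0, requires_grad=True)    # large negative bias -> usually inactive
w_init = w.detach().clone()

def trap(x):
    return torch.relu(w @ x + b)

opt = torch.optim.SGD([w, b], lr=lr)

# Ordinary examples never activate the unit, so its weights stay frozen.
for _ in range(10):
    x = torch.randn(d)
    loss = trap(x)                # toy loss: just the unit's activation
    opt.zero_grad()
    loss.backward()
    opt.step()
assert torch.equal(w.detach(), w_init)

# An input that does activate the unit is written into the weights:
# d(loss)/dw = x while the ReLU is active, so one SGD step stores -lr * x in w.
secret = torch.randn(d) + 10.0    # the training example the trap captures
loss = trap(secret)
opt.zero_grad()
loss.backward()
opt.step()

# Recover the example from the weight delta.
recovered = (w_init - w.detach()) / lr
print("max recovery error:", (recovered - secret).abs().max().item())
```

In this toy version the only thing "protecting" the stored example is that the unit stays dormant afterwards; nothing stops it from firing again and overwriting the weights. Reliably ensuring the trapped weights are not updated by further training is the part that looks gradient-hacking-flavored to me.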