I’m confused about the back door attack detection task even after reading it a few times:
The article says: “The key difference in the attack detection task is that you are given the backdoor input along with the backdoored model, and merely need to recognize the input as an attack”.
When I read that, I find myself wondering why that isn’t trivially solved by a model that memorises which input(s) are known to be an attack.
My best interpretation is that there are a bunch of possible inputs that cause an attack and you are given one of them and just have to recognise that one plus the others you don’t see. Is this interpretation correct?
I don’t see it suggest anywhere that you know which particular input is part of the back door input set. I think you just have to identify which inputs are attacks and which are not.
I believe synthesizing a back door set falls within the Trojan horse problem and could be very hard as the back door set may just be random numbers. If you can synthesize the back door set you can probably solve the Trojan horse problem.
You have to specify your backdoor defense before the attacker picks which input to backdoor.