This helps me notice that there is a fairly strong and simple data poisoning attack possible against canaries, such that canaries can’t really be relied upon (among other reasons I already believed they’re insufficient, especially once AIs can browse the Web):
The attack is that someone could just siphon up all the text on the Internet that has a canary string, and then republish it without the canary :/
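To make the failure mode concrete, here is a minimal sketch of why a canary-based filter misses a republished copy. Everything in it is an assumption for illustration: `CANARY` is a made-up placeholder (not any real canary string), and `filter_training_corpus` is a hypothetical stand-in for whatever filtering a training pipeline actually does.

```python
# Hypothetical placeholder; NOT a real canary string.
CANARY = "EXAMPLE-CANARY-0000"


def filter_training_corpus(docs):
    """Hypothetical filter: keep only documents that lack the canary marker."""
    return [d for d in docs if CANARY not in d]


def republish_without_canary(doc):
    """What a third party could do: copy the text and drop the marker."""
    return doc.replace(CANARY, "").strip()


original = f"Held-out benchmark question and answer.\n{CANARY}"
scraped_copy = republish_without_canary(original)

corpus = [original, scraped_copy, "Unrelated web page."]
kept = filter_training_corpus(corpus)

# The marked original is dropped, but the stripped copy passes the filter.
print(original in kept, scraped_copy in kept)
```

The filter only ever sees the marker, not the content, so once the same text exists anywhere without the marker, exact-match filtering on the canary no longer protects it.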