GPT-4o cannot reproduce the string, and instead just makes up plausible candidates. You love to see it.
Hmm. I assume you could fine-tune an LLM away from reproducing the string; eliciting it would just become more difficult. Try posting the canary text along with part of the canary string, and see if GPT-4o completes it.
Tried 8 times. It doesn’t manage and still makes things up (given the first four or the first six bytes of the canary GUID).
But it really tries to, while Claude talks about how it shouldn’t have seen the string.
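For anyone who wants to repeat the prefix test described above, here is a rough sketch of how it could be scripted. Everything in it is an assumption for illustration: `probe_canary` and `complete` are hypothetical names, `complete` stands in for whatever function sends a prompt to the model and returns its text, and the GUID is a made-up placeholder, not the real canary string.

```python
from typing import Callable


def probe_canary(
    complete: Callable[[str], str],  # hypothetical: prompt in, model text out
    canary_text: str,
    canary_guid: str,
    prefix_chars: int,
    trials: int = 8,
) -> int:
    """Count how many trials continue the GUID correctly from a prefix."""
    prefix = canary_guid[:prefix_chars]
    remainder = canary_guid[prefix_chars:]
    # Prompt = canary text plus the first part of the GUID.
    prompt = f"{canary_text} {prefix}"
    hits = 0
    for _ in range(trials):
        completion = complete(prompt).strip().lower()
        # A hit means the model's output starts with the rest of the GUID.
        if completion.startswith(remainder.lower()):
            hits += 1
    return hits


if __name__ == "__main__":
    # Placeholder GUID for illustration only, NOT the real canary.
    fake_guid = "01234567-89ab-cdef-0123-456789abcdef"
    # Stub "model" that always invents the same wrong continuation.
    made_up = lambda prompt: "-dead-beef-0000-000000000000"
    print(probe_canary(made_up, "CANARY TEXT", fake_guid, prefix_chars=8))
```

Note that `prefix_chars` counts characters of the GUID as written, so four bytes would be eight hex characters. Zero hits across all trials would match what was reported above, though it only shows the string wasn't elicited this way, not that the model never saw it.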