Dagon comments on Is Text Watermarking a lost cause?

Dagon 1 Oct 2024 17:05 UTC
9 points
I did some analysis of audio watermarks a long time ago, and concluded that the answer to the title is “yes, this is a lost cause analytically, but viable socially and legally for quite some time”. The reason is mostly economics and standards of proof, not technology. You start out with:
1) The watermark should be decisive: It should be almost impossible for normal text to appear marked by accident.
But then most of your analysis is statistical. Without defining “almost impossible”, you can’t actually tell what counts as success. What will hold up in court, what is good enough to justify sending a threatening letter, and what lets you enforce an internal rule are all somewhat different requirements.
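To make the “different requirements” point concrete, here is a rough sketch. It assumes a hypothetical scheme in which each token of unmarked text lands in a “green” set with probability gamma, and detection just counts green tokens. The scheme, the function names, and above all the three thresholds are my own illustrative assumptions, not anything taken from your post or from a real legal standard.

```python
import math

# Hypothetical false-positive limits for three uses. The numbers are
# illustrative assumptions only, not real legal or policy standards.
THRESHOLDS = {
    "internal rule": 1e-2,
    "threatening letter": 1e-4,
    "court": 1e-9,
}

def z_score(green: int, total: int, gamma: float = 0.5) -> float:
    """z-score of the green-token count against the no-watermark null,
    under which each token is green independently with probability gamma."""
    return (green - gamma * total) / math.sqrt(total * gamma * (1 - gamma))

def false_positive_rate(z: float) -> float:
    """One-sided normal tail: chance that unmarked text scores at least z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def passes(green: int, total: int, gamma: float = 0.5) -> dict:
    """Which of the hypothetical thresholds does this evidence clear?"""
    p = false_positive_rate(z_score(green, total, gamma))
    return {name: p <= limit for name, limit in THRESHOLDS.items()}

if __name__ == "__main__":
    # 140 green tokens out of 200 when 100 would be expected by chance.
    print(passes(140, 200))
```

With these made-up limits, 140 green tokens out of 200 clears the first two bars but not the third. The same text is “marked” or “not marked” depending on who needs convincing.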
Further, your analysis assumes randomly distributed changes, which is at odds with your own point 3:
3) The watermark should be robust and able to withstand alterations to the text. Perhaps even from a malicious actor trying to destroy the watermark.
Alteration is adversarial: people are TRYING to remove the watermark or reduce the level of proof it provides. For audio, it took a long time for knowledge of acoustic modeling and MP3 compression to filter into transcoding tools that removed watermarks. It remains the case that specific watermarking choices are kept secret for as long as possible, and are easily removed once known. The balance between not perceptibly affecting the sound and not being trivially removable by a re-encode is still debated, and for most purposes nobody bothers with audio watermarks anymore.
In text it is, at first glance, EVEN HARDER to disguise a watermark: word choice is distinctive, especially if it deviates enough to reach a “beyond a reasonable doubt” threshold. For most schemes, simple rewrites (including automatic rewrites) are probably enough to raise the level of doubt until the watermark is deniable.
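A back-of-the-envelope sketch of why rewrites hurt, under the same kind of hypothetical green-token scheme: assume marked text is green at some elevated rate, and that every rewritten token falls back to the chance rate gamma. Both assumptions, and the function name, are mine. Real schemes that key each token on its neighbours would probably degrade faster than this.

```python
import math

def expected_z_after_rewrite(total: int,
                             marked_green_rate: float,
                             rewrite_fraction: float,
                             gamma: float = 0.5) -> float:
    """Expected detection z-score after a fraction of tokens is rewritten.

    Simplifying assumptions (not taken from any specific scheme):
    untouched tokens stay green at marked_green_rate, and rewritten
    tokens are green only at the chance rate gamma.
    """
    effective_rate = ((1 - rewrite_fraction) * marked_green_rate
                      + rewrite_fraction * gamma)
    return (effective_rate - gamma) * math.sqrt(total) / math.sqrt(gamma * (1 - gamma))

if __name__ == "__main__":
    for fraction in (0.0, 0.25, 0.5, 0.75):
        z = expected_z_after_rewrite(200, 0.7, fraction)
        print(f"rewrite {fraction:.0%}: expected z = {z:.2f}")
```

Under these assumptions the expected z-score falls linearly with the rewritten fraction, so rewriting half of a 200-token passage halves it. That is usually enough to drop below whatever “reasonable doubt” bar you picked.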
Then there’s the question of where the watermark is introduced. Intentional “text came from X edition of Y source” watermarking is different from detecting that “an LLM generated this chunk of text”. The latter is often called “fingerprinting” rather than “watermarking”, because the signal isn’t intentionally added for the purpose. It’s also doomed in the general case, but most plagiarists are too lazy to bother evading it, so there are current use cases.