If your text generation algorithm is “repeatedly sample randomly (at a given temperature) from a probability distribution over tokens”, that means you control a stream of bits which don’t matter for output quality but which will be baked into the text you create (recoverably baked in if you have access to the “given a prefix, what is the probability distribution over next tokens” engine).
So at that point, you’re looking for “is there some cryptographic trickery which allows someone in possession of a secret key to determine whether a stream of bits has a small edit distance from a stream of bits they could create, but where that stream of bits would look random to any outside observer?” I suspect the answer is “yes”.
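A minimal sketch of what such a scheme could look like, in the spirit of keyed "green list" watermarking: the secret key, the toy vocabulary, and the 0.9 bias below are all illustrative stand-ins, and a real implementation would bias the model's actual next-token distribution rather than a uniform toy one.

```python
import hashlib
import hmac
import random

SECRET_KEY = b"watermark-key"              # hypothetical secret key
VOCAB = [f"tok{i}" for i in range(1000)]   # toy stand-in vocabulary

def green_set(prev_token: str) -> set:
    """Pseudorandomly partition the vocab into a 'green' half,
    keyed on the secret key and the previous token."""
    seed = hmac.new(SECRET_KEY, prev_token.encode(), hashlib.sha256).digest()
    rng = random.Random(seed)
    shuffled = VOCAB[:]
    rng.shuffle(shuffled)
    return set(shuffled[: len(VOCAB) // 2])

def generate(n_tokens: int, rng: random.Random) -> list:
    """Stand-in for the sampling loop: where the 'model' is
    indifferent between tokens, prefer green ones."""
    out = ["<s>"]
    for _ in range(n_tokens):
        greens = green_set(out[-1])
        if rng.random() < 0.9:              # bias the don't-care bits
            out.append(rng.choice(sorted(greens)))
        else:
            out.append(rng.choice(VOCAB))
    return out[1:]

def green_fraction(tokens: list) -> float:
    """Detector: only someone holding SECRET_KEY can recompute the
    green sets and measure how often the text 'happened' to use them."""
    prev = "<s>"
    hits = 0
    for tok in tokens:
        hits += tok in green_set(prev)
        prev = tok
    return hits / len(tokens)
```

Without the key, the token choices look like ordinary sampling noise; with the key, watermarked text shows a green fraction far above the ~50% a random text would show, and a simple statistical test on that fraction gives the detection.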
That said, this technique is definitely not robust to e.g. “translate English text to French and then back to English” and probably not even robust to “change a few tokens here and there”.
Alternatively, there’s the inelegant but effective approach of “maintain an index of all the text you have ever created and search against that index, as the cost to generate the text is at least an order of magnitude higher[1] than the cost to store it for a year or two”.
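The index doesn't need to be anything fancy to tolerate small edits. A toy shingle-based version (the 8-word shingle size and the data structure are arbitrary illustrative choices; a production system would use a real search index):

```python
from collections import defaultdict

SHINGLE = 8  # words per shingle; arbitrary illustrative choice

def shingles(text: str):
    """Yield every overlapping run of SHINGLE words."""
    words = text.split()
    for i in range(max(len(words) - SHINGLE + 1, 1)):
        yield " ".join(words[i : i + SHINGLE])

class GenerationIndex:
    """Toy index: every shingle of every generated text maps to doc ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.docs = []

    def add(self, text: str) -> int:
        doc_id = len(self.docs)
        self.docs.append(text)
        for sh in shingles(text):
            self.postings[sh].add(doc_id)
        return doc_id

    def match_score(self, query: str) -> float:
        """Fraction of the query's shingles seen before -- unlike exact
        string lookup, this degrades gracefully under local edits."""
        shs = list(shingles(query))
        hits = sum(1 for sh in shs if sh in self.postings)
        return hits / len(shs)
```

A query that edits a single word in a stored text still matches on most of its shingles, while unrelated text scores zero, so a threshold on the score serves as the "did we generate this?" check.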
I see $0.30 / million tokens generated on OpenRouter for llama-3.1-70b-instruct, which is just about the smallest model size I’d imagine wanting to watermark the output for. A raw token is about 2 bytes, but let’s bump that up by a factor of 50-100 to account for things like “redundancy” and “backups” and “searchable text indexes take up more space than the raw text”. So spending $1 on generating tokens will result in something like 0.5 GB of data you need to store.
Quickly-accessible data storage costs something like $0.07 / GB / year, so “generate tokens and store them in a searchable place for 5 years” would be roughly 10-25% more expensive than “generate tokens and throw them away immediately” (0.33-0.67 GB per $1 of generation, times $0.07 / GB / year, times 5 years).
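Spelled out as arithmetic, using only the numbers quoted above:

```python
# Sanity-check the storage-vs-generation cost comparison.
price_per_token = 0.3 / 1_000_000        # $ per token (quoted OpenRouter rate)
tokens_per_dollar = 1 / price_per_token  # ~3.33M tokens per $1
raw_bytes = tokens_per_dollar * 2        # ~2 bytes per raw token

for overhead in (50, 100):               # redundancy / backup / index blow-up
    stored_gb = raw_bytes * overhead / 1e9
    storage_cost = stored_gb * 0.07 * 5  # $0.07 / GB / year for 5 years
    print(f"x{overhead}: {stored_gb:.2f} GB stored, "
          f"storage adds ${storage_cost:.2f} per $1 of generation")
```

Under these assumptions, $1 of generation yields roughly 0.33-0.67 GB to store, and five years of storage comes to roughly $0.12-0.23 on top of that dollar.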