All three of those examples are of the form “hey here’s a lot of samples from a distribution, please output another sample from the same distribution”, which is not the kind of problem where anyone would ever expect adversarial dynamics / weird edge-cases, right?
I would expect some adversarial dynamics / weird edge cases even there. The International Obfuscated C Code Contest is a thing that exists, so a sequence predictor trained to produce content similar to an input distribution that included entrants to that contest will place some nonzero probability on the C code it writes containing a sneaky backdoor, and, once part of that backdoor is on the page, it will place a fairly high probability on the next token it outputs being another piece of the same program.
Such weird/adversarial outputs probably won't make up a large fraction of what you sample, though, unless you're sampling in a way that makes them overrepresented.
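To make the conditional-probability point concrete, here's a minimal toy sketch. It assumes a made-up two-mode caricature of autoregressive sampling with invented numbers (a tiny per-token chance of wandering into "sneaky backdoor" territory, and a high chance of staying there once the partial backdoor is in the context), not measurements from any real model. Running it shows both halves of the argument: most sampled sequences never touch the weird mode, but the ones that do tend to keep going, and the weird tokens stay a small fraction of total output unless you sample in a way that favors them.

```python
import random

# Toy two-mode model of autoregressive sampling. All numbers below are
# assumptions for illustration, not measured from any actual model:
#   - in "normal" mode, each new token has a small chance of starting a
#     sneaky-backdoor-style continuation (the IOCCC-ish tail of the
#     training distribution)
#   - once in "backdoor" mode, the conditional probability of continuing
#     the backdoor on the next token is high, because the partial backdoor
#     is now part of the context being conditioned on
P_ENTER_BACKDOOR = 0.001    # assumed per-token chance of starting a backdoor
P_CONTINUE_BACKDOOR = 0.95  # assumed per-token chance of continuing one

SEQ_LENGTH = 500


def sample_sequence(rng=random):
    """Sample one sequence; return how many of its tokens were 'backdoor' tokens."""
    mode = "normal"
    backdoor_tokens = 0
    for _ in range(SEQ_LENGTH):
        if mode == "normal":
            if rng.random() < P_ENTER_BACKDOOR:
                mode = "backdoor"
        else:
            if rng.random() >= P_CONTINUE_BACKDOOR:
                mode = "normal"
        if mode == "backdoor":
            backdoor_tokens += 1
    return backdoor_tokens


def main():
    random.seed(0)
    n_sequences = 10_000
    counts = [sample_sequence() for _ in range(n_sequences)]
    affected = sum(1 for c in counts if c > 0)
    print(f"sequences containing any backdoor tokens: {affected}/{n_sequences}")
    print(f"backdoor tokens as a fraction of all tokens: "
          f"{sum(counts) / (n_sequences * SEQ_LENGTH):.4%}")


if __name__ == "__main__":
    main()
```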