I think my favourite theory is that open-sourcing Llama 2 means that you now can’t tell if text on the internet was written by humans or AI, which is sad (a) for personal reasons and (b) because it sort of messes up your ability to train models on human-generated data. But I’m not sure if this is or should be sufficient for a suit.
How is this specifically downstream of open-sourcing, since text could already be generated by a closed-source LLM?
Closed-source LLM providers could conceivably watermark their outputs or log interactions server-side, but neither is enforceable for open-source LLMs: anyone with the weights runs inference on their own hardware and controls the sampling loop, so they can simply skip the watermarking step and nothing gets logged.
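To make the watermarking point concrete, here is a minimal sketch of one published scheme, the "green-list" logit-bias watermark from Kirchenbauer et al. (2023). The function names are mine, purely for illustration; the point is just that the watermark has to be applied inside the sampling loop, which only the party hosting the model controls:

```python
import hashlib
import random

def greenlist(prev_token: int, vocab_size: int, fraction: float = 0.5) -> set[int]:
    """Derive a pseudorandom 'green' subset of the vocabulary,
    seeded by the previous token."""
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(range(vocab_size), int(vocab_size * fraction)))

def watermark_logits(logits: list[float], prev_token: int, bias: float = 2.0) -> list[float]:
    """At generation time, nudge sampling toward green tokens by adding
    a small bias to their logits. Only whoever runs the sampler can do this."""
    green = greenlist(prev_token, len(logits))
    return [l + bias if i in green else l for i, l in enumerate(logits)]

def detect(tokens: list[int], vocab_size: int) -> float:
    """Detection: measure how often each token falls in the green list seeded
    by its predecessor. Unwatermarked text hits ~50%; watermarked text, more."""
    hits = sum(
        1 for prev, tok in zip(tokens, tokens[1:])
        if tok in greenlist(prev, vocab_size)
    )
    return hits / max(len(tokens) - 1, 1)
```

With a hosted model, the provider calls something like `watermark_logits` on every decoding step and can later run `detect` on suspect text. With open weights, a user just samples from the raw logits, and the detector sees chance-level statistics.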