In the strictest sense, almost no specific (interesting) output is information that has already been generated by any model.
If I tell a model to write me a book summary, that book summary can be specific interesting output without containing any new information.
If I want to know how to build a bomb, there are already plenty of sources out there explaining how. The information is already accessible from those sources. When an LLM synthesizes the existing information in its training data to help someone build a bomb, it's not inventing new information.
Deepfakes, by contrast, aren't about simply repeating information that's already in the training data.
So the argument would be that the lawmaker chose the word "accessible" because they wanted to allow LLMs to synthesize the existing information in their training data and repeat it back to the user. That does not mean the lawmaker intended to allow LLMs to produce new information that gets used to cause harm, even if there are other ways to create that information.