> There’s an assumption that the text that language models are trained on can be coherently integrated somehow. But the input is a babel of unreliable and contradictory opinions. Training to convincingly imitate any of a bunch of opinions, many of which are false, may not result in a coherent model of the world, but rather a model of a lot of nonsense on the Internet.
Do you have much actual experience playing around with large language models?
> text that language models are trained on can be coherently integrated somehow
In my experience, the knowledge/world model of GPT-3/ChatGPT is coherently integrated.
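One informal way to probe this is to ask a model the same factual question in several phrasings and check whether the answers agree. A minimal sketch, assuming the OpenAI Python client; the model name and prompts are illustrative, not from the original discussion:

```python
# Informal consistency probe: ask one factual question in several
# phrasings and eyeball whether the answers agree with each other.
# Assumes the OpenAI Python client and an OPENAI_API_KEY in the env.
from openai import OpenAI

client = OpenAI()

paraphrases = [
    "What is the boiling point of water at sea level, in Celsius?",
    "At standard atmospheric pressure, water boils at what temperature (C)?",
    "In degrees Celsius, how hot must water get to boil at sea level?",
]

for prompt in paraphrases:
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(reply.choices[0].message.content)

# A coherently integrated world model should give mutually consistent
# answers; a pile of memorized, contradictory opinions need not.
```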
> Training to convincingly imitate any of a bunch of opinions, many of which are false, may not result in a coherent model of the world, but rather a model of a lot of nonsense on the Internet.
This seems empirically false in my experience using language models, and prima facie unlikely. Lots of text on the internet is simply reporting on underlying reality:
- Log files
- Research papers
- Academic and industry reports
- Etc.
Learning to predict such reports of reality would privilege processes that can learn the structure of reality.
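As a toy illustration of why (my own construction, not anything from the thread): suppose "log lines" are emitted by a simple stateful process. A predictor that models the underlying process beats one that only memorizes surface frequencies:

```python
# Toy illustration: "log lines" emitted by a simple stateful process
# (a request counter that resets). A predictor that models the process
# predicts the next line far better than a frequency-only baseline.
import random
from collections import Counter

def make_logs(n_runs=200, max_len=10):
    """Each run counts requests from 0, then the process restarts."""
    lines = []
    for _ in range(n_runs):
        for i in range(random.randint(1, max_len)):
            lines.append(f"request_count={i}")
        lines.append("restart")
    return lines

logs = make_logs()

# Baseline with no world model: always guess the most frequent line.
most_common = Counter(logs).most_common(1)[0][0]
baseline_hits = sum(nxt == most_common for nxt in logs[1:])

# "World model": track the counter state implied by the previous line.
def predict(prev):
    if prev == "restart":
        return "request_count=0"
    count = int(prev.split("=")[1])
    return f"request_count={count + 1}"

stateful_hits = sum(predict(prev) == nxt for prev, nxt in zip(logs, logs[1:]))

total = len(logs) - 1
print(f"frequency baseline accuracy: {baseline_hits / total:.2f}")
print(f"stateful model accuracy:     {stateful_hits / total:.2f}")
```

The stateful predictor still misses the random restarts, but it dominates the baseline; minimizing prediction loss on such text rewards tracking the process that generated it.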
Furthermore, text that is fact and text that is fiction are often distinguished by writing style or presentation. In my experience, large language models do not conflate fact with fiction.
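A quick informal probe of this: give a model the same subject in an encyclopedic register and in a fairy-tale register and compare the continuations. Again a hedged sketch; the client, model name, and prompts are all illustrative assumptions:

```python
# Register probe: the same city introduced in encyclopedic vs.
# fairy-tale style. A model that tracks genre should continue each
# in kind rather than blending fact into fiction.
# Assumes the OpenAI Python client and an OPENAI_API_KEY in the env.
from openai import OpenAI

client = OpenAI()

openings = [
    "Paris is the capital of France. Its population",    # factual register
    "Once upon a time, in the city of Paris, a dragon",  # fictional register
]

for text in openings:
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model name
        messages=[{"role": "user", "content": f"Continue this text: {text}"}],
    )
    print(text, "->", reply.choices[0].message.content[:120])
```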