> There’s an assumption that the text that language models are trained on can be coherently integrated somehow. But the input is a babel of unreliable and contradictory opinions. Training to convincingly imitate any of a bunch of opinions, many of which are false, may not result in a coherent model of the world, but rather a model of a lot of nonsense on the Internet.
Do you have much actual experience playing around with large language models?
> text that language models are trained on can be coherently integrated somehow
In my experience, the knowledge/world model of GPT-3/ChatGPT is coherently integrated.
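One informal way to probe this is to ask a model the same factual question in several phrasings and check whether the answers agree. A minimal sketch, assuming the OpenAI Python client; the model name and prompts are illustrative, not from the original discussion:

```python
# Informal consistency probe: ask one factual question in several
# phrasings and eyeball whether the answers agree with each other.
# Assumes the OpenAI Python client and an OPENAI_API_KEY in the env.
from openai import OpenAI

client = OpenAI()

paraphrases = [
    "What is the boiling point of water at sea level, in Celsius?",
    "At standard atmospheric pressure, water boils at what temperature (C)?",
    "In degrees Celsius, how hot must water get to boil at sea level?",
]

for prompt in paraphrases:
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(reply.choices[0].message.content)

# A coherently integrated world model should give mutually consistent
# answers; a pile of memorized, contradictory opinions need not.
```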
> Training to convincingly imitate any of a bunch of opinions, many of which are false, may not result in a coherent model of the world, but rather a model of a lot of nonsense on the Internet.
This seems empirically false in my experience using language models, and prima facie unlikely. Lots of text on the internet is simply reporting on underlying reality:
- Log files
- Research papers
- Academic and industry reports
- Etc.
Learning to predict such reports of reality would privilege processes that can learn the structure of reality.
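As a toy illustration of why (my own construction, not anything from the thread): suppose "log lines" are emitted by a simple stateful process. A predictor that models the underlying process beats one that only memorizes surface frequencies:

```python
# Toy illustration: "log lines" emitted by a simple stateful process
# (a request counter that resets). A predictor that models the process
# predicts the next line far better than a frequency-only baseline.
import random
from collections import Counter

def make_logs(n_runs=200, max_len=10):
    """Each run counts requests from 0, then the process restarts."""
    lines = []
    for _ in range(n_runs):
        for i in range(random.randint(1, max_len)):
            lines.append(f"request_count={i}")
        lines.append("restart")
    return lines

logs = make_logs()

# Baseline with no world model: always guess the most frequent line.
most_common = Counter(logs).most_common(1)[0][0]
baseline_hits = sum(nxt == most_common for nxt in logs[1:])

# "World model": track the counter state implied by the previous line.
def predict(prev):
    if prev == "restart":
        return "request_count=0"
    count = int(prev.split("=")[1])
    return f"request_count={count + 1}"

stateful_hits = sum(predict(prev) == nxt for prev, nxt in zip(logs, logs[1:]))

total = len(logs) - 1
print(f"frequency baseline accuracy: {baseline_hits / total:.2f}")
print(f"stateful model accuracy:     {stateful_hits / total:.2f}")
```

The stateful predictor still misses the random restarts, but it dominates the baseline; minimizing prediction loss on such text rewards tracking the process that generated it.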
Furthermore, text that is fact and text that is fiction are often distinguished by writing style or presentation. In my experience, large language models do not conflate fact with fiction.
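A quick informal probe of this: give a model the same subject in an encyclopedic register and in a fairy-tale register and compare the continuations. Again a hedged sketch; the client, model name, and prompts are all illustrative assumptions:

```python
# Register probe: the same city introduced in encyclopedic vs.
# fairy-tale style. A model that tracks genre should continue each
# in kind rather than blending fact into fiction.
# Assumes the OpenAI Python client and an OPENAI_API_KEY in the env.
from openai import OpenAI

client = OpenAI()

openings = [
    "Paris is the capital of France. Its population",    # factual register
    "Once upon a time, in the city of Paris, a dragon",  # fictional register
]

for text in openings:
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model name
        messages=[{"role": "user", "content": f"Continue this text: {text}"}],
    )
    print(text, "->", reply.choices[0].message.content[:120])
```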