gabrielrecc comments on There is a globe in your LLM

gabrielrecc 10 Oct 2024 12:19 UTC
8 points
3
This is cool, although I suspect that you’d get something similar from even very simple models that aren’t necessarily “modelling the world” in any deep sense, simply due to first- and second-order statistical associations between nearby place names. See e.g. https://onlinelibrary.wiley.com/doi/pdfdirect/10.1111/j.1551-6709.2008.01003.x and https://escholarship.org/uc/item/2g6976kg
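A minimal sketch of how first-order co-occurrence alone could yield a map, on synthetic data rather than a real corpus. Everything here is an assumption made for illustration: the 5x5 grid of hypothetical "places", the rule that two places are mentioned together with probability decaying exponentially in their distance, the `scale` and `draws_per_place` values, and the use of classical multidimensional scaling as the recovery step. It is not the procedure of either linked paper, only a toy with the same flavor.

```python
import numpy as np

def pairwise_dist(points):
    # Euclidean distance between every pair of rows.
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def simulate_cooccurrence(coords, draws_per_place=800, scale=1.5, seed=0):
    # ASSUMPTION: places are co-mentioned with probability ~ exp(-distance / scale).
    rng = np.random.default_rng(seed)
    weights = np.exp(-pairwise_dist(coords) / scale)
    np.fill_diagonal(weights, 0.0)  # a place does not co-occur with itself
    probs = weights / weights.sum(axis=1, keepdims=True)
    counts = np.stack([rng.multinomial(draws_per_place, p) for p in probs])
    return counts + counts.T  # symmetric co-occurrence counts

def classical_mds(dissim, dims=2):
    # Classical (Torgerson) MDS: double-center squared dissimilarities,
    # then keep the eigenvectors with the largest eigenvalues.
    n = dissim.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dissim ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dims]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

def recover_map(cooc):
    # Frequent co-occurrence -> small dissimilarity (add-one smoothing avoids log 0).
    dissim = -np.log((cooc + 1.0) / (cooc.max() + 1.0))
    np.fill_diagonal(dissim, 0.0)
    return classical_mds(dissim)

def distance_correlation(coords_a, coords_b):
    # Correlation between the two sets of pairwise distances; invariant to
    # rotation and reflection of the recovered map.
    iu = np.triu_indices(len(coords_a), k=1)
    return np.corrcoef(pairwise_dist(coords_a)[iu], pairwise_dist(coords_b)[iu])[0, 1]

# Hypothetical "true" geography: a 5x5 grid of unnamed places.
grid = np.array([(x, y) for x in range(5) for y in range(5)], dtype=float)
cooc = simulate_cooccurrence(grid)
recovered = recover_map(cooc)
print("correlation of recovered vs. true distances:", distance_correlation(grid, recovered))
```

The recovery step sees only a table of counts and never the coordinates, so any map-like structure in `recovered` comes purely from the co-occurrence statistics. That is the sense in which a model with no deep "world model" can still produce a globe.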
- gwern 10 Oct 2024 15:36 UTC
  12 points
  3
  Yes, people have been pulling this sort of semantic knowledge out of word embeddings since the start. Here is a long list from like 5 years ago, going far beyond just geographic locations: https://gwern.net/gpt-2#fn11
  
  This is one of the reasons that people have rejected the claims that LLMs are doing anything special: because after all, just word2vec, which barely even counts as a neural net, or n-grams, seems able to ‘learn’ a lot of the same things as an LLM does, even though it’s “obviously” not a world model. (It’s a modus ponens/tollens thing.)
  
  One of the coolest demonstrations of extracting world models (and demonstrating the flaws in the learned world models due to a lack of inductive priors) is a paper on inferring the exact street connectivity & geography of New York City from training on taxi cab trajectories: https://x.com/keyonV/status/1803838591371555252 https://arxiv.org/abs/2406.03689
  What links here?
  - gwern's comment on Extensions and Intensions by Eliezer Yudkowsky (14 Oct 2024 20:48 UTC; 5 points)