Obviously LLMs memorize some things. The easy example is that GPT-4's pretraining dataset probably contained lots of cryptographically hashed strings, which are impossible to infer from the overall patterns of language. Predicting those accurately absolutely requires memorization; there's no other way short of the LLM breaking the hash function itself. Then there are in-between things like Barack Obama's age, which you can partly infer from other language (a president is probably not 10 years old or 230), but to get it right within the plausible range you still just have to memorize it.
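To make the hash point concrete, here's a minimal sketch (plain Python hashlib, nothing LLM-specific): two nearly identical strings hash to completely unrelated digests, so there is no linguistic pattern connecting text to its hash that a model could generalize from.

```python
import hashlib

# A one-character change in the input produces a digest that shares no structure
# with the previous one, so the text->hash mapping can only be memorized, not inferred.
for s in ["Barack Obama", "Barack Obamb"]:
    print(s, "->", hashlib.sha256(s.encode()).hexdigest())
```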
Where it gets interesting is when you leave the space of token strings the machine has seen but land somewhere in the input space "in between" strings it has seen. That's why this works at all and exhibits any intelligence.
For example, if it has seen a whole bunch of patterns like "A->B" and "C->D", then given the input "G" it will complete with "->H".
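A rough sketch of probing this kind of "in between" completion, assuming the OpenAI Python client (the model name is illustrative; any chat model would do):

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# The literal continuation "G->H" need not appear anywhere in the training data;
# the model produces it by interpolating over the pattern, not by recall.
prompt = "Continue the pattern:\nA->B\nC->D\nE->F\nG"
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)
print(resp.choices[0].message.content)
```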
Or, for presidents' ages, what if the president isn't real? https://chat.openai.com/share/3ccdc340-ada5-4471-b114-0b936d1396ad
There are fake/fictional presidents in the training data.