How should I modify the problems I gave it? What would be the least impressive test which would convince you it is reasoning, and not memorizing? (Preferably something that doesn’t rely on, e.g., rhyming, since GPT-3 uses an obfuscating input encoding.)

I know there are benchmarks for NL reasoning, but I’m not re-finding them so easily... This looks like one: https://github.com/facebookresearch/clutrr/
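For a sense of what that benchmark tests: CLUTRR generates short stories about family relations and asks for the relation between two people, which you can only get by chaining the stated facts together. Here is a toy generator in that spirit (a sketch only, not CLUTRR’s actual code or data format; the names and chain length are made up):

```python
import random

# Toy CLUTRR-style generator (a sketch in the spirit of the benchmark, not
# its actual code or data format): a chain of stated kinship facts whose
# composition, not any single fact, answers the question.

NAMES = ["Alice", "Bob", "Carol", "Dave", "Erin", "Frank", "Grace", "Heidi"]

def make_puzzle(chain_length=3, rng=random):
    people = rng.sample(NAMES, chain_length + 1)
    facts = [f"{people[i]} is the parent of {people[i + 1]}."
             for i in range(chain_length)]
    rng.shuffle(facts)  # shuffle so the chain must be reassembled, not read off
    question = f"What relation is {people[0]} to {people[-1]}?"
    # Composing k parent links gives parent, grandparent, great-grandparent, ...
    answer = "parent" if chain_length == 1 else "great-" * (chain_length - 2) + "grandparent"
    return " ".join(facts) + " " + question, answer

story, expected = make_puzzle(chain_length=4)
print(story)
print("expected:", expected)
```

The point is that no single sentence in the story contains the answer; chain_length controls how many facts have to be composed.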
Anyway, my main issue is that you’re not defining what you mean by reasoning, even informally. What’s the difference between reasoning vs mere interpolation/extrapolation? A stab at a definition would make it a lot easier to differentiate.
One stab might be some kind of “semantic sensitivity”:
Some inputs are close in terms of edit distance, but very different semantically. One clue that a system can reason is that it can correctly respond to these small variations and explain the difference.
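To make this concrete, here is one way you could operationalize the test (a rough sketch; the prompt pairs and the ask_model callable are illustrative placeholders, not anything anyone here actually ran): keep pairs of prompts that are a few characters apart but have different correct answers, and check whether the model’s answers actually differ.

```python
# Minimal-pair probe for "semantic sensitivity" (a sketch; the prompt pairs
# and the ask_model callable are illustrative placeholders).

def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance, to confirm the pair really is 'close'."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

# Each pair: two near-identical prompts whose correct answers differ.
MINIMAL_PAIRS = [
    ("You put the bullet in the freezer. Is it safe to fire it from a gun now?",
     "You put the bullet in the furnace. Is it safe to fire it from a gun now?"),
    ("Alice is taller than Bob. Who is shorter?",
     "Alice is taller than Bob. Who is taller?"),
]

def probe(ask_model):
    """ask_model: any callable str -> str, e.g. a wrapper around an LM API."""
    for p1, p2 in MINIMAL_PAIRS:
        d = edit_distance(p1, p2)
        a1, a2 = ask_model(p1), ask_model(p2)
        flipped = a1.strip().lower() != a2.strip().lower()
        print(f"edit distance {d:>2} | answers differ: {flipped}")

# Example with a trivial stand-in "model" that ignores semantics entirely:
probe(lambda prompt: "yes")
```

A model that only pattern-matches on surface form should give near-identical answers to both members of a pair; a model tracking the semantics should flip its answer along with the prompt.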
This is part of why I tested similar situations with the bullet—I wanted to see whether small changes to the words would provoke a substantively different response.
I think another part of this is “sequential processing steps required”—you couldn’t just look up a fact or a definition somewhere to get the correct response.
This is still woefully incomplete, but hopefully this helps a bit.
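Concretely, here is a sketch of the “sequential processing steps required” idea (the operations and parameters are made up for illustration): the answer depends on a running value carried through several steps, so nothing you could look up supplies it directly.

```python
import random

# "Sequential processing steps required" made concrete (a sketch with
# made-up operations): a running total must be tracked through k steps,
# so no single memorized fact or definition yields the answer.

OPS = [
    ("add {n} to it", lambda x, n: x + n),
    ("subtract {n} from it", lambda x, n: x - n),
    ("double it, then add {n}", lambda x, n: 2 * x + n),
]

def make_problem(steps=4, rng=random):
    value = rng.randint(2, 9)
    lines = [f"Start with the number {value}."]
    for _ in range(steps):
        template, fn = rng.choice(OPS)
        n = rng.randint(1, 9)
        lines.append(template.format(n=n).capitalize() + ".")
        value = fn(value, n)  # track the ground-truth running value
    lines.append("What number do you have now?")
    return " ".join(lines), value

prompt, answer = make_problem(steps=5)
print(prompt)
print("expected:", answer)
```

Plotting accuracy against steps would show whether performance degrades as more sequential composition is required, which is the interesting curve for telling reasoning apart from lookup.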
I like the second suggestion a lot more than the first. To me, the first is getting more at “Does GPT convert to a semantic representation, or just go based off of syntax?” I already strongly suspect it does something more meaningful than “just syntax”—but whether it then reasons about it is another matter.