Imo, it is reasonably close to the right comparison for thinking about humans understanding how LLMs work (I make no claims about this being a reasonable comparison for other things). We care about how humans perform using conscious reasoning.
Similarly, I’d claim that trying to do interpretability on your own linguistic cortex is made difficult by the fact that the linguistic cortex (probably) implicitly represents probability distributions over language which are much better than those you can consciously compute.
More generally, it’s worth thinking about the conscious reasoning gap—this gap happens to be smaller in vision for various reasons.
This gap will of course also exist in language models trying to interpret themselves, but fine-tuning might be very helpful for at least partially closing it.
Isn’t this about generation vs. classification, not language vs. vision?