If we can really just think about the feed-forward layers as encoding simple key-value knowledge pairs…
Once upon a time, people thought that you could make AI simply by putting a sufficiently large body of facts into a database for the system to reason over. Later we realised that of course that was silly and would never work.
But apparently they were right all along, and training a neural network is just an efficient way of persuading a computer to do the data entry for you?
On one hand, maybe? Maybe training over a differentiable representation with SGD was the only missing ingredient.
But I think I’ll believe it when I see large neural models distilled into relatively tiny symbolic models with no big loss of function. If that’s hard, it means that partial activations and small differences in coefficients are doing important work.
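For what it's worth, the key-value reading of a feed-forward layer is easy to make concrete. A minimal numpy sketch, with toy shapes and random weights standing in for a trained model (all names and sizes here are illustrative assumptions): each row of the input matrix acts as a "key" pattern matched against the hidden state, and the corresponding row of the output matrix is the "value" mixed into the result.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff = 8, 32  # toy hidden size and FFN width

W_in = rng.normal(size=(d_ff, d_model))   # one "key" per row
W_out = rng.normal(size=(d_ff, d_model))  # one "value" per row

def ffn(x):
    # m[i] measures how strongly input x matches key i
    m = np.maximum(W_in @ x, 0.0)  # ReLU "memory coefficients"
    # output is a weighted sum of value vectors -- a soft lookup
    return W_out.T @ m

x = rng.normal(size=d_model)
y = ffn(x)

# The same computation written explicitly as a key-value sum:
y_kv = sum(max(W_in[i] @ x, 0.0) * W_out[i] for i in range(d_ff))
assert np.allclose(y, y_kv)
```

Note that the "lookup" is soft: many keys fire partially and their values superpose, which is exactly the part that a crisp symbolic database has no obvious analogue for, and why the distillation test above seems like the right one.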