Raemon comments on Mech Interp Puzzle 1: Suspiciously Similar Embeddings in GPT-Neo

Raemon 16 Jul 2023 22:16 UTC
LW: 7 AF: 5
What background knowledge do you think this requires? If I know a bit about how ML and language models work in general, should I be able to reason this out from first principles (or by following a fairly obvious trail of “look up relevant terms and quickly get up to speed on the domain”)? Or does it require some amount of pre-existing ML taste?
Also, do you have a rough sense of how long it took for MATS scholars?
- Neel Nanda 16 Jul 2023 22:55 UTC
  LW: 7 AF: 3
  Great questions, thanks!
  
  Background: You don’t need to know anything beyond “a language model is a stack of matrix multiplications and non-linearities. The input is a series of tokens (words and sub-words) which get converted to vectors by a massive lookup table called the embedding (the vectors are called token embeddings). These vectors have really high cosine similarity with each other in GPT-Neo”.
  
  Re how long it took for scholars, hmm, maybe an hour? Not sure, I expect it varied a ton. I gave this in their first or second week, I think.
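
For readers who want the background above in concrete form, here is a minimal sketch of an embedding lookup table and the cosine similarity between token embeddings. The vocabulary, the vector values, and the helper names `embed` and `cosine_similarity` are all hypothetical and made up for illustration; nothing here is taken from GPT-Neo, and the sketch does not attempt to answer the puzzle.

```python
import math

# Hypothetical toy vocabulary: token -> row index in the lookup table.
vocab = {"the": 0, "cat": 1, "sat": 2}

# Hypothetical embedding table: one made-up 4-dimensional vector per token.
# A real model has tens of thousands of rows and hundreds or thousands of columns.
embedding = [
    [1.0, 2.0, 0.5, -1.0],
    [0.9, 2.1, 0.4, -0.8],
    [-1.0, 0.5, 2.0, 0.3],
]


def embed(tokens):
    """Convert a list of tokens to their embedding vectors by table lookup."""
    return [embedding[vocab[token]] for token in tokens]


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


tokens = ["the", "cat", "sat"]
vectors = embed(tokens)

# Pairwise cosine similarity between every pair of token embeddings.
sims = [[cosine_similarity(u, v) for v in vectors] for u in vectors]

for token, row in zip(tokens, sims):
    print(token, [round(value, 3) for value in row])
```

Cosine similarity depends only on the angle between two vectors, not on their lengths, so it ranges from -1 (opposite directions) to 1 (same direction).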