Neel Nanda comments on The Plan - 2024 Update

Neel Nanda 31 Dec 2024 20:25 UTC
36 points
5
Fwiw, this is not at all obvious to me, and I would weakly bet that larger models are harder to interpret (even beyond there just being more capabilities to study).
- Nathan Helm-Burger 2 Jan 2025 0:57 UTC
  6 points
  0
  Hmm. I think there’s something about this that rings true and yet...
  
  Ok, so suppose there were a set of cliff faces where climbing the bigger ones was more important, and where climbing tools also worked better on them. Yet, despite the tools working better on the large cliffs, the smaller cliffs were easier to climb (both because the routes were shorter and because they were less technical). It seems that if your goal is to design climbing equipment that will be helpful on large cliff faces, you should test that equipment on large cliff faces, even if that means you won’t have the satisfaction of completing any of your test climbs.
  - Aprillion 2 Jan 2025 15:23 UTC
    3 points
    0
    What if you tried to figure out a way to understand the “canonical cliffness” and design a new line of equipment that could be tailored to fit any “slope”… Which cliff would you test first? 🤔
- metawrong 4 Jan 2025 7:44 UTC
  1 point
  0
  So you would expect Claude Opus 3 to be harder to interpret than Claude Sonnet 3.5?
  
  My intuition is that larger models of the same capability would exhibit less superposition and thus be easier to interpret?