Daniel Tan comments on Toward A Mathematical Framework for Computation in Superposition

Daniel Tan 27 Apr 2024 15:00 UTC
1 point
This work is very exciting to me, and I’m curious to hear the authors’ thoughts on whether specific predictions of this framework could be verified in real models.
- For example, do we expect the proposed U-AND operator to occur in real LLMs, and could we look for evidence of it by applying mech interp to carefully chosen toy models?
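To make the toy-model idea concrete, here is a minimal sketch of a hand-built layer that computes pairwise ANDs of sparse boolean features with one ReLU layer and random binary weights. It is a simplified illustration rather than the paper's exact construction: the function names (`make_uand_layer`, `hidden`, `read_and`), the sizes, and the connection probability `p` are all hypothetical choices, and the layer here is wider than the input, so it does not by itself demonstrate superposition.

```python
import numpy as np

def make_uand_layer(n_features, n_neurons, p, seed=0):
    # Hypothetical setup: each neuron connects to each feature with probability p.
    rng = np.random.default_rng(seed)
    return (rng.random((n_neurons, n_features)) < p).astype(float)

def hidden(W, x):
    # Each neuron computes ReLU(sum of its connected inputs - 1),
    # so it fires only when at least two connected features are active.
    return np.maximum(W @ x - 1.0, 0.0)

def read_and(W, h, i, j):
    # Read out AND(x_i, x_j) by averaging the neurons connected to both i and j.
    mask = (W[:, i] == 1) & (W[:, j] == 1)
    if not mask.any():
        raise ValueError("no neuron is connected to both features")
    return h[mask].mean()

W = make_uand_layer(n_features=20, n_neurons=2000, p=0.2)

# Sparse boolean input: only features 0 and 1 are active.
x = np.zeros(20)
x[[0, 1]] = 1.0
h = hidden(W, x)

and_01 = read_and(W, h, 0, 1)  # both features active
and_02 = read_and(W, h, 0, 2)  # only one of the two active, plus interference
and_23 = read_and(W, h, 2, 3)  # neither active, interference only

print(and_01, and_02, and_23)
```

With only two features active, the readout for the active pair is exactly 1, while the other readouts pick up interference on the order of `p` or `p**2`, so thresholding at 0.5 separates them. A mech interp experiment could then ask whether a trained toy model ends up with a similar structure.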
I have a more detailed write-up on model organisms of superposition here: https://docs.google.com/document/d/1hwI30HNNB2MkOrtEzo7hppG9X7Cn7Xm9a-1LBqcttWc/edit?usp=sharing
Would love to discuss this more!