Adam Shai comments on Transformers Represent Belief State Geometry in their Residual Stream

Adam Shai 17 Apr 2024 18:24 UTC
Cool question. This is one of the things we’d like to explore more going forward. We are fairly sure the answer is nuanced and has to do with the relationship between the (minimal) state of the generative model, the token vocab size, and the residual stream dimensionality.
On your last question, I believe so, but one would have to do the experiment! It totally should be done. Check out the Hackathon if you are interested ;)