I should have explained this better in my post.
For every input to the transformer (of every length up to the context window length), we know the ground-truth belief state that comp mech says an observer should have over the HMM states. In this case that is 3 numbers, so for each input we have a 3D ground-truth vector. We also have, for each input, the residual stream activation (here a 64D vector). To find the projection we just use standard linear regression (as implemented in sklearn) between the 64D residual stream vectors and the 3D ground-truth vectors (really 2D, since the probabilities sum to 1). Does that make sense?
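For concreteness, here is a minimal sketch of that regression step. The array names and the random stand-in data are my own assumptions for illustration, not code from the post:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# residual_acts: one 64-dim residual stream activation per input prefix
# belief_states: the corresponding ground-truth belief over the 3 HMM states
n_prefixes = 10_000
residual_acts = np.random.randn(n_prefixes, 64)               # stand-in data
belief_states = np.random.dirichlet(np.ones(3), n_prefixes)   # stand-in data

# Fit a single affine map from the 64D residual stream to the 3D belief vector.
reg = LinearRegression().fit(residual_acts, belief_states)

# The learned linear projection of the activations into belief-state coordinates;
# plotting these on the probability simplex visualizes the belief geometry.
predicted_beliefs = reg.predict(residual_acts)

# R^2 of the fit as a rough measure of how linearly the beliefs are represented.
print(reg.score(residual_acts, belief_states))
```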
Given that the model eventually outputs the next token, shouldn’t the final embedding matrix be exactly your linear fit matrix multiplied by the probability of each state to output a given token? Could you use that?
Yep, that’s what I was trying to describe as well. Thanks!
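To make the suggestion in the question above concrete: ignoring layer norm and the softmax nonlinearity, the idea is that the unembedding should roughly factor as (linear fit matrix) times (per-state probabilities of emitting each token). Below is a minimal sketch of how one might check that numerically; all names (`W_unembed`, `probe_coef`, `emission_probs`) and the stand-in arrays are hypothetical, not code from the post:

```python
import numpy as np

d_model, n_states, vocab = 64, 3, 3

W_unembed = np.random.randn(d_model, vocab)       # stand-in for the model's unembedding
probe_coef = np.random.randn(d_model, n_states)   # stand-in for the fitted linear probe
emission_probs = np.random.dirichlet(np.ones(vocab), n_states)  # P(token | HMM state)

# The proposed factorization of the unembedding: probe composed with emissions.
predicted_unembed = probe_coef @ emission_probs   # shape (d_model, vocab)

# Cosine similarity between corresponding columns as a crude agreement measure.
def col_cosine(a, b):
    a = a / np.linalg.norm(a, axis=0, keepdims=True)
    b = b / np.linalg.norm(b, axis=0, keepdims=True)
    return (a * b).sum(axis=0)

print(col_cosine(W_unembed, predicted_unembed))
```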