Exa Watson comments on Transformers Represent Belief State Geometry in their Residual Stream

Exa Watson 20 Apr 2024 15:35 UTC
2 points
0
If I understand this right, you train a transformer on data generated from a hidden Markov process of the form {0,1,R}, and find that the residual stream contains a mechanism for tracking when R occurs, and also that the transformer learns the hidden Markov process. Is that correct?
- Keenan Pepper 20 Apr 2024 23:50 UTC
  4 points
  0
  No, the actual hidden Markov process used to generate the awesome triangle fractal image is not the {0,1,random} model but a different one, which is called “Mess3” and has a symmetry between the 3 hidden states.
  Also, they’re not claiming the transformer merely learns the hidden states of the HMM, but a more complicated structure called the “mixed state presentation”: not the states the HMM itself can be in, but the (usually much larger) set of belief states that an ideal predictor trying to “sync” to it might go through.
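
To make the distinction between hidden states and belief states concrete, here is a minimal sketch of the Bayesian update an ideal predictor would perform. The 3-state HMM below is a made-up symmetric toy, not the actual Mess3 parameters (which are not given in this thread), and the names `make_toy_hmm`, `update_belief`, and `belief_states` are hypothetical helpers introduced only for illustration.

```python
from itertools import product


def make_toy_hmm(stay=0.8, match=0.7):
    """Build a hypothetical symmetric 3-state, 3-symbol HMM.

    ASSUMPTION: these numbers are illustrative only, not Mess3.
    T[x][i][j] = P(emit symbol x and move to state j | current state i).
    """
    n = 3
    T = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            p_move = stay if i == j else (1 - stay) / 2
            for x in range(n):
                # The emitted symbol tends to match the state being entered.
                p_emit = match if x == j else (1 - match) / 2
                T[x][i][j] = p_move * p_emit
    return T


def update_belief(belief, T_x):
    """One Bayesian update: new belief is proportional to belief @ T_x."""
    n = len(belief)
    unnorm = [sum(belief[i] * T_x[i][j] for i in range(n)) for j in range(n)]
    total = sum(unnorm)  # probability of seeing this symbol under the belief
    return [u / total for u in unnorm]


def belief_states(T, depth):
    """Collect the distinct beliefs reachable by words of length <= depth."""
    n = len(T[0])
    # ASSUMPTION: start from the uniform prior, which is the stationary
    # distribution for this symmetric toy process.
    start = [1.0 / n] * n
    seen = set()
    for length in range(depth + 1):
        for word in product(range(len(T)), repeat=length):
            b = start
            for x in word:
                b = update_belief(b, T[x])
            seen.add(tuple(round(p, 6) for p in b))
    return seen


if __name__ == "__main__":
    T = make_toy_hmm()
    states = belief_states(T, depth=3)
    print(f"hidden states: 3, distinct belief states found: {len(states)}")
```

Each belief is a probability distribution over the 3 hidden states, so it is a point in a triangle (the 2-simplex). The number of distinct points grows with the length of the observed sequence, which is the sense in which there are usually many more belief states than hidden states.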