Nisan comments on Transformers Represent Belief State Geometry in their Residual Stream

Nisan 18 Apr 2024 16:07 UTC
2 points
If I understand correctly, the next-token prediction of Mess3 is related to the current-state prediction by a nonsingular linear transformation. So a linear probe showing “the meta-structure of an observer’s belief updates over the hidden states of the generating structure” is equivalent to one showing “the structure of the next-token predictions”, no?
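A minimal sketch of the linear relationship claimed above, assuming a 3-state, 3-token process. The emission matrix `E` is a hypothetical stand-in, not the actual Mess3 parameters, and `det3`, `next_token_probs`, and `recover_belief` are illustrative helper names. If `E[s][x]` is the probability of emitting token `x` from hidden state `s`, then a belief `b` over hidden states gives the next-token distribution `p = b E`, and when `det(E) != 0` the belief can be read back from `p` exactly.

```python
from fractions import Fraction as F

# Hypothetical emission matrix: E[s][x] = P(next token = x | hidden state = s).
# Illustrative numbers only; these are NOT the actual Mess3 parameters.
E = [
    [F(7, 10), F(2, 10), F(1, 10)],
    [F(1, 10), F(7, 10), F(2, 10)],
    [F(2, 10), F(1, 10), F(7, 10)],
]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def next_token_probs(belief):
    """Map a belief over hidden states to a next-token distribution: p = b E."""
    return [sum(belief[s] * E[s][x] for s in range(3)) for x in range(3)]

def recover_belief(p):
    """Invert p = b E by Cramer's rule on E^T b = p (requires det(E) != 0)."""
    Et = [[E[s][x] for s in range(3)] for x in range(3)]  # transpose of E
    d = det3(Et)
    belief = []
    for s in range(3):
        m = [row[:] for row in Et]
        for x in range(3):
            m[x][s] = p[x]  # replace column s with the right-hand side
        belief.append(det3(m) / d)
    return belief

belief = [F(1, 2), F(3, 10), F(1, 5)]
p = next_token_probs(belief)

print(det3(E))                      # nonzero, so the map is invertible
print(recover_belief(p) == belief)  # the belief is recovered exactly
```

Under these assumptions, any linear probe for the belief state composes with `E` into a linear probe for next-token predictions, and vice versa.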
- Nisan 18 Apr 2024 17:05 UTC
  3 points
  I suppose if you had more hidden states than observables, you could distinguish hidden-state prediction from next-token prediction by the dimension of the fractal.
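A small sketch of why the two predictions could come apart in that case, assuming a hypothetical process with 4 hidden states and 3 tokens (the matrix below is made up for illustration). The rows of any 4×3 emission matrix are linearly dependent, so the map from beliefs to next-token distributions cannot be injective. Here the dependency is made explicit by choosing row 3 as the average of rows 0 and 1.

```python
from fractions import Fraction as F

# Hypothetical 4-state, 3-token emission matrix (rows: hidden states,
# columns: tokens). Illustrative numbers only.
E = [
    [F(7, 10), F(2, 10), F(1, 10)],
    [F(1, 10), F(7, 10), F(2, 10)],
    [F(2, 10), F(1, 10), F(7, 10)],
    [F(4, 10), F(9, 20), F(3, 20)],  # equals the average of rows 0 and 1
]

def next_token_probs(belief):
    """Map a belief over hidden states to a next-token distribution: p = b E."""
    n_states, n_tokens = len(E), len(E[0])
    return [sum(belief[s] * E[s][x] for s in range(n_states))
            for x in range(n_tokens)]

# Two different beliefs over the four hidden states...
b1 = [F(1, 2), F(1, 2), F(0), F(0)]
b2 = [F(0), F(0), F(0), F(1)]

# ...that yield identical next-token predictions.
print(b1 != b2)
print(next_token_probs(b1) == next_token_probs(b2))
```

In this toy setup at least one direction of belief space is invisible to next-token prediction, so the set of beliefs can span more dimensions than the set of next-token distributions. That is the kind of gap a dimension comparison might pick up.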