Adam Shai comments on Transformers Represent Belief State Geometry in their Residual Stream

Adam Shai 17 Apr 2024 18:26 UTC
1 point
0
Thanks! I appreciate the critique. From this comment and from John’s it seems correct and I’ll keep it in mind for the future.
On the question: by "optimize the representation", do you mean causally intervening on the residual stream during inference (e.g. a patching experiment)? Or do you mean something else that involves backprop? If the first, then we haven’t tried it, but definitely want to! It could be something someone does at the Hackathon, if interested ;)
- Leon Lang 17 Apr 2024 20:46 UTC
  1 point
  0
  Yes, the first! Thanks for the link!