Yeah I think we have the same understanding here (in hindsight I should have made this more explicit in the post / title).
I would be excited to see someone empirically try to answer the question you mention at the end. In particular, given some direction u and a LayerNormed vector v, one might try to quantify how smoothly rotating from v towards u changes the output of the MLP layer. This seems like a good test of whether the Polytope Lens is helpful / necessary for understanding the MLPs of Transformers (smooth changes would correspond to your 'random jostling cancels out' picture, i.e. to not needing to worry about Polytope Lens style issues).
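To make the proposed experiment concrete, here is a rough sketch of what I have in mind (all specifics are my own choices, not from the post: a randomly initialized toy ReLU MLP stands in for a real Transformer MLP, and I rotate along the sphere via spherical interpolation):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 32
# Toy stand-in for a Transformer MLP layer: d -> 4d -> d with ReLU.
W_in = rng.normal(size=(4 * d, d)) / np.sqrt(d)
W_out = rng.normal(size=(d, 4 * d)) / np.sqrt(4 * d)

def mlp(x):
    return W_out @ np.maximum(W_in @ x, 0.0)

def slerp(v, u, t):
    # Spherical interpolation between unit vectors v and u.
    theta = np.arccos(np.clip(v @ u, -1.0, 1.0))
    return (np.sin((1 - t) * theta) * v + np.sin(t * theta) * u) / np.sin(theta)

# v plays the role of the LayerNormed vector, u the target direction.
v = rng.normal(size=d); v /= np.linalg.norm(v)
u = rng.normal(size=d); u /= np.linalg.norm(u)

ts = np.linspace(0.0, 1.0, 101)
outputs = np.stack([mlp(slerp(v, u, t)) for t in ts])
# One crude smoothness measure: sizes of successive output differences
# along the path. Spikes here would hint at Polytope-Lens-style boundary
# crossings mattering; a flat profile supports the 'jostling cancels' view.
step_sizes = np.linalg.norm(np.diff(outputs, axis=0), axis=1)
```

Obviously a real version would use an actual trained model's MLP and meaningful directions u, but the measurement itself could look something like this.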
Also: it seems like there would be an easier way to arrive at the observation this post makes, i.e. directly showing that kV and V get mapped to the same point by layer norm (excluding the epsilon).
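Just to spell that out, the scale invariance can be checked in a couple of lines (a minimal sketch, assuming LayerNorm without learned scale/bias, eps set to 0, and positive k):

```python
import numpy as np

def layer_norm(x, eps=0.0):
    # Standard LayerNorm over the last axis, no affine parameters.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
V = rng.normal(size=16)
# For any k > 0, both the mean and the std scale by k, so kV and V
# land on the same point after layer norm (exactly, when eps = 0).
for k in [0.5, 2.0, 100.0]:
    assert np.allclose(layer_norm(V), layer_norm(k * V))
```

With a nonzero eps the two points differ slightly, which is why the post's observation is only approximate for small norms.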
Don’t get me wrong, the circle is cool, but it seems like a bit of a detour.