Additionally, attention is run on the normalized x, meaning that only the “unscaled” version of x is moved between positions.
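A minimal sketch of why the scale doesn’t transfer: LayerNorm (without the learned affine, and in the pure-rescaling case rather than a general steering addition) is invariant to positive rescaling of its input, so attention at other positions sees essentially the same normalized vector regardless of how large x has grown. The `layer_norm` helper below is an illustrative stand-in, not any particular library’s implementation.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # LayerNorm without learned scale/bias: subtract the mean,
    # divide by the standard deviation.
    mu = x.mean(-1, keepdims=True)
    sigma = x.std(-1, keepdims=True)
    return (x - mu) / (sigma + eps)

x = np.random.randn(8)    # toy residual-stream vector at one position
scaled = 10.0 * x         # same direction, much larger norm

# Attention reads the normalized vector, which is (nearly) identical:
print(np.allclose(layer_norm(x), layer_norm(scaled), atol=1e-3))
```

So a large-coefficient write mostly changes what this position’s own later MLP/attention heads read, while what other positions attend to is the direction, not the magnitude.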
Thanks for writing this up, I hadn’t realized this. One conclusion I’m drawing is: If the values in the modified residual streams aren’t important to other computations in later sequence positions, then a large-coefficient addition will still lead to reasonable completions.
Yeah, assuming that by “not important” you mean “not relevant” (low attention score).