Steering vector: “I talk about weddings constantly” - “I do not talk about weddings constantly” before attention layer 20 with coefficient +4
| | Front | Middle | Back |
|---|---|---|---|
| Average number of wedding words | 0.70 | 0.81 | 0.87 |
@lisathiergart I’m curious if a linear increase in the number of words with position along the residual stream replicates for other prompts. Have you looked at this?
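
A rough way to check the "linear" part on the three numbers quoted above, and on any other prompt's results, is a least-squares line through them. This is only a sketch: encoding Front, Middle, and Back as equally spaced x = 0, 1, 2 is an assumption made here, not something stated in the table, and three points from one prompt cannot establish a trend.

```python
# Values copied from the table above. The equal spacing x = 0, 1, 2 for
# Front, Middle, Back is an assumption made for this sketch.
positions = ["Front", "Middle", "Back"]
avg_wedding_words = [0.70, 0.81, 0.87]


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


xs = list(range(len(positions)))
slope, intercept, r_squared = linear_fit(xs, avg_wedding_words)

# Successive differences: equal steps would indicate an exactly linear increase.
steps = [b - a for a, b in zip(avg_wedding_words, avg_wedding_words[1:])]

print(f"step sizes: {[round(s, 2) for s in steps]}")
print(f"slope per position: {slope:.3f}, R^2: {r_squared:.3f}")
```

Swapping in the averages from another prompt's Front/Middle/Back runs gives a like-for-like comparison of step sizes and fit quality.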