The most finicky part of our methodology (and the part I'm least satisfied with currently) is the selection of a direction.
For reproducibility of our Llama 3 results, I can share the positions and layers where we extracted the directions from:
8B: (position_idx = -1, layer_idx = 12)
70B: (position_idx = -5, layer_idx = 37)
The position indexing assumes use of this prompt template, with two newlines appended to the end.
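To make the indexing concrete, here is a minimal sketch of how a (position_idx, layer_idx) pair picks out one activation vector per prompt. Everything beyond the two index pairs is an assumption for illustration: the cached-activation layout `[n_prompts, n_layers, seq_len, d_model]`, the hypothetical helper `candidate_direction`, and the use of a normalized difference of means between two prompt sets, which is one common way to get a direction and is not stated above. It also assumes prompts are aligned at the end of the sequence, so a negative position means the same token for every prompt, and that the layer index matches however your activations were cached.

```python
import numpy as np

# (position_idx, layer_idx) pairs quoted above; negative positions count from the end.
EXTRACTION_SITES = {"8B": (-1, 12), "70B": (-5, 37)}


def candidate_direction(acts_a, acts_b, position_idx, layer_idx):
    """Hypothetical helper: normalized difference of mean activations at one site.

    acts_a, acts_b: arrays of shape [n_prompts, n_layers, seq_len, d_model]
    (an assumed layout), one array per prompt set.
    """
    # Select one d_model-sized vector per prompt, then average over prompts.
    mean_a = acts_a[:, layer_idx, position_idx, :].mean(axis=0)
    mean_b = acts_b[:, layer_idx, position_idx, :].mean(axis=0)
    diff = mean_a - mean_b
    return diff / np.linalg.norm(diff)


# Synthetic stand-in data, only to show the shapes involved.
rng = np.random.default_rng(0)
acts_a = rng.normal(size=(4, 16, 8, 32))
acts_b = rng.normal(size=(4, 16, 8, 32))

position_idx, layer_idx = EXTRACTION_SITES["8B"]
direction = candidate_direction(acts_a, acts_b, position_idx, layer_idx)
```

Because the positions are counted from the end, changing what comes after the template (for example, dropping the two trailing newlines) shifts which token each index refers to, so the pairs above only carry over if the prompt ending matches.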