We recommend using the gpt2-layers directory, which includes resid_pre layers 5-11, topk=30, 12288 features (the tokenized ('t') ones have learned lookup tables, pre-initialized with unigram residuals).
The folders pareto-sweep, init-sweep, and expansion-sweep contain parameter sweeps, with lookup tables fixed to 2x unigram residuals.
In addition to the code repo linked above, for now here is some quick code that loads the SAE, exposes the lookup table, and computes activations only:
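Since the quick-load snippet isn't reproduced here, the sketch below shows what such code could look like under assumed conventions: a TopK SAE whose reconstruction is offset by a per-token lookup table (pre-initialized with unigram residuals in the release). All names (`TokenizedSAE`, weight attributes, dimensions other than topk=30 and 12288 features) are illustrative, not the repo's actual API.

```python
# Hedged sketch of a tokenized TopK SAE: encode -> top-k sparsify -> decode,
# with a per-token lookup table added to the reconstruction.
# Weights here are random; in practice you would load them from the release.
import torch
import torch.nn as nn


class TokenizedSAE(nn.Module):
    def __init__(self, d_model: int, n_features: int, k: int, vocab_size: int):
        super().__init__()
        self.k = k
        self.W_enc = nn.Parameter(torch.randn(d_model, n_features) * 0.02)
        self.b_enc = nn.Parameter(torch.zeros(n_features))
        self.W_dec = nn.Parameter(torch.randn(n_features, d_model) * 0.02)
        self.b_dec = nn.Parameter(torch.zeros(d_model))
        # Per-token lookup table; in the release this is pre-initialized
        # with unigram residuals and then learned.
        self.lookup = nn.Embedding(vocab_size, d_model)

    def encode(self, resid: torch.Tensor) -> torch.Tensor:
        """Feature activations only: keep the top-k pre-activations, zero the rest."""
        pre = (resid - self.b_dec) @ self.W_enc + self.b_enc
        topk = torch.topk(pre, self.k, dim=-1)
        acts = torch.zeros_like(pre)
        acts.scatter_(-1, topk.indices, torch.relu(topk.values))
        return acts

    def forward(self, resid: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        acts = self.encode(resid)
        # Reconstruction = decoded features + bias + per-token lookup entry.
        return acts @ self.W_dec + self.b_dec + self.lookup(tokens)


# Example: GPT-2-small-like dimensions, topk=30, 12288 features.
sae = TokenizedSAE(d_model=768, n_features=12288, k=30, vocab_size=50257)
resid = torch.randn(4, 768)               # a batch of resid_pre vectors
tokens = torch.randint(0, 50257, (4,))    # the corresponding token ids
acts = sae.encode(resid)                  # "activations only" path
recon = sae(resid, tokens)                # reconstruction with lookup offset
```

The lookup table is exposed directly as `sae.lookup.weight` (shape `[vocab_size, d_model]`), so fixing it to 2x unigram residuals, as in the sweep folders, amounts to assigning that tensor and freezing it.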