I actually do have some dictionaries publicly hosted, trained only on the residual stream, along with some simple training code.
I want to integrate some basic visualizations (and include Anthropic’s tricks) before making a public post on it, but currently:
Dict on pythia-70m-deduped
Dict on pythia-410m-deduped
Both can be downloaded and interpreted with this notebook.
Easy training code for bespoke models is here.