Amazing! We found your original library super useful for our Attention SAEs research, so thanks for making this!
Connor Kissane
Karma: 176
We Inspected Every Head In GPT-2 Small using SAEs So You Don’t Have To
Attention SAEs Scale to GPT-2 Small
Sparse Autoencoders Work on Attention Layer Outputs
These puzzles are great, thanks for making them!
Code for this token filtering can be found in the appendix and the exact token list is linked.
Maybe I just missed it, but I’m not seeing this. Is the code still available?
Thanks for the comment! We always use the pre-ReLU feature activation, which is equal to the post-ReLU activation (given that the feature is active), and is a purely linear function of z. Edited the post for clarity.
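To make the point concrete, here is a minimal sketch (not the authors' code; the encoder parameter names `W_enc` and `b_enc` are illustrative) showing that the pre-ReLU feature activation is linear in z and coincides with the post-ReLU activation wherever the feature is active:

```python
import torch

# Hypothetical SAE encoder parameters (illustrative shapes only)
d_model, d_sae = 768, 768 * 8
W_enc = torch.randn(d_model, d_sae) / d_model**0.5
b_enc = torch.zeros(d_sae)

z = torch.randn(d_model)  # e.g. concatenated attention head outputs

pre_relu = z @ W_enc + b_enc      # purely linear (affine) function of z
post_relu = torch.relu(pre_relu)  # equals pre_relu wherever the feature fires

active = pre_relu > 0
assert torch.allclose(pre_relu[active], post_relu[active])
```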