CallumMcDougall

Karma: 1,868

ARENA 5.0 - Call for Applicants

Jan 30, 2025, 1:18 PM
35 points
2 comments, 6 min read, LW link

Scaling Sparse Feature Circuit Finding to Gemma 9B

Jan 10, 2025, 11:08 AM
86 points
11 comments, 17 min read, LW link

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders

Dec 11, 2024, 6:30 AM
82 points
6 comments, 2 min read, LW link
(www.neuronpedia.org)

AI Alignment Research Engineer Accelerator (ARENA): Call for applicants v4.0

Jul 6, 2024, 11:34 AM
57 points
7 comments, 6 min read, LW link

How ARENA course material gets made

CallumMcDougall, Jul 2, 2024, 6:04 PM
41 points
2 comments, 7 min read, LW link

A Selection of Randomly Selected SAE Features

Apr 1, 2024, 9:09 AM
109 points
2 comments, 4 min read, LW link

SAE-VIS: Announcement Post

Mar 31, 2024, 3:30 PM
74 points
8 comments, 1 min read, LW link

Mech Interp Challenge: January—Deciphering the Caesar Cipher Model

CallumMcDougall, Jan 1, 2024, 6:03 PM
17 points
0 comments, 3 min read, LW link

Interpretability with Sparse Autoencoders (Colab exercises)

CallumMcDougall, Nov 29, 2023, 12:56 PM
74 points
9 comments, 4 min read, LW link

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

CallumMcDougall, Nov 7, 2023, 9:43 AM
56 points
0 comments, 1 min read, LW link

Mech Interp Challenge: November—Deciphering the Cumulative Sum Model

CallumMcDougall, Nov 2, 2023, 5:10 PM
18 points
2 comments, 2 min read, LW link

[Paper] All’s Fair In Love And Love: Copy Suppression in GPT-2 Small

Oct 13, 2023, 6:32 PM
82 points
4 comments, 8 min read, LW link

Mech Interp Challenge: October—Deciphering the Sorted List Model

CallumMcDougall, Oct 3, 2023, 10:57 AM
23 points
0 comments, 3 min read, LW link

ARENA 2.0 - Impact Report

CallumMcDougall, Sep 26, 2023, 5:13 PM
35 points
5 comments, 13 min read, LW link

Mech Interp Challenge: September—Deciphering the Addition Model

CallumMcDougall, Sep 13, 2023, 10:23 PM
35 points
0 comments, 4 min read, LW link

Mech Interp Challenge: August—Deciphering the First Unique Character Model

CallumMcDougall, Aug 9, 2023, 7:14 PM
36 points
1 comment, 3 min read, LW link

Computational Thread Art

CallumMcDougall, Aug 6, 2023, 9:42 PM
75 points
2 comments, 6 min read, LW link

Six (and a half) intuitions for SVD

CallumMcDougall, Jul 4, 2023, 7:23 PM
71 points
1 comment, 1 min read, LW link

An Analogy for Understanding Transformers

CallumMcDougall, May 13, 2023, 12:20 PM
89 points
6 comments, 9 min read, LW link

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

CallumMcDougall, Apr 17, 2023, 8:30 PM
100 points
9 comments, 7 min read, LW link