Jannik Brinkmann
Karma: 137
Evaluating Sparse Autoencoders with Board Game Models
Adam Karvonen, Sam Marks, Can, Benjamin Wright, Jannik Brinkmann, Logan Riggs and Rico Angell
2 Aug 2024 19:50 UTC
38 points, 1 comment, 9 min read
Interpreting Preference Models w/ Sparse Autoencoders
Logan Riggs and Jannik Brinkmann
1 Jul 2024 21:35 UTC
74 points, 12 comments, 9 min read
Finding Backward Chaining Circuits in Transformers Trained on Tree Search
abhayesian, Jannik Brinkmann and Victor Levoso
28 May 2024 5:29 UTC
50 points, 1 comment, 9 min read
(arxiv.org)
Improving SAE’s by Sqrt()-ing L1 & Removing Lowest Activating Features
Logan Riggs and Jannik Brinkmann
15 Mar 2024 16:30 UTC
26 points, 5 comments, 4 min read