Georg Lange
Karma: 73
SAEs Discover Meaningful Features in the IOI Task
Alex Makelov, Georg Lange and Neel Nanda
5 Jun 2024 23:48 UTC
15 points, 2 comments, 10 min read
An Interpretability Illusion for Activation Patching of Arbitrary Subspaces
Georg Lange, Alex Makelov and Neel Nanda
29 Aug 2023 1:04 UTC
77 points, 4 comments, 1 min read