jacob_drori
Karma: 80
Open Source Automated Interpretability for Sparse Autoencoder Features
kh4dien, SrGonao, jacob_drori and Nora Belrose
30 Jul 2024 21:11 UTC
67 points, 1 comment, 13 min read
(blog.eleuther.ai)
A thought experiment to help persuade skeptics that power-seeking AI is plausible
jacob_drori
25 Nov 2023 23:26 UTC
2 points, 4 comments, 5 min read