mikes (Karma: 210)
Breaking Circuit Breakers
mikes and tbenthompson | 14 Jul 2024 18:57 UTC | 53 points | 13 comments | 1 min read | link: confirmlabs.org
Fluent dreaming for language models (AI interpretability method)
tbenthompson, mikes and Zygi Straznickas | 6 Feb 2024 6:02 UTC | 45 points | 5 comments | 1 min read | link: arxiv.org
Takeaways from the NeurIPS 2023 Trojan Detection Competition
mikes | 13 Jan 2024 12:35 UTC | 20 points | 2 comments | 1 min read | link: confirmlabs.org
[Question]
The literature on aluminum adjuvants is very suspicious. Small IQ tax is plausible—can any experts help me estimate it?
mikes | 4 Jul 2023 9:33 UTC | 61 points | 39 comments | 3 min read