dannyhalawi
Karma: 135
Covert Malicious Finetuning
Tony Wang and dannyhalawi
2 Jul 2024 2:41 UTC · 89 points · 4 comments · 3 min read
Approaching Human-Level Forecasting with Language Models
Fred Zhang, dannyhalawi and jsteinhardt
29 Feb 2024 22:36 UTC · 60 points · 6 comments · 3 min read