ryan_greenblatt comments on Sycophancy to subterfuge: Investigating reward tampering in large language models
[deleted]