Neel Nanda comments on
[Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations
Neel Nanda
14 Jun 2024 23:28 UTC
3 points
Thanks for the additional context; that seems reasonable.