Ollie J comments on [Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations

Ollie J 13 Jun 2024 12:15 UTC
2 points
Fixed, thanks for flagging