cubefox comments on ozziegooen’s Shortform

cubefox 14 Feb 2025 18:26 UTC
2 points
Yeah, recent Claude does relatively well. Though I assume it also depends on how disinterested and analytical the phrasing of the prompt is (e.g. by explicitly mentioning the slur in question). I also wouldn’t rule out that Claude was specifically optimized for this somewhat notorious example.