evhub comments on "Robustness of Model-Graded Evaluations and Automated Interpretability"
evhub
17 Jul 2023 20:00 UTC
LW: 3 AF: 3
(Moderation note: added to the Alignment Forum from LessWrong.)