evhub comments on Robustness of Model-Graded Evaluations and Automated Interpretability

evhub 17 Jul 2023 20:00 UTC
LW: 3 AF: 3
(Moderation note: added to the Alignment Forum from LessWrong.)