evhub comments on unRLHF—Efficiently undoing LLM safeguards

evhub 14 Oct 2023 5:25 UTC
LW: 4 AF: 3
(Moderation note: added to the Alignment Forum from LessWrong.)