Charlie Steiner comments on Charlie Steiner’s Shortform
Charlie Steiner
5 Mar 2023 4:34 UTC
4 points
Interpretability as an RLHF problem seems like something to do.