nothoughtsheadempty

Karma: 12

Early Experiments in Reward Model Interpretation Using Sparse Autoencoders

Oct 3, 2023, 7:45 AM
17 points
0 comments · 5 min read

lunaiscoding’s Shortform

nothoughtsheadempty
May 18, 2023, 9:41 PM
1 point
1 comment · 1 min read