Kshitij Sachan

Karma: 343

Redwood Research

AI Control: Improving Safety Despite Intentional Subversion

Dec 13, 2023, 3:51 PM
236 points
24 comments, 10 min read, LW link, 4 reviews

LLMs are (mostly) not helped by filler tokens

Kshitij Sachan
Aug 10, 2023, 12:48 AM
66 points
35 comments, 6 min read, LW link