Collin

Karma: 473

http://collinpburns.com/

What AI Safety Materials Do ML Researchers Find Compelling?

Dec 28, 2022, 2:03 AM
175 points
34 comments · 2 min read · LW link

How “Discovering Latent Knowledge in Language Models Without Supervision” Fits Into a Broader Alignment Scheme

Collin · Dec 15, 2022, 6:22 PM
244 points
39 comments · 16 min read · LW link · 1 review