RSS

Andrew Mack

Karma: 315

Deep Causal Transcod­ing: A Frame­work for Mechanis­ti­cally Elic­it­ing La­tent Be­hav­iors in Lan­guage Models

Dec 3, 2024, 9:19 PM
106 points
8 comments41 min readLW link

Mechanis­ti­cally Elic­it­ing La­tent Be­hav­iors in Lan­guage Models

Apr 30, 2024, 6:51 PM
210 points
43 comments45 min readLW link