Sid Black

Karma: 779

The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable

Nov 28, 2022, 12:54 PM
199 points
33 comments · 31 min read

Conjecture Second Hiring Round

Nov 23, 2022, 5:11 PM
92 points
0 comments · 1 min read

Conjecture: a retrospective after 8 months of work

Nov 23, 2022, 5:10 PM
180 points
9 comments · 8 min read

Current themes in mechanistic interpretability research

Nov 16, 2022, 2:14 PM
89 points
2 comments · 12 min read

Interpreting Neural Networks through the Polytope Lens

Sep 23, 2022, 5:58 PM
144 points
29 comments · 33 min read

Conjecture: Internal Infohazard Policy

Jul 29, 2022, 7:07 PM
131 points
6 comments · 19 min read