scasper

Karma: 2,006

https://stephencasper.com/

EIS VI: Critiques of Mechanistic Interpretability Work in AI Safety

scasper · Feb 17, 2023, 8:48 PM
49 points
9 comments · 12 min read

EIS V: Blind Spots In AI Safety Interpretability Research

scasper · Feb 16, 2023, 7:09 PM
57 points
24 comments · 10 min read

EIS IV: A Spotlight on Feature Attribution/Saliency

scasper · Feb 15, 2023, 6:46 PM
19 points
1 comment · 4 min read

EIS III: Broad Critiques of Interpretability Research

scasper · Feb 14, 2023, 6:24 PM
20 points
2 comments · 11 min read

EIS II: What is “Interpretability”?

scasper · Feb 9, 2023, 4:48 PM
28 points
6 comments · 4 min read

The Engineer’s Interpretability Sequence (EIS) I: Intro

scasper · Feb 9, 2023, 4:28 PM
46 points
24 comments · 3 min read