Lucius Bushnaq

Karma: 1,610

AI notkilleveryoneism researcher at Apollo, focused on interpretability.

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks

20 May 2024 17:53 UTC
101 points
2 comments · 3 min read · LW link

Interpretability: Integrated Gradients is a decent attribution method

20 May 2024 17:55 UTC
14 points
7 comments · 6 min read · LW link