scasper

Karma: 2,009

https://stephencasper.com/

EIS IX: Interpretability and Adversaries

scasper · Feb 20, 2023, 6:25 PM
30 points
8 comments · 8 min read

EIS VIII: An Engineer’s Understanding of Deceptive Alignment

scasper · Feb 19, 2023, 3:25 PM
30 points
5 comments · 4 min read

EIS VII: A Challenge for Mechanists

scasper · Feb 18, 2023, 6:27 PM
36 points
4 comments · 3 min read

EIS VI: Critiques of Mechanistic Interpretability Work in AI Safety

scasper · Feb 17, 2023, 8:48 PM
49 points
9 comments · 12 min read

EIS V: Blind Spots In AI Safety Interpretability Research

scasper · Feb 16, 2023, 7:09 PM
57 points
24 comments · 10 min read

EIS IV: A Spotlight on Feature Attribution/Saliency

scasper · Feb 15, 2023, 6:46 PM
19 points
1 comment · 4 min read

EIS III: Broad Critiques of Interpretability Research

scasper · Feb 14, 2023, 6:24 PM
20 points
2 comments · 11 min read

EIS II: What is “Interpretability”?

scasper · Feb 9, 2023, 4:48 PM
28 points
6 comments · 4 min read

The Engineer’s Interpretability Sequence (EIS) I: Intro

scasper · Feb 9, 2023, 4:28 PM
46 points
24 comments · 3 min read

Avoiding perpetual risk from TAI

scasper · Dec 26, 2022, 10:34 PM
15 points
6 comments · 5 min read

Existential AI Safety is NOT separate from near-term applications

scasper · Dec 13, 2022, 2:47 PM
37 points
17 comments · 3 min read

Where to be an AI Safety Professor

scasper · Dec 7, 2022, 7:09 AM
31 points
12 comments · 2 min read

The Slippery Slope from DALLE-2 to Deepfake Anarchy

scasper · Nov 5, 2022, 2:53 PM
17 points
9 comments · 11 min read

[Linkpost] A survey on over 300 works about interpretability in deep networks

scasper · Sep 12, 2022, 7:07 PM
97 points
7 comments · 2 min read
(arxiv.org)

Pitfalls with Proofs

scasper · Jul 19, 2022, 10:21 PM
19 points
21 comments · 8 min read

A daily routine I do for my AI safety research work

scasper · Jul 19, 2022, 9:58 PM
22 points
7 comments · 1 min read

Deep Dives: My Advice for Pursuing Work in Research

scasper · Mar 11, 2022, 5:56 PM
33 points
2 comments · 3 min read

The Achilles Heel Hypothesis for AI

scasper · Oct 13, 2020, 2:35 PM
20 points
6 comments · 1 min read

Procrastination Paradoxes: the Good, the Bad, and the Ugly

scasper · Aug 6, 2020, 7:47 PM
21 points
4 comments · 10 min read

Solipsism is Underrated

scasper · Mar 28, 2020, 6:09 PM
11 points
30 comments · 3 min read