scasper
Karma: 2,009
https://stephencasper.com/
EIS IX: Interpretability and Adversaries
scasper | Feb 20, 2023, 6:25 PM | 30 points | 8 comments | 8 min read

EIS VIII: An Engineer’s Understanding of Deceptive Alignment
scasper | Feb 19, 2023, 3:25 PM | 30 points | 5 comments | 4 min read

EIS VII: A Challenge for Mechanists
scasper | Feb 18, 2023, 6:27 PM | 36 points | 4 comments | 3 min read

EIS VI: Critiques of Mechanistic Interpretability Work in AI Safety
scasper | Feb 17, 2023, 8:48 PM | 49 points | 9 comments | 12 min read

EIS V: Blind Spots In AI Safety Interpretability Research
scasper | Feb 16, 2023, 7:09 PM | 57 points | 24 comments | 10 min read

EIS IV: A Spotlight on Feature Attribution/Saliency
scasper | Feb 15, 2023, 6:46 PM | 19 points | 1 comment | 4 min read

EIS III: Broad Critiques of Interpretability Research
scasper | Feb 14, 2023, 6:24 PM | 20 points | 2 comments | 11 min read

EIS II: What is “Interpretability”?
scasper | Feb 9, 2023, 4:48 PM | 28 points | 6 comments | 4 min read

The Engineer’s Interpretability Sequence (EIS) I: Intro
scasper | Feb 9, 2023, 4:28 PM | 46 points | 24 comments | 3 min read

Avoiding perpetual risk from TAI
scasper | Dec 26, 2022, 10:34 PM | 15 points | 6 comments | 5 min read

Existential AI Safety is NOT separate from near-term applications
scasper | Dec 13, 2022, 2:47 PM | 37 points | 17 comments | 3 min read

Where to be an AI Safety Professor
scasper | Dec 7, 2022, 7:09 AM | 31 points | 12 comments | 2 min read

The Slippery Slope from DALLE-2 to Deepfake Anarchy
scasper | Nov 5, 2022, 2:53 PM | 17 points | 9 comments | 11 min read

[Linkpost] A survey on over 300 works about interpretability in deep networks (arxiv.org)
scasper | Sep 12, 2022, 7:07 PM | 97 points | 7 comments | 2 min read

Pitfalls with Proofs
scasper | Jul 19, 2022, 10:21 PM | 19 points | 21 comments | 8 min read

A daily routine I do for my AI safety research work
scasper | Jul 19, 2022, 9:58 PM | 22 points | 7 comments | 1 min read

Deep Dives: My Advice for Pursuing Work in Research
scasper | Mar 11, 2022, 5:56 PM | 33 points | 2 comments | 3 min read

The Achilles Heel Hypothesis for AI
scasper | Oct 13, 2020, 2:35 PM | 20 points | 6 comments | 1 min read

Procrastination Paradoxes: the Good, the Bad, and the Ugly
scasper | Aug 6, 2020, 7:47 PM | 21 points | 4 comments | 10 min read

Solipsism is Underrated
scasper | Mar 28, 2020, 6:09 PM | 11 points | 30 comments | 3 min read