Quintin Pope · Karma: 4,839
Counting arguments provide no evidence for AI doom
Nora Belrose and Quintin Pope · 27 Feb 2024 23:03 UTC · 94 points · 188 comments · 14 min read · LW link
QAPR 5: grokking is maybe not *that* big a deal?
Quintin Pope · 23 Jul 2023 20:14 UTC · 114 points · 15 comments · 9 min read · LW link
Research agenda: Supervising AIs improving AIs
Quintin Pope, Owen D, Roman Engeler and jacquesthibs · 29 Apr 2023 17:09 UTC · 76 points · 5 comments · 19 min read · LW link
Evolution provides no evidence for the sharp left turn
Quintin Pope · 11 Apr 2023 18:43 UTC · 205 points · 62 comments · 15 min read · LW link
Quintin Pope’s Shortform
Quintin Pope · 26 Mar 2023 1:48 UTC · 7 points · 9 comments · 1 min read · LW link
My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”
Quintin Pope · 21 Mar 2023 0:06 UTC · 357 points · 230 comments · 39 min read · LW link
A Short Dialogue on the Meaning of Reward Functions
Leon Lang, Quintin Pope and peligrietzer · 19 Nov 2022 21:04 UTC · 45 points · 0 comments · 3 min read · LW link
QAPR 4: Inductive biases
Quintin Pope · 10 Oct 2022 22:08 UTC · 67 points · 2 comments · 18 min read · LW link
QAPR 3: interpretability-guided training of neural nets
Quintin Pope · 28 Sep 2022 16:02 UTC · 58 points · 2 comments · 10 min read · LW link
Quintin’s alignment papers roundup—week 2
Quintin Pope · 19 Sep 2022 13:41 UTC · 67 points · 2 comments · 10 min read · LW link
Quintin’s alignment papers roundup—week 1
Quintin Pope · 10 Sep 2022 6:39 UTC · 120 points · 6 comments · 9 min read · LW link
The shard theory of human values
Quintin Pope and TurnTrout · 4 Sep 2022 4:28 UTC · 248 points · 67 comments · 24 min read · LW link · 2 reviews
Evolution is a bad analogy for AGI: inner alignment
Quintin Pope · 13 Aug 2022 22:15 UTC · 78 points · 15 comments · 8 min read · LW link
Humans provide an untapped wealth of evidence about alignment
TurnTrout and Quintin Pope · 14 Jul 2022 2:31 UTC · 210 points · 94 comments · 9 min read · LW link · 1 review
[Question] What’s the “This AI is of moral concern.” fire alarm?
Quintin Pope · 13 Jun 2022 8:05 UTC · 37 points · 56 comments · 2 min read · LW link
[Question] Any prior work on multiagent dynamics for continuous distributions over agents?
Quintin Pope · 1 Jun 2022 18:12 UTC · 15 points · 2 comments · 1 min read · LW link
Idea: build alignment dataset for very capable models
Quintin Pope · 12 Feb 2022 19:30 UTC · 14 points · 2 comments · 3 min read · LW link
Hypothesis: gradient descent prefers general circuits
Quintin Pope · 8 Feb 2022 21:12 UTC · 46 points · 26 comments · 11 min read · LW link
The Case for Radical Optimism about Interpretability
Quintin Pope · 16 Dec 2021 23:38 UTC · 66 points · 16 comments · 8 min read · LW link · 1 review
[Linkpost] A General Language Assistant as a Laboratory for Alignment
Quintin Pope · 3 Dec 2021 19:42 UTC · 37 points · 2 comments · 2 min read · LW link