
Quintin Pope

Karma: 4,838

Counting arguments provide no evidence for AI doom

27 Feb 2024 23:03 UTC
95 points
188 comments · 14 min read · LW link

QAPR 5: grokking is maybe not *that* big a deal?

Quintin Pope · 23 Jul 2023 20:14 UTC
114 points
15 comments · 9 min read · LW link

Research agenda: Supervising AIs improving AIs

29 Apr 2023 17:09 UTC
76 points
5 comments · 19 min read · LW link

Evolution provides no evidence for the sharp left turn

Quintin Pope · 11 Apr 2023 18:43 UTC
204 points
62 comments · 15 min read · LW link

Quintin Pope’s Shortform

Quintin Pope · 26 Mar 2023 1:48 UTC
7 points
9 comments · 1 min read · LW link

My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”

Quintin Pope · 21 Mar 2023 0:06 UTC
357 points
230 comments · 39 min read · LW link

A Short Dialogue on the Meaning of Reward Functions

19 Nov 2022 21:04 UTC
45 points
0 comments · 3 min read · LW link

QAPR 4: Inductive biases

Quintin Pope · 10 Oct 2022 22:08 UTC
67 points
2 comments · 18 min read · LW link

QAPR 3: interpretability-guided training of neural nets

Quintin Pope · 28 Sep 2022 16:02 UTC
58 points
2 comments · 10 min read · LW link

Quintin’s alignment papers roundup – week 2

Quintin Pope · 19 Sep 2022 13:41 UTC
67 points
2 comments · 10 min read · LW link

Quintin’s alignment papers roundup – week 1

Quintin Pope · 10 Sep 2022 6:39 UTC
120 points
6 comments · 9 min read · LW link

The shard theory of human values

4 Sep 2022 4:28 UTC
248 points
67 comments · 24 min read · LW link · 2 reviews

Evolution is a bad analogy for AGI: inner alignment

Quintin Pope · 13 Aug 2022 22:15 UTC
78 points
15 comments · 8 min read · LW link

Humans provide an untapped wealth of evidence about alignment

14 Jul 2022 2:31 UTC
210 points
94 comments · 9 min read · LW link · 1 review

[Question] What’s the “This AI is of moral concern.” fire alarm?

Quintin Pope · 13 Jun 2022 8:05 UTC
37 points
56 comments · 2 min read · LW link

[Question] Any prior work on multiagent dynamics for continuous distributions over agents?

Quintin Pope · 1 Jun 2022 18:12 UTC
15 points
2 comments · 1 min read · LW link

Idea: build alignment dataset for very capable models

Quintin Pope · 12 Feb 2022 19:30 UTC
14 points
2 comments · 3 min read · LW link

Hypothesis: gradient descent prefers general circuits

Quintin Pope · 8 Feb 2022 21:12 UTC
46 points
26 comments · 11 min read · LW link

The Case for Radical Optimism about Interpretability

Quintin Pope · 16 Dec 2021 23:38 UTC
66 points
16 comments · 8 min read · LW link · 1 review

[Linkpost] A General Language Assistant as a Laboratory for Alignment

Quintin Pope · 3 Dec 2021 19:42 UTC
37 points
2 comments · 2 min read · LW link