
Quintin Pope

Karma: 4,838

Counting arguments provide no evidence for AI doom

27 Feb 2024 23:03 UTC
95 points
188 comments · 14 min read · LW link

QAPR 5: grokking is maybe not *that* big a deal?

Quintin Pope · 23 Jul 2023 20:14 UTC
114 points
15 comments · 9 min read · LW link

Research agenda: Supervising AIs improving AIs

29 Apr 2023 17:09 UTC
76 points
5 comments · 19 min read · LW link

Evolution provides no evidence for the sharp left turn

Quintin Pope · 11 Apr 2023 18:43 UTC
204 points
62 comments · 15 min read · LW link

Quintin Pope’s Shortform

Quintin Pope · 26 Mar 2023 1:48 UTC
7 points
9 comments · 1 min read · LW link

My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”

Quintin Pope · 21 Mar 2023 0:06 UTC
357 points
230 comments · 39 min read · LW link

A Short Dialogue on the Meaning of Reward Functions

19 Nov 2022 21:04 UTC
45 points
0 comments · 3 min read · LW link

QAPR 4: Inductive biases

Quintin Pope · 10 Oct 2022 22:08 UTC
67 points
2 comments · 18 min read · LW link

QAPR 3: interpretability-guided training of neural nets

Quintin Pope · 28 Sep 2022 16:02 UTC
58 points
2 comments · 10 min read · LW link

Quintin’s alignment papers roundup – week 2

Quintin Pope · 19 Sep 2022 13:41 UTC
67 points
2 comments · 10 min read · LW link

Quintin’s alignment papers roundup – week 1

Quintin Pope · 10 Sep 2022 6:39 UTC
120 points
6 comments · 9 min read · LW link

The shard theory of human values

4 Sep 2022 4:28 UTC
248 points
67 comments · 24 min read · LW link · 2 reviews

Evolution is a bad analogy for AGI: inner alignment

Quintin Pope · 13 Aug 2022 22:15 UTC
78 points
15 comments · 8 min read · LW link

Humans provide an untapped wealth of evidence about alignment

14 Jul 2022 2:31 UTC
210 points
94 comments · 9 min read · LW link · 1 review

[Question] What’s the “This AI is of moral concern.” fire alarm?

Quintin Pope · 13 Jun 2022 8:05 UTC
37 points
56 comments · 2 min read · LW link

[Question] Any prior work on multiagent dynamics for continuous distributions over agents?

Quintin Pope · 1 Jun 2022 18:12 UTC
15 points
2 comments · 1 min read · LW link

Idea: build alignment dataset for very capable models

Quintin Pope · 12 Feb 2022 19:30 UTC
14 points
2 comments · 3 min read · LW link

Hypothesis: gradient descent prefers general circuits

Quintin Pope · 8 Feb 2022 21:12 UTC
46 points
26 comments · 11 min read · LW link

The Case for Radical Optimism about Interpretability

Quintin Pope · 16 Dec 2021 23:38 UTC
66 points
16 comments · 8 min read · LW link · 1 review

[Linkpost] A General Language Assistant as a Laboratory for Alignment

Quintin Pope · 3 Dec 2021 19:42 UTC
37 points
2 comments · 2 min read · LW link