
Quintin Pope

Karma: 4,883

Counting arguments provide no evidence for AI doom

Feb 27, 2024, 11:03 PM
100 points
188 comments · 14 min read · LW link

QAPR 5: grokking is maybe not *that* big a deal?

Quintin Pope · Jul 23, 2023, 8:14 PM
114 points
15 comments · 9 min read · LW link

Research agenda: Supervising AIs improving AIs

Apr 29, 2023, 5:09 PM
76 points
5 comments · 19 min read · LW link

Evolution provides no evidence for the sharp left turn

Quintin Pope · Apr 11, 2023, 6:43 PM
206 points
65 comments · 15 min read · LW link · 1 review

Quintin Pope’s Shortform

Quintin Pope · Mar 26, 2023, 1:48 AM
7 points
9 comments · 1 min read · LW link

My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”

Quintin Pope · Mar 21, 2023, 12:06 AM
358 points
232 comments · 39 min read · LW link · 1 review

A Short Dialogue on the Meaning of Reward Functions

Nov 19, 2022, 9:04 PM
45 points
0 comments · 3 min read · LW link

QAPR 4: Inductive biases

Quintin Pope · Oct 10, 2022, 10:08 PM
67 points
2 comments · 18 min read · LW link

QAPR 3: interpretability-guided training of neural nets

Quintin Pope · Sep 28, 2022, 4:02 PM
58 points
2 comments · 10 min read · LW link

Quintin’s alignment papers roundup—week 2

Quintin Pope · Sep 19, 2022, 1:41 PM
67 points
2 comments · 10 min read · LW link

Quintin’s alignment papers roundup—week 1

Quintin Pope · Sep 10, 2022, 6:39 AM
120 points
6 comments · 9 min read · LW link

The shard theory of human values

Sep 4, 2022, 4:28 AM
255 points
67 comments · 24 min read · LW link · 2 reviews

Evolution is a bad analogy for AGI: inner alignment

Quintin Pope · Aug 13, 2022, 10:15 PM
79 points
15 comments · 8 min read · LW link

Humans provide an untapped wealth of evidence about alignment

Jul 14, 2022, 2:31 AM
211 points
94 comments · 9 min read · LW link · 1 review

[Question] What’s the “This AI is of moral concern.” fire alarm?

Quintin Pope · Jun 13, 2022, 8:05 AM
37 points
56 comments · 2 min read · LW link

[Question] Any prior work on multiagent dynamics for continuous distributions over agents?

Quintin Pope · Jun 1, 2022, 6:12 PM
15 points
2 comments · 1 min read · LW link

Idea: build alignment dataset for very capable models

Quintin Pope · Feb 12, 2022, 7:30 PM
14 points
2 comments · 3 min read · LW link

Hypothesis: gradient descent prefers general circuits

Quintin Pope · Feb 8, 2022, 9:12 PM
46 points
26 comments · 11 min read · LW link

The Case for Radical Optimism about Interpretability

Quintin Pope · Dec 16, 2021, 11:38 PM
66 points
16 comments · 8 min read · LW link · 1 review

[Linkpost] A General Language Assistant as a Laboratory for Alignment

Quintin Pope · Dec 3, 2021, 7:42 PM
37 points
2 comments · 2 min read · LW link