paulfchristiano

Karma: 27,900

Teaching ML to answer questions honestly instead of predicting human answers

paulfchristiano · May 28, 2021, 5:30 PM
53 points
18 comments · 16 min read · LW link
(ai-alignment.com)

Decoupling deliberation from competition

paulfchristiano · May 25, 2021, 6:50 PM
84 points
17 comments · 9 min read · LW link · 1 review
(ai-alignment.com)

Mundane solutions to exotic problems

paulfchristiano · May 4, 2021, 6:20 PM
56 points
8 comments · 5 min read · LW link
(ai-alignment.com)

Low-stakes alignment

paulfchristiano · Apr 30, 2021, 12:10 AM
87 points
11 comments · 7 min read · LW link · 1 review
(ai-alignment.com)

AMA: Paul Christiano, alignment researcher

paulfchristiano · Apr 28, 2021, 6:55 PM
117 points
197 comments · 1 min read · LW link

Announcing the Alignment Research Center

paulfchristiano · Apr 26, 2021, 11:30 PM
178 points
6 comments · 1 min read · LW link
(ai-alignment.com)

Another (outer) alignment failure story

paulfchristiano · Apr 7, 2021, 8:12 PM
249 points
38 comments · 12 min read · LW link · 1 review

My research methodology

paulfchristiano · Mar 22, 2021, 9:20 PM
159 points
38 comments · 16 min read · LW link · 1 review
(ai-alignment.com)

Demand offsetting

paulfchristiano · Mar 21, 2021, 6:20 PM
133 points
41 comments · 5 min read · LW link
(sideways-view.com)

It’s not economically inefficient for a UBI to reduce recipient’s employment

paulfchristiano · Nov 22, 2020, 4:40 PM
93 points
60 comments · 4 min read · LW link
(sideways-view.com)

Hiring engineers and researchers to help align GPT-3

paulfchristiano · Oct 1, 2020, 6:54 PM
206 points
13 comments · 3 min read · LW link

“Unsupervised” translation as an (intent) alignment problem

paulfchristiano · Sep 30, 2020, 12:50 AM
62 points
15 comments · 4 min read · LW link
(ai-alignment.com)

Distributed public goods provision

paulfchristiano · Sep 26, 2020, 9:20 PM
27 points
3 comments · 5 min read · LW link
(sideways-view.com)

Better priors as a safety problem

paulfchristiano · Jul 5, 2020, 9:20 PM
66 points
7 comments · 5 min read · LW link
(ai-alignment.com)

Learning the prior

paulfchristiano · Jul 5, 2020, 9:00 PM
92 points
28 comments · 8 min read · LW link
(ai-alignment.com)

Inaccessible information

paulfchristiano · Jun 3, 2020, 5:10 AM
83 points
17 comments · 14 min read · LW link · 2 reviews
(ai-alignment.com)

Writeup: Progress on AI Safety via Debate

Feb 5, 2020, 9:04 PM
103 points
18 comments · 33 min read · LW link

Hedonic asymmetries

paulfchristiano · Jan 26, 2020, 2:10 AM
98 points
22 comments · 2 min read · LW link
(sideways-view.com)

Moral public goods

paulfchristiano · Jan 26, 2020, 12:10 AM
147 points
74 comments · 4 min read · LW link
(sideways-view.com)

Of arguments and wagers

paulfchristiano · Jan 10, 2020, 10:20 PM
52 points
6 comments · 6 min read · LW link
(ai-alignment.com)