paulfchristiano

Karma: 27,791

Matrix completion prize results

paulfchristiano
Dec 20, 2023, 3:40 PM
41 points
0 comments
2 min read
LW link
(www.alignment.org)

Thoughts on responsible scaling policies and regulation

paulfchristiano
Oct 24, 2023, 10:21 PM
220 points
33 comments
6 min read
LW link

Thoughts on sharing information about language model capabilities

paulfchristiano
Jul 31, 2023, 4:04 PM
210 points
44 comments
11 min read
LW link
1 review

Self-driving car bets

paulfchristiano
Jul 29, 2023, 6:10 PM
235 points
44 comments
5 min read
LW link
(sideways-view.com)

ARC is hiring theoretical researchers

Jun 12, 2023, 6:50 PM
126 points
12 comments
4 min read
LW link
(www.alignment.org)

Prizes for matrix completion problems

paulfchristiano
May 3, 2023, 11:30 PM
164 points
52 comments
1 min read
LW link
(www.alignment.org)

My views on “doom”

paulfchristiano
Apr 27, 2023, 5:50 PM
250 points
37 comments
2 min read
LW link
1 review
(ai-alignment.com)

Christiano (ARC) and GA (Conjecture) Discuss Alignment Cruxes

Feb 24, 2023, 11:03 PM
61 points
7 comments
47 min read
LW link

Thoughts on the impact of RLHF research

paulfchristiano
Jan 25, 2023, 5:23 PM
252 points
102 comments
9 min read
LW link

Can we efficiently distinguish different mechanisms?

paulfchristiano
Dec 27, 2022, 12:20 AM
88 points
30 comments
16 min read
LW link
(ai-alignment.com)

Three reasons to cooperate

paulfchristiano
Dec 24, 2022, 5:40 PM
82 points
14 comments
10 min read
LW link
(sideways-view.com)

Can we efficiently explain model behaviors?

paulfchristiano
Dec 16, 2022, 7:40 PM
64 points
3 comments
9 min read
LW link
(ai-alignment.com)

AI alignment is distinct from its near-term applications

paulfchristiano
Dec 13, 2022, 7:10 AM
255 points
21 comments
2 min read
LW link
(ai-alignment.com)

Finding gliders in the game of life

paulfchristiano
Dec 1, 2022, 8:40 PM
101 points
8 comments
16 min read
LW link
(ai-alignment.com)

Mechanistic anomaly detection and ELK

paulfchristiano
Nov 25, 2022, 6:50 PM
134 points
22 comments
21 min read
LW link
(ai-alignment.com)

Decision theory and dynamic inconsistency

paulfchristiano
Jul 3, 2022, 10:20 PM
80 points
33 comments
10 min read
LW link
(sideways-view.com)

AI-Written Critiques Help Humans Notice Flaws

paulfchristiano
Jun 25, 2022, 5:22 PM
137 points
5 comments
3 min read
LW link
(openai.com)

Where I agree and disagree with Eliezer

paulfchristiano
Jun 19, 2022, 7:15 PM
898 points
223 comments
18 min read
LW link
2 reviews

What is causality to an evidential decision theorist?

paulfchristiano
Apr 17, 2022, 4:00 PM
45 points
26 comments
5 min read
LW link
(sideways-view.com)

ELK prize results

Mar 9, 2022, 12:01 AM
138 points
50 comments
21 min read
LW link