peterbarnett

Karma: 2,727

Researcher at MIRI

EA and AI safety

https://peterbarnett.org/

AI Governance to Avoid Extinction: The Strategic Landscape and Actionable Research Questions

May 1, 2025, 10:46 PM
90 points
2 comments · 8 min read · LW link
(techgov.intelligence.org)

Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI

Jan 26, 2024, 7:22 AM
161 points
60 comments · 57 min read · LW link

Trying to align humans with inclusive genetic fitness

peterbarnett · Jan 11, 2024, 12:13 AM
23 points
5 comments · 10 min read · LW link

Labs should be explicit about why they are building AGI

peterbarnett · Oct 17, 2023, 9:09 PM
210 points
18 comments · 1 min read · LW link · 1 review

Thomas Kwa’s MIRI research experience

Oct 2, 2023, 4:42 PM
172 points
53 comments · 1 min read · LW link

Doing oversight from the very start of training seems hard

peterbarnett · Sep 20, 2022, 5:21 PM
14 points
3 comments · 3 min read · LW link

Confusions in My Model of AI Risk

peterbarnett · Jul 7, 2022, 1:05 AM
22 points
9 comments · 5 min read · LW link

Scott Aaronson is joining OpenAI to work on AI safety

peterbarnett · Jun 18, 2022, 4:06 AM
117 points
31 comments · 1 min read · LW link
(scottaaronson.blog)

A Story of AI Risk: InstructGPT-N

peterbarnett · May 26, 2022, 11:22 PM
24 points
0 comments · 8 min read · LW link

Why I’m Worried About AI

peterbarnett · May 23, 2022, 9:13 PM
22 points
2 comments · 12 min read · LW link

Framings of Deceptive Alignment

peterbarnett · Apr 26, 2022, 4:25 AM
32 points
7 comments · 5 min read · LW link

How to become an AI safety researcher

peterbarnett · Apr 15, 2022, 11:41 AM
25 points
0 comments · 14 min read · LW link

Thoughts on Dangerous Learned Optimization

peterbarnett · Feb 19, 2022, 10:46 AM
4 points
2 comments · 4 min read · LW link

peterbarnett’s Shortform

peterbarnett · Feb 16, 2022, 5:24 PM
3 points
27 comments · LW link

Alignment Problems All the Way Down

peterbarnett · Jan 22, 2022, 12:19 AM
29 points
7 comments · 11 min read · LW link

[Question] What questions do you have about doing work on AI safety?

peterbarnett · Dec 21, 2021, 4:36 PM
13 points
8 comments · 1 min read · LW link

Some motivations to gradient hack

peterbarnett · Dec 17, 2021, 3:06 AM
8 points
0 comments · 6 min read · LW link

Understanding Gradient Hacking

peterbarnett · Dec 10, 2021, 3:58 PM
41 points
5 comments · 30 min read · LW link

When Should the Fire Alarm Go Off: A model for optimal thresholds

peterbarnett · Apr 28, 2021, 12:27 PM
40 points
4 comments · 5 min read · LW link
(peterbarnett.org)

Does making unsteady incremental progress work?

peterbarnett · Mar 5, 2021, 7:23 AM
8 points
4 comments · 1 min read · LW link
(peterbarnett.org)