peterbarnett (Karma: 2,554)
Researcher at MIRI. EA and AI safety.
https://peterbarnett.org/
Posts
Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI · Jeremy Gillen and peterbarnett · 26 Jan 2024 7:22 UTC · 160 points · 60 comments · 57 min read · LW link
Trying to align humans with inclusive genetic fitness · peterbarnett · 11 Jan 2024 0:13 UTC · 23 points · 5 comments · 10 min read · LW link
Labs should be explicit about why they are building AGI · peterbarnett · 17 Oct 2023 21:09 UTC · 195 points · 17 comments · 1 min read · LW link
Thomas Kwa’s MIRI research experience · Thomas Kwa, peterbarnett, Vivek Hebbar, Jeremy Gillen, jacobjacob and Raemon · 2 Oct 2023 16:42 UTC · 172 points · 53 comments · 1 min read · LW link
Doing oversight from the very start of training seems hard · peterbarnett · 20 Sep 2022 17:21 UTC · 14 points · 3 comments · 3 min read · LW link
Confusions in My Model of AI Risk · peterbarnett · 7 Jul 2022 1:05 UTC · 22 points · 9 comments · 5 min read · LW link
Scott Aaronson is joining OpenAI to work on AI safety · peterbarnett · 18 Jun 2022 4:06 UTC · 117 points · 31 comments · 1 min read · LW link (scottaaronson.blog)
A Story of AI Risk: InstructGPT-N · peterbarnett · 26 May 2022 23:22 UTC · 24 points · 0 comments · 8 min read · LW link
Why I’m Worried About AI · peterbarnett · 23 May 2022 21:13 UTC · 22 points · 2 comments · 12 min read · LW link
Framings of Deceptive Alignment · peterbarnett · 26 Apr 2022 4:25 UTC · 32 points · 7 comments · 5 min read · LW link
How to become an AI safety researcher · peterbarnett · 15 Apr 2022 11:41 UTC · 23 points · 0 comments · 14 min read · LW link
Thoughts on Dangerous Learned Optimization · peterbarnett · 19 Feb 2022 10:46 UTC · 4 points · 2 comments · 4 min read · LW link
peterbarnett’s Shortform · peterbarnett · 16 Feb 2022 17:24 UTC · 3 points · 27 comments · 1 min read · LW link
Alignment Problems All the Way Down · peterbarnett · 22 Jan 2022 0:19 UTC · 29 points · 7 comments · 11 min read · LW link
[Question] What questions do you have about doing work on AI safety? · peterbarnett · 21 Dec 2021 16:36 UTC · 13 points · 8 comments · 1 min read · LW link
Some motivations to gradient hack · peterbarnett · 17 Dec 2021 3:06 UTC · 8 points · 0 comments · 6 min read · LW link
Understanding Gradient Hacking · peterbarnett · 10 Dec 2021 15:58 UTC · 41 points · 5 comments · 30 min read · LW link
When Should the Fire Alarm Go Off: A model for optimal thresholds · peterbarnett · 28 Apr 2021 12:27 UTC · 40 points · 4 comments · 5 min read · LW link (peterbarnett.org)
Does making unsteady incremental progress work? · peterbarnett · 5 Mar 2021 7:23 UTC · 8 points · 4 comments · 1 min read · LW link (peterbarnett.org)
Summary of AI Research Considerations for Human Existential Safety (ARCHES) · peterbarnett · 9 Dec 2020 23:28 UTC · 11 points · 0 comments · 13 min read · LW link