Social Dark Matter

Duncan Sabien (Deactivated) · Nov 16, 2023, 8:00 PM
355 points
125 comments · 34 min read · LW link · 2 reviews

Shallow review of live agendas in alignment & safety

Nov 27, 2023, 11:10 AM
348 points
73 comments · 29 min read · LW link · 1 review

AI Timelines

Nov 10, 2023, 5:28 AM
300 points
135 comments · 51 min read · LW link · 2 reviews

OpenAI: The Battle of the Board

Zvi · Nov 22, 2023, 5:30 PM
281 points
83 comments · 11 min read · LW link
(thezvi.wordpress.com)

The 6D effect: When companies take risks, one email can be very powerful.

scasper · Nov 4, 2023, 8:08 PM
279 points
42 comments · 3 min read · LW link

OpenAI: Facts from a Weekend

Zvi · Nov 20, 2023, 3:30 PM
271 points
166 comments · 9 min read · LW link
(thezvi.wordpress.com)

The 101 Space You Will Always Have With You

Screwtape · Nov 29, 2023, 4:56 AM
271 points
22 comments · 6 min read · LW link · 1 review

What are the results of more parental supervision and less outdoor play?

juliawise · Nov 25, 2023, 12:52 PM
228 points
31 comments · 5 min read · LW link

Ability to solve long-horizon tasks correlates with wanting things in the behaviorist sense

So8res · Nov 24, 2023, 5:37 PM
197 points
84 comments · 5 min read · LW link · 1 review

Thinking By The Clock

Screwtape · Nov 8, 2023, 7:40 AM
196 points
29 comments · 8 min read · LW link · 1 review

Propaganda or Science: A Look at Open Source AI and Bioterrorism Risk

1a3orn · Nov 2, 2023, 6:20 PM
193 points
79 comments · 23 min read · LW link

Sam Altman fired from OpenAI

LawrenceC · Nov 17, 2023, 8:42 PM
192 points
75 comments · 1 min read · LW link
(openai.com)

The other side of the tidal wave

KatjaGrace · Nov 3, 2023, 5:40 AM
189 points
86 comments · 1 min read · LW link
(worldspiritsockpuppet.com)

How to (hopefully ethically) make money off of AGI

Nov 6, 2023, 11:35 PM
169 points
93 comments · 32 min read · LW link · 1 review

You can just spontaneously call people you haven’t met in years

lc · Nov 13, 2023, 5:21 AM
167 points
21 comments · 1 min read · LW link

Loudly Give Up, Don’t Quietly Fade

Screwtape · Nov 13, 2023, 11:30 PM
164 points
12 comments · 6 min read · LW link · 1 review

Vote on Interesting Disagreements

Ben Pace · Nov 7, 2023, 9:35 PM
159 points
131 comments · 1 min read · LW link

My thoughts on the social response to AI risk

Matthew Barnett · Nov 1, 2023, 9:17 PM
157 points
37 comments · 10 min read · LW link

Moral Reality Check (a short story)

jessicata · Nov 26, 2023, 5:03 AM
149 points
45 comments · 21 min read · LW link · 1 review
(unstableontology.com)

Does davidad’s uploading moonshot work?

Nov 3, 2023, 2:21 AM
146 points
35 comments · 25 min read · LW link

EA orgs’ legal structure inhibits risk taking and information sharing on the margin

Elizabeth · Nov 5, 2023, 7:13 PM
136 points
17 comments · 4 min read · LW link

Integrity in AI Governance and Advocacy

Nov 3, 2023, 7:52 PM
134 points
57 comments · 23 min read · LW link

Apocalypse insurance, and the hardline libertarian take on AI risk

So8res · Nov 28, 2023, 2:09 AM
134 points
40 comments · 7 min read · LW link · 1 review

One Day Sooner

Screwtape · Nov 2, 2023, 7:00 PM
118 points
8 comments · 8 min read · LW link · 1 review

8 examples informing my pessimism on uploading without reverse engineering

Steven Byrnes · Nov 3, 2023, 8:03 PM
117 points
12 comments · 12 min read · LW link

The Soul Key

Richard_Ngo · Nov 4, 2023, 5:51 PM
112 points
10 comments · 8 min read · LW link · 1 review
(www.narrativeark.xyz)

How much to update on recent AI governance moves?

Nov 16, 2023, 11:46 PM
112 points
5 comments · 29 min read · LW link

Deception Chess: Game #1

Nov 3, 2023, 9:13 PM
111 points
22 comments · 8 min read · LW link · 1 review

Experiences and learnings from both sides of the AI safety job market

Marius Hobbhahn · Nov 15, 2023, 3:40 PM
110 points
4 comments · 18 min read · LW link

Stuxnet, not Skynet: Humanity’s disempowerment by AI

Roko · Nov 4, 2023, 10:23 PM
107 points
24 comments · 6 min read · LW link

My techno-optimism [By Vitalik Buterin]

habryka · Nov 27, 2023, 11:53 PM
107 points
17 comments · 2 min read · LW link
(www.lesswrong.com)

New LessWrong feature: Dialogue Matching

Bird Concept · Nov 16, 2023, 9:27 PM
106 points
22 comments · 3 min read · LW link

Picking Mentors For Research Programmes

Raymond D · Nov 10, 2023, 1:01 PM
105 points
8 comments · 4 min read · LW link

Learning-theoretic agenda reading list

Vanessa Kosoy · Nov 9, 2023, 5:25 PM
103 points
1 comment · 2 min read · LW link · 1 review

On the Executive Order

Zvi · Nov 1, 2023, 2:20 PM
100 points
4 comments · 30 min read · LW link
(thezvi.wordpress.com)

Never Drop A Ball

Screwtape · Nov 23, 2023, 4:15 AM
99 points
8 comments · 6 min read · LW link · 1 review

Kids or No kids

Kids or no kids · Nov 14, 2023, 6:37 PM
98 points
10 comments · 13 min read · LW link

Coup probes: Catching catastrophes with probes trained off-policy

Fabien Roger · Nov 17, 2023, 5:58 PM
91 points
9 comments · 11 min read · LW link · 1 review

Public Call for Interest in Mathematical Alignment

Davidmanheim · Nov 22, 2023, 1:22 PM
90 points
9 comments · 1 min read · LW link

Large Language Models can Strategically Deceive their Users when Put Under Pressure.

ReaderM · Nov 15, 2023, 4:36 PM
89 points
9 comments · 2 min read · LW link · 1 review
(arxiv.org)

Growth and Form in a Toy Model of Superposition

Nov 8, 2023, 11:08 AM
89 points
7 comments · 14 min read · LW link

Untrusted smart models and trusted dumb models

Buck · Nov 4, 2023, 3:06 AM
87 points
17 comments · 6 min read · LW link · 1 review

Dario Amodei’s prepared remarks from the UK AI Safety Summit, on Anthropic’s Responsible Scaling Policy

Zac Hatfield-Dodds · Nov 1, 2023, 6:10 PM
85 points
1 comment · 4 min read · LW link
(www.anthropic.com)

Some Rules for an Algebra of Bayes Nets

Nov 16, 2023, 11:53 PM
84 points
44 comments · 14 min read · LW link · 1 review

Saying the quiet part out loud: trading off x-risk for personal immortality

disturbance · Nov 2, 2023, 5:43 PM
84 points
89 comments · 5 min read · LW link

My Criticism of Singular Learning Theory

Joar Skalse · Nov 19, 2023, 3:19 PM
83 points
56 comments · 12 min read · LW link

Agent Boundaries Aren’t Markov Blankets. [Unless they’re non-causal; see comments.]

abramdemski · Nov 20, 2023, 6:23 PM
82 points
11 comments · 2 min read · LW link

Bostrom Goes Unheard

Zvi · Nov 13, 2023, 2:11 PM
81 points
9 comments · 18 min read · LW link

New report: “Scheming AIs: Will AIs fake alignment during training in order to get power?”

Joe Carlsmith · Nov 15, 2023, 5:16 PM
81 points
28 comments · 30 min read · LW link · 1 review

Announcing Athena—Women in AI Alignment Research

Claire Short · Nov 7, 2023, 9:46 PM
80 points
2 comments · 3 min read · LW link