How it feels to have your mind hacked by an AI

blaked
Jan 12, 2023, 12:33 AM
368 points
222 comments
17 min read
LW link

On not getting contaminated by the wrong obesity ideas

Natália
Jan 28, 2023, 8:18 PM
306 points
69 comments
30 min read
LW link

Basics of Rationalist Discourse

Duncan Sabien (Deactivated)
Jan 27, 2023, 2:40 AM
282 points
193 comments
36 min read
LW link
4 reviews

We don’t trade with ants

KatjaGrace
Jan 10, 2023, 11:50 PM
271 points
109 comments
7 min read
LW link
1 review
(worldspiritsockpuppet.com)

My Model Of EA Burnout

LoganStrohl
Jan 25, 2023, 5:52 PM
259 points
50 comments
5 min read
LW link
1 review

Thoughts on the impact of RLHF research

paulfchristiano
Jan 25, 2023, 5:23 PM
253 points
102 comments
9 min read
LW link

Recursive Middle Manager Hell

Raemon
Jan 1, 2023, 4:33 AM
224 points
46 comments
11 min read
LW link
1 review

What a compute-centric framework says about AI takeoff speeds

Tom Davidson
Jan 23, 2023, 4:02 AM
188 points
30 comments
16 min read
LW link
1 review

Neural networks generalize because of this one weird trick

Jesse Hoogland
Jan 18, 2023, 12:10 AM
181 points
34 comments
15 min read
LW link
1 review
(www.jessehoogland.com)

Alexander and Yudkowsky on AGI goals

Jan 24, 2023, 9:09 PM
178 points
53 comments
26 min read
LW link
1 review

What I mean by “alignment is in large part about making cognition aimable at all”

So8res
Jan 30, 2023, 3:22 PM
171 points
25 comments
2 min read
LW link

Gradient hacking is extremely difficult

beren
Jan 24, 2023, 3:45 PM
164 points
22 comments
5 min read
LW link

Sapir-Whorf for Rationalists

Duncan Sabien (Deactivated)
Jan 25, 2023, 7:58 AM
154 points
49 comments
19 min read
LW link

“Heretical Thoughts on AI” by Eli Dourado

DragonGod
Jan 19, 2023, 4:11 PM
146 points
38 comments
3 min read
LW link
(www.elidourado.com)

Why didn’t we get the four-hour workday?

jasoncrawford
Jan 6, 2023, 9:29 PM
139 points
34 comments
6 min read
LW link
(rootsofprogress.org)

Wolf Incident Postmortem

jefftk
Jan 9, 2023, 3:20 AM
136 points
13 comments
1 min read
LW link
(www.jefftk.com)

How to slow down scientific progress, according to Leo Szilard

jasoncrawford
Jan 5, 2023, 6:26 PM
134 points
18 comments
2 min read
LW link
(rootsofprogress.org)

Basic Facts about Language Model Internals

Jan 4, 2023, 1:01 PM
130 points
19 comments
9 min read
LW link

Induction heads—illustrated

CallumMcDougall
Jan 2, 2023, 3:35 PM
128 points
12 comments
3 min read
LW link

How to Bounded Distrust

Zvi
Jan 9, 2023, 1:10 PM
122 points
17 comments
4 min read
LW link
1 review
(thezvi.wordpress.com)

Transcript of Sam Altman’s interview touching on AI safety

Andy_McKenzie
Jan 20, 2023, 4:14 PM
121 points
42 comments
10 min read
LW link

Compendium of problems with RLHF

Charbel-Raphaël
Jan 29, 2023, 11:40 AM
120 points
16 comments
10 min read
LW link

AGI and the EMH: markets are not expecting aligned or unaligned AI in the next 30 years

Jan 10, 2023, 4:06 PM
119 points
44 comments
26 min read
LW link

Soft optimization makes the value target bigger

Jeremy Gillen
Jan 2, 2023, 4:06 PM
119 points
20 comments
12 min read
LW link

Why I’m joining Anthropic

evhub
Jan 5, 2023, 1:12 AM
118 points
4 comments
2 min read
LW link

Touch reality as soon as possible (when doing machine learning research)

LawrenceC
Jan 3, 2023, 7:11 PM
117 points
9 comments
8 min read
LW link
1 review

The Fountain of Health: a First Principles Guide to Rejuvenation

PhilJackson
Jan 7, 2023, 6:34 PM
115 points
38 comments
41 min read
LW link

Running by Default

jefftk
Jan 5, 2023, 1:50 PM
112 points
40 comments
1 min read
LW link
(www.jefftk.com)

Iron deficiencies are very bad and you should treat them

Elizabeth
Jan 12, 2023, 9:10 AM
108 points
34 comments
11 min read
LW link
1 review
(acesounderglass.com)

Vegan Nutrition Testing Project: Interim Report

Elizabeth
Jan 20, 2023, 5:50 AM
102 points
37 comments
8 min read
LW link
(acesounderglass.com)

Large language models learn to represent the world

gjm
Jan 22, 2023, 1:10 PM
101 points
20 comments
3 min read
LW link
1 review

Concrete Reasons for Hope about AI

Zac Hatfield-Dodds
Jan 14, 2023, 1:22 AM
101 points
13 comments
1 min read
LW link

2022 was the year AGI arrived (Just don’t call it that)

Logan Zoellner
Jan 4, 2023, 3:19 PM
101 points
60 comments
3 min read
LW link

Parameter Scaling Comes for RL, Maybe

1a3orn
Jan 24, 2023, 1:55 PM
100 points
3 comments
14 min read
LW link

2022 Unofficial LessWrong General Census

Screwtape
Jan 30, 2023, 6:36 PM
97 points
33 comments
2 min read
LW link

Categorizing failures as “outer” or “inner” misalignment is often confused

Rohin Shah
Jan 6, 2023, 3:48 PM
93 points
21 comments
8 min read
LW link

Disentangling Shard Theory into Atomic Claims

Leon Lang
Jan 13, 2023, 4:23 AM
86 points
6 comments
18 min read
LW link

Review AI Alignment posts to help figure out how to make a proper AI Alignment review

Jan 10, 2023, 12:19 AM
85 points
31 comments
2 min read
LW link

“Endgame safety” for AGI

Steven Byrnes
Jan 24, 2023, 2:15 PM
85 points
10 comments
6 min read
LW link

Childhood Roundup #1

Zvi
Jan 6, 2023, 1:00 PM
84 points
27 comments
8 min read
LW link
(thezvi.wordpress.com)

The Alignment Problem from a Deep Learning Perspective (major rewrite)

Jan 10, 2023, 4:06 PM
84 points
8 comments
39 min read
LW link
(arxiv.org)

Book Review: Worlds of Flow

remember
Jan 16, 2023, 8:17 PM
83 points
3 comments
9 min read
LW link

Confusing the ideal for the necessary

adamShimi
Jan 16, 2023, 5:29 PM
79 points
6 comments
1 min read
LW link
(epistemologicalvigilance.substack.com)

On AI and Interest Rates

Zvi
Jan 17, 2023, 3:00 PM
79 points
13 comments
8 min read
LW link
(thezvi.wordpress.com)

Compounding Resource X

Raemon
Jan 11, 2023, 3:14 AM
77 points
6 comments
9 min read
LW link

Simulacra Levels Summary

Zvi
Jan 30, 2023, 1:40 PM
77 points
14 comments
7 min read
LW link
(thezvi.wordpress.com)

Against Boltzmann mesaoptimizers

porby
Jan 30, 2023, 2:55 AM
77 points
6 comments
4 min read
LW link

Spreading messages to help with the most important century

HoldenKarnofsky
Jan 25, 2023, 6:20 PM
75 points
4 comments
18 min read
LW link
(www.cold-takes.com)

Wentworth and Larsen on buying time

Jan 9, 2023, 9:31 PM
74 points
6 comments
12 min read
LW link

Some Thoughts on AI Art

abramdemski
Jan 25, 2023, 2:18 PM
74 points
20 comments
7 min read
LW link