SolidGoldMagikarp (plus, prompt generation)

5 Feb 2023 22:02 UTC
665 points
204 comments, 12 min read, LW link

The Waluigi Effect (mega-post)

Cleo Nardo, 3 Mar 2023 3:22 UTC
618 points
188 comments, 16 min read, LW link

The Talk: a brief explanation of sexual dimorphism

Malmesbury, 18 Sep 2023 16:23 UTC
482 points
72 comments, 16 min read, LW link

How much do you believe your results?

Eric Neyman, 6 May 2023 20:31 UTC
459 points
14 comments, 15 min read, LW link
(ericneyman.wordpress.com)

Steering GPT-2-XL by adding an activation vector

13 May 2023 18:42 UTC
423 points
97 comments, 50 min read, LW link

The ants and the grasshopper

Richard_Ngo, 4 Jun 2023 22:00 UTC
417 points
35 comments, 5 min read, LW link
(www.narrativeark.xyz)

Focus on the places where you feel shocked everyone’s dropping the ball

So8res, 2 Feb 2023 0:27 UTC
413 points
61 comments, 4 min read, LW link

Douglas Hofstadter changes his mind on Deep Learning & AI risk (June 2023)?

gwern, 3 Jul 2023 0:48 UTC
411 points
54 comments, 7 min read, LW link
(www.youtube.com)

Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible

12 Dec 2023 18:14 UTC
411 points
162 comments, 33 min read, LW link

Bing Chat is blatantly, aggressively misaligned

evhub, 15 Feb 2023 5:29 UTC
396 points
170 comments, 2 min read, LW link

Things I Learned by Spending Five Thousand Hours In Non-EA Charities

jenn, 1 Jun 2023 20:48 UTC
387 points
34 comments, 8 min read, LW link
(jenn.site)

GPTs are Predictors, not Imitators

Eliezer Yudkowsky, 8 Apr 2023 19:59 UTC
376 points
90 comments, 3 min read, LW link

Statement on AI Extinction - Signed by AGI Labs, Top Academics, and Many Other Notable Figures

Dan H, 30 May 2023 9:05 UTC
372 points
77 comments, 1 min read, LW link
(www.safe.ai)

Noting an error in Inadequate Equilibria

Matthew Barnett, 8 Feb 2023 1:33 UTC
359 points
56 comments, 2 min read, LW link

My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”

Quintin Pope, 21 Mar 2023 0:06 UTC
356 points
225 comments, 39 min read, LW link

How it feels to have your mind hacked by an AI

blaked, 12 Jan 2023 0:33 UTC
355 points
219 comments, 17 min read, LW link

How to have Polygenically Screened Children

GeneSmith, 7 May 2023 16:01 UTC
344 points
108 comments, 27 min read, LW link

Shutting Down the Lightcone Offices

14 Mar 2023 22:47 UTC
337 points
93 comments, 17 min read, LW link

Please don’t throw your mind away

TsviBT, 15 Feb 2023 21:41 UTC
337 points
44 comments, 18 min read, LW link

Cyborgism

10 Feb 2023 14:47 UTC
333 points
46 comments, 35 min read, LW link

Inside Views, Impostor Syndrome, and the Great LARP

johnswentworth, 25 Sep 2023 16:08 UTC
326 points
53 comments, 5 min read, LW link

Childhoods of exceptional people

Henrik Karlsson, 6 Feb 2023 17:27 UTC
325 points
62 comments, 15 min read, LW link
(escapingflatland.substack.com)

Sharing Information About Nonlinear

Ben Pace, 7 Sep 2023 6:51 UTC
322 points
323 comments, 34 min read, LW link

EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem

Elizabeth, 28 Sep 2023 23:30 UTC
319 points
247 comments, 22 min read, LW link
(acesounderglass.com)

Against Almost Every Theory of Impact of Interpretability

Charbel-Raphaël, 17 Aug 2023 18:44 UTC
315 points
83 comments, 26 min read, LW link

Understanding and controlling a maze-solving policy network

11 Mar 2023 18:59 UTC
312 points
22 comments, 23 min read, LW link

Shallow review of live agendas in alignment & safety

27 Nov 2023 11:10 UTC
311 points
69 comments, 29 min read, LW link

On not getting contaminated by the wrong obesity ideas

Natália, 28 Jan 2023 20:18 UTC
308 points
67 comments, 30 min read, LW link

Alignment Grantmaking is Funding-Limited Right Now

johnswentworth, 19 Jul 2023 16:49 UTC
307 points
67 comments, 1 min read, LW link

Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research

8 Aug 2023 1:30 UTC
306 points
26 comments, 18 min read, LW link

Fucking Goddamn Basics of Rationalist Discourse

LoganStrohl, 4 Feb 2023 1:47 UTC
301 points
97 comments, 1 min read, LW link

Book Review: How Minds Change

bc4026bd4aaa5b7fe, 25 May 2023 17:55 UTC
298 points
52 comments, 15 min read, LW link

Pausing AI Developments Isn’t Enough. We Need to Shut it All Down by Eliezer Yudkowsky

jacquesthibs, 29 Mar 2023 23:16 UTC
298 points
296 comments, 3 min read, LW link
(time.com)

LW Team is adjusting moderation policy

Raemon, 4 Apr 2023 20:41 UTC
296 points
182 comments, 3 min read, LW link

When do “brains beat brawn” in Chess? An experiment

titotal, 28 Jun 2023 13:33 UTC
293 points
79 comments, 7 min read, LW link
(titotal.substack.com)

Speaking to Congressional staffers about AI risk

4 Dec 2023 23:08 UTC
289 points
23 comments, 16 min read, LW link

The Parable of the King and the Random Process

moridinamael, 1 Mar 2023 22:18 UTC
288 points
22 comments, 6 min read, LW link

Predictable updating about AI risk

Joe Carlsmith, 8 May 2023 21:53 UTC
288 points
23 comments, 36 min read, LW link

Towards Monosemanticity: Decomposing Language Models With Dictionary Learning

Zac Hatfield-Dodds, 5 Oct 2023 21:01 UTC
286 points
21 comments, 2 min read, LW link
(transformer-circuits.pub)

Social Dark Matter

[DEACTIVATED] Duncan Sabien, 16 Nov 2023 20:00 UTC
284 points
112 comments, 34 min read, LW link

Hooray for stepping out of the limelight

So8res, 1 Apr 2023 2:45 UTC
281 points
24 comments, 1 min read, LW link

OpenAI: The Battle of the Board

Zvi, 22 Nov 2023 17:30 UTC
277 points
82 comments, 11 min read, LW link
(thezvi.wordpress.com)

My May 2023 priorities for AI x-safety: more empathy, more unification of concerns, and less vilification of OpenAI

Andrew_Critch, 24 May 2023 0:02 UTC
272 points
39 comments, 8 min read, LW link

Guide to rationalist interior decorating

mingyuan, 19 Jun 2023 6:47 UTC
272 points
45 comments, 12 min read, LW link

Notes on Teaching in Prison

jsd, 19 Apr 2023 1:53 UTC
270 points
12 comments, 12 min read, LW link

The Base Rate Times, news through prediction markets

vandemonian, 6 Jun 2023 17:42 UTC
268 points
40 comments, 4 min read, LW link

We don’t trade with ants

KatjaGrace, 10 Jan 2023 23:50 UTC
265 points
109 comments, 7 min read, LW link
(worldspiritsockpuppet.com)

OpenAI: Facts from a Weekend

Zvi, 20 Nov 2023 15:30 UTC
264 points
158 comments, 9 min read, LW link
(thezvi.wordpress.com)

Accidentally Load Bearing

jefftk, 13 Jul 2023 16:10 UTC
264 points
14 comments, 1 min read, LW link
(www.jefftk.com)

The 6D effect: When companies take risks, one email can be very powerful.

scasper, 4 Nov 2023 20:08 UTC
261 points
40 comments, 3 min read, LW link