Problems with predictive history classes

dkl9 · 20 Jul 2023 23:28 UTC
15 points
5 comments · 1 min read · LW link

Announcement: AI Narrations Available for All New LessWrong Posts

20 Jul 2023 22:17 UTC
71 points
28 comments · 1 min read · LW link

AI #21: The Cup Overfloweth

Zvi · 20 Jul 2023 21:30 UTC
47 points
4 comments · 64 min read · LW link
(thezvi.wordpress.com)

All AGI Safety questions welcome (especially basic ones) [July 2023]

smallsilo · 20 Jul 2023 20:20 UTC
38 points
40 comments · 2 min read · LW link
(forum.effectivealtruism.org)

Growth of Publicly Available Genetic Sequencing Data

jefftk · 20 Jul 2023 19:50 UTC
11 points
2 comments · 1 min read · LW link
(www.jefftk.com)

Progress links and tweets, 2023-07-20: “A goddess enthroned on a car”

jasoncrawford · 20 Jul 2023 18:28 UTC
12 points
4 comments · 2 min read · LW link
(rootsofprogress.org)

Boundary Placement Rebellion

tailcalled · 20 Jul 2023 17:40 UTC
54 points
21 comments · 12 min read · LW link

Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity

zhanpeng_zhou · 20 Jul 2023 17:38 UTC
22 points
13 comments · 3 min read · LW link
(openreview.net)

Even Superhuman Go AIs Have Surprising Failure Modes

20 Jul 2023 17:31 UTC
129 points
22 comments · 10 min read · LW link
(far.ai)

Paper digestion: “May We Have Your Attention Please? Human-Rights NGOs and the Problem of Global Communication”

Klara Helene Nielsen · 20 Jul 2023 17:08 UTC
4 points
1 comment · 2 min read · LW link
(journals.sagepub.com)

The (short) case for predicting what Aliens value

Jim Buhler · 20 Jul 2023 15:25 UTC
14 points
5 comments · 3 min read · LW link

Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla

20 Jul 2023 10:50 UTC
44 points
3 comments · 2 min read · LW link
(arxiv.org)

Speculative inferences about path dependence in LLM supervised fine-tuning from results on linear mode connectivity and model souping

RobertKirk · 20 Jul 2023 9:56 UTC
39 points
2 comments · 5 min read · LW link

A case for gamete personhood (reductio ad absurdum)

Ansyn1312 · 20 Jul 2023 8:25 UTC
−1 points
4 comments · 1 min read · LW link

Contra Contra the Social Model of Disability

DirectedEvolution · 20 Jul 2023 6:59 UTC
20 points
22 comments · 16 min read · LW link

[Question] Do you speed up capabilities when you do AI integrations and consume overhangs?

Michael Tontchev · 20 Jul 2023 6:40 UTC
6 points
1 comment · 1 min read · LW link

[Question] How necessary is intuition, for advanced math?

Nicholas / Heather Kross · 20 Jul 2023 0:18 UTC
11 points
8 comments · 1 min read · LW link

Project Lawful Audiobook: An Unofficial Fan Production with ElevenLabs AI

Askwho · 19 Jul 2023 23:34 UTC
22 points
3 comments · 1 min read · LW link
(askwhocastsai.substack.com)

Using predictors in corrigible systems

porby · 19 Jul 2023 22:29 UTC
19 points
6 comments · 27 min read · LW link

mental number lines

bhauth · 19 Jul 2023 21:01 UTC
10 points
5 comments · 1 min read · LW link

[Question] Any suggestions for an impactful master’s thesis in Political Science?

Klara Helene Nielsen · 19 Jul 2023 17:44 UTC
1 point
0 comments · 1 min read · LW link

Incident reporting for AI safety

Zach Stein-Perlman · 19 Jul 2023 17:00 UTC
22 points
0 comments · 1 min read · LW link

Alignment Grantmaking is Funding-Limited Right Now

johnswentworth · 19 Jul 2023 16:49 UTC
312 points
68 comments · 1 min read · LW link

Zener Science

Screwtape · 19 Jul 2023 16:40 UTC
16 points
11 comments · 6 min read · LW link

Tallinn, Estonia ACX Summer Meetup

Andrew · 19 Jul 2023 16:22 UTC
1 point
1 comment · 1 min read · LW link

Desiderata for an AI

Nathan Helm-Burger · 19 Jul 2023 16:18 UTC
9 points
0 comments · 4 min read · LW link

Valuism—an approach to life for you to consider

spencerg · 19 Jul 2023 15:23 UTC
17 points
2 comments · 1 min read · LW link

Hedonic Loops and Taming RL

beren · 19 Jul 2023 15:12 UTC
20 points
14 comments · 9 min read · LW link

[Question] What Caused the Puzzling Decline in Activism Against Policy Violence Towards Black People?

ChristianKl · 19 Jul 2023 14:40 UTC
12 points
2 comments · 1 min read · LW link

Lisa Feldman Barrett versus Paul Ekman on facial expressions & basic emotions

Steven Byrnes · 19 Jul 2023 14:26 UTC
30 points
15 comments · 15 min read · LW link

AISN#15: China and the US take action to regulate AI, results from a tournament forecasting AI risk, updates on xAI’s plan, and Meta releases its open-source and commercially available Llama 2

19 Jul 2023 13:01 UTC
16 points
0 comments · 6 min read · LW link
(newsletter.safe.ai)

Technological solutions to the climate crisis

dominicq · 19 Jul 2023 12:39 UTC
6 points
5 comments · 3 min read · LW link
(sundaystopwatch.eu)

Secret Cosmos: Introduction

Al Link · 19 Jul 2023 11:51 UTC
−35 points
3 comments · 14 min read · LW link
(allink.substack.com)

Critiques of prominent AI safety organizations: Introduction

Omega. · 19 Jul 2023 6:54 UTC
7 points
0 comments · 5 min read · LW link
(forum.effectivealtruism.org)

House Grocery Spending

jefftk · 19 Jul 2023 3:00 UTC
13 points
0 comments · 5 min read · LW link
(www.jefftk.com)

A brief history of computers

Adam Zerner · 19 Jul 2023 2:59 UTC
72 points
18 comments · 33 min read · LW link

Simple alignment plan that maybe works

Iknownothing · 18 Jul 2023 22:48 UTC
4 points
8 comments · 1 min read · LW link

Prospera-dump

tailcalled · 18 Jul 2023 21:36 UTC
10 points
16 comments · 1 min read · LW link

Tiny Mech Interp Projects: Emergent Positional Embeddings of Words

Neel Nanda · 18 Jul 2023 21:24 UTC
51 points
1 comment · 9 min read · LW link

Quick Thoughts on Language Models

RohanS · 18 Jul 2023 20:38 UTC
6 points
0 comments · 4 min read · LW link

Still no Lie Detector for LLMs

18 Jul 2023 19:56 UTC
47 points
2 comments · 21 min read · LW link

Meta announces Llama 2; “open sources” it for commercial use

LawrenceC · 18 Jul 2023 19:28 UTC
46 points
12 comments · 1 min read · LW link
(about.fb.com)

The Rope Management Theory: A Comprehensive Approach to Modulating Reward Perception and Mitigating Hedonic Adaptation

Eris Discordia · 18 Jul 2023 17:45 UTC
−23 points
2 comments · 3 min read · LW link

AI Impacts Quarterly Newsletter, Apr-Jun 2023

18 Jul 2023 17:14 UTC
6 points
0 comments · 3 min read · LW link
(blog.aiimpacts.org)

Clever arguers give weak evidence, not zero

dkl9 · 18 Jul 2023 17:07 UTC
7 points
2 comments · 1 min read · LW link
(dkl9.net)

Measuring and Improving the Faithfulness of Model-Generated Reasoning

18 Jul 2023 16:36 UTC
111 points
14 comments · 6 min read · LW link

[Question] Least-problematic Resource for learning RL?

Dalcy · 18 Jul 2023 16:30 UTC
9 points
7 comments · 1 min read · LW link

Charter Cities: why they’re exciting & how they might work

Jackson Wagner · 18 Jul 2023 13:57 UTC
20 points
7 comments · 1 min read · LW link

Narrative Theory. Part 6. Artificial Neural Networks

Eris · 18 Jul 2023 9:22 UTC
3 points
0 comments · 2 min read · LW link

Train for incorrigibility, then reverse it (Shutdown Problem Contest Submission)

Daniel_Eth · 18 Jul 2023 8:26 UTC
9 points
1 comment · 1 min read · LW link