Broken Latents: Studying SAEs and Feature Co-occurrence in Toy Models

30 Dec 2024 22:50 UTC
21 points
3 comments 15 min read LW link

Genetically edited mosquitoes haven’t scaled yet. Why?

alexey 30 Dec 2024 21:37 UTC
24 points
0 comments 1 min read LW link
(eryney.substack.com)

Linkpost: Look at the Water

J Bostock 30 Dec 2024 19:49 UTC
4 points
3 comments 4 min read LW link

The low Information Density of Eliezer Yudkowsky & LessWrong

Felix Olszewski 30 Dec 2024 19:43 UTC
14 points
8 comments 1 min read LW link

o3, Oh My

Zvi 30 Dec 2024 14:10 UTC
60 points
17 comments 36 min read LW link
(thezvi.wordpress.com)

World models I’m currently building

samuelshadrach 30 Dec 2024 8:26 UTC
1 point
0 comments 16 min read LW link
(samuelshadrach.com)

Is “VNM-agent” one of several options, for what minds can grow up into?

AnnaSalamon 30 Dec 2024 6:36 UTC
88 points
54 comments 2 min read LW link

Why I’m Moving from Mechanistic to Prosaic Interpretability

Daniel Tan 30 Dec 2024 6:35 UTC
111 points
34 comments 5 min read LW link

When do experts think human-level AI will be created?

30 Dec 2024 6:20 UTC
10 points
0 comments 2 min read LW link
(aisafety.info)

2025 Prediction Thread

habryka 30 Dec 2024 1:50 UTC
77 points
18 comments 1 min read LW link

The Great OpenAI Debate: Should It Stay ‘Open’ or Go Private?

Satya 30 Dec 2024 1:14 UTC
−1 points
0 comments 3 min read LW link

Learn to write well BEFORE you have something worth saying

eukaryote 29 Dec 2024 23:42 UTC
67 points
18 comments 3 min read LW link
(eukaryotewritesblog.com)

Teaching Claude to Meditate

Gordon Seidoh Worley 29 Dec 2024 22:27 UTC
−7 points
4 comments 23 min read LW link

Action: how do you REALLY go about doing?

DDthinker 29 Dec 2024 22:00 UTC
−7 points
1 comment 4 min read LW link

Began a pay-on-results coaching experiment, made $40,300 since July

Chipmonk 29 Dec 2024 21:12 UTC
43 points
14 comments 1 min read LW link
(chrislakin.blog)

Corrigibility should be an AI’s Only Goal

PeterMcCluskey 29 Dec 2024 20:25 UTC
9 points
1 comment 8 min read LW link
(bayesianinvestor.com)

Making LLMs safer is more intuitive than you think: How Common Sense and Diversity Improve AI Alignment

Jeba Sania 29 Dec 2024 19:27 UTC
−5 points
1 comment 6 min read LW link

[Question] Could my work, “Beyond HaHa” benefit the LessWrong community?

P. João 29 Dec 2024 16:14 UTC
9 points
6 comments 1 min read LW link

Book Summary: Zero to One

bilalchughtai 29 Dec 2024 16:13 UTC
27 points
2 comments 8 min read LW link

Boston Solstice 2024 Retrospective

jefftk 29 Dec 2024 15:40 UTC
15 points
0 comments 4 min read LW link
(www.jefftk.com)

Some arguments against a land value tax

Matthew Barnett 29 Dec 2024 15:17 UTC
83 points
39 comments 15 min read LW link

Predictions of Near-Term Societal Changes Due to Artificial Intelligence

Annapurna 29 Dec 2024 14:53 UTC
10 points
0 comments 6 min read LW link
(jorgevelez.substack.com)

Considerations on orca intelligence

Towards_Keeperhood 29 Dec 2024 14:35 UTC
48 points
5 comments 9 min read LW link

AI Alignment, and where we stand.

afeller08 29 Dec 2024 14:08 UTC
−17 points
0 comments 2 min read LW link

The Legacy of Computer Science

Johannes C. Mayer 29 Dec 2024 13:15 UTC
17 points
0 comments 1 min read LW link
(groups.csail.mit.edu)

Shallow review of technical AI safety, 2024

29 Dec 2024 12:01 UTC
174 points
33 comments 41 min read LW link

Dishbrain and implications.

RussellThor 29 Dec 2024 10:42 UTC
4 points
0 comments 2 min read LW link

Notes on Altruism

David Gross 29 Dec 2024 3:13 UTC
17 points
2 comments 34 min read LW link

Rejecting Anthropomorphic Bias: Addressing Fears of AGI and Transformation

Gedankensprünge 29 Dec 2024 1:48 UTC
−17 points
1 comment 3 min read LW link

What happens next?

Logan Zoellner 29 Dec 2024 1:41 UTC
40 points
19 comments 2 min read LW link

The Misconception of AGI as an Existential Threat: A Reassessment

Gedankensprünge 29 Dec 2024 1:39 UTC
−25 points
0 comments 2 min read LW link

Does Claude Prioritize Some Prompt Input Channels Over Others?

keltan 29 Dec 2024 1:21 UTC
9 points
2 comments 5 min read LW link

Impact in AI Safety Now Requires Specific Strategic Insight

MiloSal 29 Dec 2024 0:40 UTC
27 points
1 comment 6 min read LW link
(ameliorology.substack.com)

Morality Is Still Demanding

utilistrutil 29 Dec 2024 0:33 UTC
−8 points
2 comments 1 min read LW link

Emergence and Amplification of Survival

jgraves01 28 Dec 2024 23:52 UTC
−1 points
0 comments 3 min read LW link

[Question] Has Someone Checked The Cold-Water-In-Left-Ear Thing?

Maloew 28 Dec 2024 20:15 UTC
9 points
0 comments 1 min read LW link

By default, capital will matter more than ever after AGI

L Rudolf L 28 Dec 2024 17:52 UTC
267 points
99 comments 16 min read LW link
(nosetgauge.substack.com)

AI Assistants Should Have a Direct Line to Their Developers

Jan_Kulveit 28 Dec 2024 17:01 UTC
55 points
6 comments 2 min read LW link

No, the Polymarket price does not mean we can immediately conclude what the probability of a bird flu pandemic is. We also need to know the interest rate!

Christopher King 28 Dec 2024 16:05 UTC
5 points
8 comments 1 min read LW link

The average rationalist IQ is about 122

Rockenots 28 Dec 2024 15:42 UTC
22 points
23 comments 1 min read LW link

Why OpenAI’s Structure Must Evolve To Advance Our Mission

stuhlmueller 28 Dec 2024 4:24 UTC
19 points
1 comment 1 min read LW link
(openai.com)

The Engineering Argument Fallacy: Why Technological Success Doesn’t Validate Physics

Wenitte Apiou 28 Dec 2024 0:49 UTC
−16 points
5 comments 2 min read LW link

The Robot, the Puppet-master, and the Psychohistorian

WillPetillo 28 Dec 2024 0:12 UTC
7 points
2 comments 3 min read LW link

Progress links and short notes, 2024-12-27: Clinical trial abundance, grid-scale fusion, permitting vs. compliance, crossword mania, and more

jasoncrawford 27 Dec 2024 23:34 UTC
11 points
0 comments 2 min read LW link
(newsletter.rootsofprogress.org)

Greedy-Advantage-Aware RLHF

sej2020 27 Dec 2024 19:47 UTC
48 points
15 comments 13 min read LW link

Deconstructing arguments against AI art

DMMF 27 Dec 2024 19:40 UTC
7 points
5 comments 5 min read LW link
(danfrank.ca)

From the Archives: a story

Richard_Ngo 27 Dec 2024 16:36 UTC
18 points
1 comment 16 min read LW link
(www.narrativeark.xyz)

[Question] What’s the best metric for measuring quality of life?

ChristianKl 27 Dec 2024 14:29 UTC
10 points
5 comments 1 min read LW link

Review: Planecrash

L Rudolf L 27 Dec 2024 14:18 UTC
325 points
41 comments 21 min read LW link
(nosetgauge.substack.com)

Good Fortune and Many Worlds

Jonah Wilberg 27 Dec 2024 13:21 UTC
4 points
0 comments 5 min read LW link