Making a conservative case for alignment

15 Nov 2024 18:55 UTC
201 points
68 comments · 7 min read · LW link

The Case For Giving To The Shrimp Welfare Project

omnizoid 15 Nov 2024 16:03 UTC
−6 points
14 comments · 7 min read · LW link

Win/continue/lose scenarios and execute/replace/audit protocols

Buck 15 Nov 2024 15:47 UTC
54 points
2 comments · 7 min read · LW link

Antonym Heads Predict Semantic Opposites in Language Models

Jake Ward 15 Nov 2024 15:32 UTC
3 points
0 comments · 5 min read · LW link

Proposing the Conditional AI Safety Treaty (linkpost TIME)

otto.barten 15 Nov 2024 13:59 UTC
10 points
8 comments · 3 min read · LW link
(time.com)

A Theory of Equilibrium in the Offense-Defense Balance

Maxwell Tabarrok 15 Nov 2024 13:51 UTC
25 points
6 comments · 2 min read · LW link
(www.maximum-progress.com)

Boston Secular Solstice 2024: Call for Singers and Musicans

jefftk 15 Nov 2024 13:50 UTC
22 points
0 comments · 1 min read · LW link
(www.jefftk.com)

An Uncanny Moat

Adam Newgas 15 Nov 2024 11:39 UTC
8 points
0 comments · 4 min read · LW link
(www.boristhebrave.com)

[Question] What are some positive developments in AI safety in 2024?

Satron 15 Nov 2024 10:32 UTC
10 points
5 comments · 1 min read · LW link

If I care about measure, choices have additional burden (+AI generated LW-comments)

avturchin 15 Nov 2024 10:27 UTC
5 points
11 comments · 2 min read · LW link

What are Emotions?

Myles H 15 Nov 2024 4:20 UTC
4 points
13 comments · 8 min read · LW link

The Third Fundamental Question

Screwtape 15 Nov 2024 4:01 UTC
66 points
7 comments · 6 min read · LW link

Dance Differentiation

jefftk 15 Nov 2024 2:30 UTC
14 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Breaking beliefs about saving the world

Oxidize 15 Nov 2024 0:46 UTC
2 points
3 comments · 9 min read · LW link

College technical AI safety hackathon retrospective - Georgia Tech

yix 15 Nov 2024 0:22 UTC
39 points
2 comments · 5 min read · LW link
(open.substack.com)

Gwern Branwen interview on Dwarkesh Patel’s podcast: “How an Anonymous Researcher Predicted AI’s Trajectory”

Said Achmiz 14 Nov 2024 23:53 UTC
80 points
0 comments · 1 min read · LW link
(www.dwarkeshpatel.com)

Internal music player: phenomenology of earworms

dkl9 14 Nov 2024 23:29 UTC
6 points
4 comments · 2 min read · LW link
(dkl9.net)

The Foraging (Ex-)Bandit [Ruleset & Reflections]

abstractapplic 14 Nov 2024 20:16 UTC
27 points
3 comments · 2 min read · LW link

Seven lessons I didn’t learn from election day

Eric Neyman 14 Nov 2024 18:39 UTC
97 points
33 comments · 13 min read · LW link
(ericneyman.wordpress.com)

Effects of Non-Uniform Sparsity on Superposition in Toy Models

Shreyans Jain 14 Nov 2024 16:59 UTC
4 points
3 comments · 6 min read · LW link

AI #90: The Wall

Zvi 14 Nov 2024 14:10 UTC
32 points
6 comments · 42 min read · LW link
(thezvi.wordpress.com)

Evolutionary prompt optimization for SAE feature visualization

14 Nov 2024 13:06 UTC
16 points
0 comments · 9 min read · LW link

AXRP Episode 38.0 - Zhijing Jin on LLMs, Causality, and Multi-Agent Systems

DanielFilan 14 Nov 2024 7:00 UTC
14 points
0 comments · 12 min read · LW link

FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI

Tamay 14 Nov 2024 6:13 UTC
39 points
0 comments · 3 min read · LW link
(epoch.ai)

Concrete Methods for Heuristic Estimation on Neural Networks

Oliver Daniels 14 Nov 2024 5:07 UTC
28 points
0 comments · 27 min read · LW link

Heresies in the Shadow of the Sequences

Cole Wyeth 14 Nov 2024 5:01 UTC
17 points
12 comments · 2 min read · LW link

literally Hitler

David Gross 14 Nov 2024 3:20 UTC
−13 points
0 comments · 4 min read · LW link

Thoughts after the Wolfram and Yudkowsky discussion

Tahp 14 Nov 2024 1:43 UTC
25 points
13 comments · 6 min read · LW link

[Question] Why would ASI share any resources with us?

Satron 13 Nov 2024 23:38 UTC
6 points
8 comments · 1 min read · LW link

Neutrality

sarahconstantin 13 Nov 2024 23:10 UTC
158 points
27 comments · 11 min read · LW link
(sarahconstantin.substack.com)

Anvil Problems

Screwtape 13 Nov 2024 22:57 UTC
89 points
13 comments · 3 min read · LW link

[Question] Using hex to get murder advice from GPT-4o

Laurence Freeman 13 Nov 2024 18:30 UTC
10 points
5 comments · 6 min read · LW link

Confronting the legion of doom.

Spiritus Dei 13 Nov 2024 17:03 UTC
−18 points
2 comments · 5 min read · LW link

Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever’s Recent Claims

garrison 13 Nov 2024 17:00 UTC
84 points
14 comments · 1 min read · LW link
(garrisonlovely.substack.com)

MIT FutureTech are hiring a Product and Data Visualization Designer

peterslattery 13 Nov 2024 14:48 UTC
2 points
0 comments · 4 min read · LW link

Sparks of Consciousness

Charlie Sanders 13 Nov 2024 4:58 UTC
2 points
0 comments · 3 min read · LW link
(www.dailymicrofiction.com)

Contra Musician Gender II

jefftk 13 Nov 2024 3:30 UTC
9 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Flipping Out: The Cosmic Coinflip Thought Experiment Is Bad Philosophy

Joe Rogero 12 Nov 2024 23:55 UTC
34 points
17 comments · 4 min read · LW link

Incentive design and capability elicitation

Joe Carlsmith 12 Nov 2024 20:56 UTC
31 points
0 comments · 12 min read · LW link

The Humanitarian Economy

kylefurlong 12 Nov 2024 18:25 UTC
−7 points
14 comments · 6 min read · LW link

Current Attitudes Toward AI Provide Little Data Relevant to Attitudes Toward AGI

Seth Herd 12 Nov 2024 18:23 UTC
16 points
2 comments · 4 min read · LW link

Basics of Handling Disagreements with People

Camille Berger 12 Nov 2024 17:55 UTC
34 points
4 comments · 6 min read · LW link

Registrations Open for 2024 NYC Secular Solstice & Megameetup

12 Nov 2024 17:50 UTC
13 points
0 comments · 1 min read · LW link

2024 NYC Secular Solstice & Megameetup

12 Nov 2024 17:46 UTC
18 points
0 comments · 1 min read · LW link

2025 Q1 Pivotal Research Fellowship (Technical & Policy)

12 Nov 2024 10:56 UTC
6 points
0 comments · 2 min read · LW link

Theories With Mentalistic Atoms Are As Validly Called Theories As Theories With Only Non-Mentalistic Atoms

Lorec 12 Nov 2024 6:45 UTC
5 points
5 comments · 8 min read · LW link

The lying p value

kqr 12 Nov 2024 6:12 UTC
13 points
7 comments · 1 min read · LW link
(entropicthoughts.com)

Modeling AI-driven occupational change over the next 10 years and beyond

2120eth 12 Nov 2024 4:58 UTC
1 point
0 comments · 2 min read · LW link

How to Live Well: My Philosophy of Life

Philosofer123 12 Nov 2024 4:05 UTC
−5 points
2 comments · 1 min read · LW link

The Packaging and the Payload

Screwtape 12 Nov 2024 3:07 UTC
76 points
1 comment · 5 min read · LW link