The Deep Lore of LightHaven, with Oliver Habryka (TBC episode 228)

24 Dec 2024 22:45 UTC
45 points
4 comments · 91 min read · LW link
(thebayesianconspiracy.substack.com)

Acknowledging Background Information with P(Q|I)

JenniferRM · 24 Dec 2024 18:50 UTC
29 points
8 comments · 14 min read · LW link

Game Theory and Behavioral Economics in The Stock Market

Jaiveer Singh · 24 Dec 2024 18:15 UTC
1 point
0 comments · 3 min read · LW link

[Question] What are the main arguments against AGI?

Edy Nastase · 24 Dec 2024 15:49 UTC
1 point
6 comments · 1 min read · LW link

[Question] Recommendations on communities that discuss AI applications in society

Annapurna · 24 Dec 2024 13:37 UTC
7 points
2 comments · 1 min read · LW link

AIs Will Increasingly Fake Alignment

Zvi · 24 Dec 2024 13:00 UTC
89 points
0 comments · 52 min read · LW link
(thezvi.wordpress.com)

Apply to the 2025 PIBBSS Summer Research Fellowship

24 Dec 2024 10:25 UTC
15 points
0 comments · 2 min read · LW link

Human-AI Complementarity: A Goal for Amplified Oversight

rishubjain · 24 Dec 2024 9:57 UTC
21 points
1 comment · 1 min read · LW link
(deepmindsafetyresearch.medium.com)

Preliminary Thoughts on Flirting Theory

la .alis. · 24 Dec 2024 7:37 UTC
12 points
6 comments · 3 min read · LW link

[Question] Why is neuron count of human brain relevant to AI timelines?

xpostah · 24 Dec 2024 5:15 UTC
6 points
7 comments · 1 min read · LW link

How Much to Give is a Pragmatic Question

jefftk · 24 Dec 2024 4:20 UTC
12 points
1 comment · 2 min read · LW link
(www.jefftk.com)

Do you need a better map of your myriad of maps to the territory?

CstineSublime · 24 Dec 2024 2:00 UTC
11 points
2 comments · 5 min read · LW link

Panology

JenniferRM · 23 Dec 2024 21:40 UTC
11 points
8 comments · 5 min read · LW link

Aristotle, Aquinas, and the Evolution of Teleology: From Purpose to Meaning.

Spiritus Dei · 23 Dec 2024 19:37 UTC
−7 points
0 comments · 6 min read · LW link

People aren’t properly calibrated on FrontierMath

cakubilo · 23 Dec 2024 19:35 UTC
30 points
4 comments · 3 min read · LW link

Near- and medium-term AI Control Safety Cases

Martín Soto · 23 Dec 2024 17:37 UTC
9 points
0 comments · 6 min read · LW link

[Rationality Malaysia] 2024 year-end meetup!

Doris Liew · 23 Dec 2024 16:02 UTC
1 point
0 comments · 1 min read · LW link

Printable book of some rationalist creative writing (from Scott A. & Eliezer)

CounterBlunder · 23 Dec 2024 15:44 UTC
5 points
0 comments · 1 min read · LW link

Monthly Roundup #25: December 2024

Zvi · 23 Dec 2024 14:20 UTC
18 points
3 comments · 26 min read · LW link
(thezvi.wordpress.com)

Exploring the petertodd / Leilan duality in GPT-2 and GPT-J

mwatkins · 23 Dec 2024 13:17 UTC
10 points
0 comments · 17 min read · LW link

[Question] What are the strongest arguments for very short timelines?

Kaj_Sotala · 23 Dec 2024 9:38 UTC
94 points
73 comments · 1 min read · LW link

Reduce AI Self-Allegiance by saying “he” instead of “I”

Knight Lee · 23 Dec 2024 9:32 UTC
6 points
4 comments · 2 min read · LW link

Funding Case: AI Safety Camp 11

23 Dec 2024 8:51 UTC
23 points
0 comments · 6 min read · LW link
(manifund.org)

What is compute governance?

Vishakha · 23 Dec 2024 6:32 UTC
6 points
0 comments · 2 min read · LW link
(aisafety.info)

Stop Making Sense

JenniferRM · 23 Dec 2024 5:16 UTC
15 points
0 comments · 3 min read · LW link

Hire (or Become) a Thinking Assistant

Raemon · 23 Dec 2024 3:58 UTC
119 points
42 comments · 8 min read · LW link

Non-Obvious Benefits of Insurance

jefftk · 23 Dec 2024 3:40 UTC
21 points
5 comments · 2 min read · LW link
(www.jefftk.com)

Vision of a positive Singularity

RussellThor · 23 Dec 2024 2:19 UTC
4 points
0 comments · 4 min read · LW link

Ideologies are slow and necessary, for now

Gabriel Alfour · 23 Dec 2024 1:57 UTC
9 points
1 comment · 1 min read · LW link
(cognition.cafe)

Propaganda Is Everywhere—LLM Models Are No Exception

Yanling Guo · 23 Dec 2024 1:39 UTC
−13 points
0 comments · 3 min read · LW link

[Question] Has Anthropic checked if Claude fakes alignment for intended values too?

Maloew · 23 Dec 2024 0:43 UTC
4 points
1 comment · 1 min read · LW link

Vegans need to eat just enough Meat—empirically evaluate the minimum amount of meat that maximizes utility

Johannes C. Mayer · 22 Dec 2024 22:08 UTC
55 points
34 comments · 3 min read · LW link

We are in a New Paradigm of AI Progress—OpenAI’s o3 model makes huge gains on the toughest AI benchmarks in the world

garrison · 22 Dec 2024 21:45 UTC
17 points
3 comments · 1 min read · LW link
(garrisonlovely.substack.com)

My AI timelines

xpostah · 22 Dec 2024 21:06 UTC
12 points
2 comments · 5 min read · LW link
(samuelshadrach.com)

A breakdown of AI capability levels focused on AI R&D labor acceleration

ryan_greenblatt · 22 Dec 2024 20:56 UTC
92 points
5 comments · 6 min read · LW link

How I saved 1 human life (in expectation) without overthinking it

Christopher King · 22 Dec 2024 20:53 UTC
14 points
0 comments · 4 min read · LW link

Towards mutually assured cooperation

mikko · 22 Dec 2024 20:46 UTC
4 points
0 comments · 2 min read · LW link

Checking in on Scott’s composition image bet with imagen 3

Dave Orr · 22 Dec 2024 19:04 UTC
61 points
0 comments · 1 min read · LW link

Woloch & Wosatan

JackOfAllTrades · 22 Dec 2024 15:46 UTC
−11 points
0 comments · 2 min read · LW link

A primer on machine learning in cryo-electron microscopy (cryo-EM)

Abhishaike Mahajan · 22 Dec 2024 15:11 UTC
17 points
0 comments · 25 min read · LW link
(www.owlposting.com)

Notes from Copenhagen Secular Solstice 2024

Søren Elverlin · 22 Dec 2024 15:08 UTC
9 points
0 comments · 3 min read · LW link

Proof Explained for “Robust Agents Learn Causal World Model”

Dalcy · 22 Dec 2024 15:06 UTC
18 points
0 comments · 15 min read · LW link

subfunctional overlaps in attentional selection history implies momentum for decision-trajectories

Emrik · 22 Dec 2024 14:12 UTC
19 points
1 comment · 2 min read · LW link

It looks like there are some good funding opportunities in AI safety right now

Benjamin_Todd · 22 Dec 2024 12:41 UTC
20 points
0 comments · 4 min read · LW link
(benjamintodd.substack.com)

What o3 Becomes by 2028

Vladimir_Nesov · 22 Dec 2024 12:37 UTC
123 points
15 comments · 5 min read · LW link

The Alignment Simulator

Yair Halberstadt · 22 Dec 2024 11:45 UTC
24 points
3 comments · 2 min read · LW link
(yairhalberstadt.github.io)

Theoretical Alignment’s Second Chance

lunatic_at_large · 22 Dec 2024 5:03 UTC
19 points
0 comments · 2 min read · LW link

Orienting to 3 year AGI timelines

Nikola Jurkovic · 22 Dec 2024 1:15 UTC
218 points
37 comments · 8 min read · LW link

ARC-AGI is a genuine AGI test but o3 cheated :(

Knight Lee · 22 Dec 2024 0:58 UTC
0 points
2 comments · 2 min read · LW link

When AI 10x’s AI R&D, What Do We Do?

Logan Riggs · 21 Dec 2024 23:56 UTC
70 points
14 comments · 4 min read · LW link