Koan: divining alien datastructures from RAM activations

TsviBT · 5 Apr 2024 18:04 UTC
42 points
10 comments · 21 min read · LW link

On the 2nd CWT with Jonathan Haidt

Zvi · 5 Apr 2024 17:30 UTC
27 points
3 comments · 33 min read · LW link
(thezvi.wordpress.com)

End-to-end hacking with language models

tchauvin · 5 Apr 2024 15:06 UTC
29 points
0 comments · 8 min read · LW link

Partial value takeover without world takeover

KatjaGrace · 5 Apr 2024 6:20 UTC
89 points
23 comments · 3 min read · LW link
(worldspiritsockpuppet.com)

On Complexity Science

Garrett Baker · 5 Apr 2024 2:24 UTC
50 points
19 comments · 4 min read · LW link

Using game theory to elect a centrist in the 2024 US Presidential Election

Ebenezer Dukakis · 5 Apr 2024 0:46 UTC
−3 points
0 comments · 8 min read · LW link

New report: A review of the empirical evidence for existential risk from AI via misaligned power-seeking

4 Apr 2024 23:41 UTC
31 points
5 comments · 1 min read · LW link
(blog.aiimpacts.org)

Quick evidence review of bulking & cutting

jp · 4 Apr 2024 21:43 UTC
31 points
5 comments · 4 min read · LW link

LLMs for Alignment Research: a safety priority?

abramdemski · 4 Apr 2024 20:03 UTC
145 points
24 comments · 11 min read · LW link

On Leif Wenar’s Absurdly Unconvincing Critique Of Effective Altruism

omnizoid · 4 Apr 2024 19:01 UTC
8 points
2 comments · 14 min read · LW link

Run evals on base models too!

orthonormal · 4 Apr 2024 18:43 UTC
47 points
6 comments · 1 min read · LW link

Let’s Fund: Impact of our $1M crowdfunded grant to the Center for Clean Energy Innovation

Hauke Hillebrandt · 4 Apr 2024 16:28 UTC
5 points
0 comments · 1 min read · LW link
(lets-fund.org)

The Buckling World Hypothesis—Visualising Vulnerable Worlds

Rosco-Hunter · 4 Apr 2024 15:51 UTC
−5 points
2 comments · 4 min read · LW link

Can AI Transform the Electorate into a Citizen’s Assembly?

Rosco-Hunter · 4 Apr 2024 15:45 UTC
−6 points
0 comments · 4 min read · LW link

AI Discrimination Requirements: A Regulatory Review

4 Apr 2024 15:43 UTC
7 points
0 comments · 6 min read · LW link

Trying to Do More Good

jefftk · 4 Apr 2024 14:20 UTC
18 points
0 comments · 12 min read · LW link
(www.jefftk.com)

Language and Capabilities: Testing LLM Mathematical Abilities Across Languages

Ethan Edwards · 4 Apr 2024 13:18 UTC
24 points
2 comments · 36 min read · LW link

AI #58: Stargate AGI

Zvi · 4 Apr 2024 13:10 UTC
49 points
9 comments · 60 min read · LW link
(thezvi.wordpress.com)

Cult of equilibrium

Templarrr · 4 Apr 2024 9:19 UTC
11 points
2 comments · 1 min read · LW link

[Question] Should you refuse this bet in Technicolor Sleeping Beauty?

Ape in the coat · 4 Apr 2024 8:55 UTC
16 points
15 comments · 1 min read · LW link

[Question] What’s with all the bans recently?

Gerald Monroe · 4 Apr 2024 6:16 UTC
65 points
83 comments · 4 min read · LW link

Best in Class Life Improvement

sapphire · 4 Apr 2024 1:51 UTC
68 points
20 comments · 16 min read · LW link

[Question] What is the purpose and application of AI Debate?

VojtaKovarik · 4 Apr 2024 0:38 UTC
13 points
9 comments · 1 min read · LW link

Concrete empirical research projects in mechanistic anomaly detection

3 Apr 2024 23:07 UTC
43 points
3 comments · 10 min read · LW link

A gentle introduction to mechanistic anomaly detection

Erik Jenner · 3 Apr 2024 23:06 UTC
71 points
2 comments · 11 min read · LW link

$250K in Prizes: SafeBench Competition Announcement

ozhang · 3 Apr 2024 22:07 UTC
26 points
0 comments · 1 min read · LW link

The Case for Predictive Models

Rubi J. Hudson · 3 Apr 2024 18:22 UTC
43 points
7 comments · 8 min read · LW link

Book Review (mini): Co-Intelligence by Ethan Mollick

Darren McKee · 3 Apr 2024 17:33 UTC
4 points
0 comments · 1 min read · LW link

Sparsify: A mechanistic interpretability research agenda

Lee Sharkey · 3 Apr 2024 12:34 UTC
94 points
22 comments · 22 min read · LW link

Just because 2 things are opposites, doesn’t mean they’re just the same but flipped

Alok Singh · 3 Apr 2024 8:59 UTC
20 points
18 comments · 2 min read · LW link
(alok.github.io)

Falling fertility explanations and Israel

Yair Halberstadt · 3 Apr 2024 3:27 UTC
31 points
4 comments · 2 min read · LW link

Nature is an infinite sphere whose center is everywhere and circumference is nowhere

Alok Singh · 3 Apr 2024 2:24 UTC
11 points
2 comments · 3 min read · LW link

The Rationalist Haggadot Collection

maia · 2 Apr 2024 20:02 UTC
22 points
0 comments · 1 min read · LW link
(tigrennatenn.neocities.org)

[Question] How Often Does ¬Correlation ⇏ ¬Causation?

niplav · 2 Apr 2024 17:58 UTC
19 points
17 comments · 2 min read · LW link

[EA xpost] The Rationale-Shaped Hole At The Heart Of Forecasting

dschwarz · 2 Apr 2024 17:40 UTC
23 points
2 comments · 2 min read · LW link
(forum.effectivealtruism.org)

Religion = Cult + Culture

Eneasz · 2 Apr 2024 16:44 UTC
17 points
9 comments · 4 min read · LW link
(deathisbad.substack.com)

BIDA Election Thoughts

jefftk · 2 Apr 2024 15:30 UTC
9 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Fertility Roundup #3

Zvi · 2 Apr 2024 14:50 UTC
21 points
11 comments · 31 min read · LW link
(thezvi.wordpress.com)

What can we learn about childrearing from J. S. Mill?

Adam Scherlis · 2 Apr 2024 6:06 UTC
9 points
1 comment · 1 min read · LW link

OMMC Announces RIP

1 Apr 2024 23:20 UTC
189 points
5 comments · 2 min read · LW link

Coherence of Caches and Agents

johnswentworth · 1 Apr 2024 23:04 UTC
76 points
9 comments · 11 min read · LW link

LessWrong: After Dark, a new side of LessWrong

So8res · 1 Apr 2024 22:44 UTC
35 points
5 comments · 1 min read · LW link

Gradient Descent on the Human Brain

1 Apr 2024 22:39 UTC
52 points
5 comments · 2 min read · LW link

[Question] Do I count as e/acc for exclusion purposes?

denyeverywhere · 1 Apr 2024 21:18 UTC
0 points
31 comments · 1 min read · LW link

Self Explaining Neural Networks, the interpretability technique no one seems to be talking about.

f3mi · 1 Apr 2024 20:52 UTC
5 points
0 comments · 4 min read · LW link

Death with Awesomeness

osmarks · 1 Apr 2024 20:24 UTC
3 points
2 comments · 2 min read · LW link

[GPT-4] On the Gradual Emergence of Mechanized Intellect: A Treatise from the Year 1924

tailcalled · 1 Apr 2024 19:14 UTC
11 points
0 comments · 2 min read · LW link

Notes on Dwarkesh Patel’s Podcast with Sholto Douglas and Trenton Bricken

Zvi · 1 Apr 2024 19:10 UTC
41 points
1 comment · 16 min read · LW link
(thezvi.wordpress.com)

So You Created a Sociopath—New Book Announcement!

Garrett Baker · 1 Apr 2024 18:02 UTC
52 points
3 comments · 1 min read · LW link

Announcing Suffering For Good

Garrett Baker · 1 Apr 2024 17:08 UTC
72 points
5 comments · 1 min read · LW link