How Your Physiology Affects the Mind’s Projection Fallacy

YanLyutnev 14 Dec 2024 21:10 UTC
0 points
0 comments, 6 min read, LW link

Introducing the Evidence Color Wheel

Larry Lee 14 Dec 2024 16:08 UTC
6 points
0 comments, 3 min read, LW link

An Illustrated Summary of “Robust Agents Learn Causal World Model”

Dalcy 14 Dec 2024 15:02 UTC
63 points
2 comments, 10 min read, LW link

Best-of-N Jailbreaking

14 Dec 2024 4:58 UTC
78 points
5 comments, 2 min read, LW link
(arxiv.org)

D&D.Sci Dungeonbuilding: the Dungeon Tournament

aphyer 14 Dec 2024 4:30 UTC
49 points
16 comments, 3 min read, LW link

Creating Interpretable Latent Spaces with Gradient Routing

Jacob G-W 14 Dec 2024 4:00 UTC
26 points
6 comments, 2 min read, LW link
(jacobgw.com)

Probability of death by suicide by a 26 year old

John Wiseman 14 Dec 2024 3:33 UTC
−25 points
4 comments, 1 min read, LW link

Matryoshka Sparse Autoencoders

Noa Nabeshima 14 Dec 2024 2:52 UTC
91 points
15 comments, 11 min read, LW link

[Question] What is MIRI currently doing?

Roko 14 Dec 2024 2:39 UTC
32 points
14 comments, 1 min read, LW link

The o1 System Card Is Not About o1

Zvi 13 Dec 2024 20:30 UTC
116 points
5 comments, 16 min read, LW link
(thezvi.wordpress.com)

Arch-anarchy and The Fable of the Dragon-Tyrant

Peter lawless 13 Dec 2024 20:15 UTC
−10 points
0 comments, 1 min read, LW link

Communications in Hard Mode (My new job at MIRI)

tanagrabeast 13 Dec 2024 20:13 UTC
202 points
25 comments, 5 min read, LW link

First Thoughts on Detachmentism

Jacob Peterson 13 Dec 2024 1:19 UTC
−11 points
5 comments, 9 min read, LW link

How to Build Heaven: A Constrained Boltzmann Brain Generator

High Tides 13 Dec 2024 1:04 UTC
−8 points
3 comments, 5 min read, LW link

Representing Irrationality in Game Theory

Larry Lee 13 Dec 2024 0:50 UTC
−1 points
3 comments, 11 min read, LW link

“Charity” as a conflationary alliance term

Jan_Kulveit 12 Dec 2024 21:49 UTC
34 points
2 comments, 5 min read, LW link

Just one more exposure bro

Chipmonk 12 Dec 2024 21:37 UTC
51 points
6 comments, 2 min read, LW link
(chrislakin.blog)

The Dangers of Mirrored Life

12 Dec 2024 20:58 UTC
119 points
7 comments, 29 min read, LW link
(www.asimov.press)

Effective Networking as Sending Hard to Fake Signals

vaishnav92 12 Dec 2024 20:32 UTC
25 points
2 comments, 7 min read, LW link
(www.optimaloutliers.com)

Mini PAPR Review

jefftk 12 Dec 2024 19:10 UTC
10 points
0 comments, 2 min read, LW link
(www.jefftk.com)

Biological risk from the mirror world

jasoncrawford 12 Dec 2024 19:07 UTC
333 points
37 comments, 7 min read, LW link
(newsletter.rootsofprogress.org)

Naturalistic dualism

Arturo Macias 12 Dec 2024 16:19 UTC
−4 points
0 comments, 4 min read, LW link

AI #94: Not Now, Google

Zvi 12 Dec 2024 15:40 UTC
49 points
3 comments, 64 min read, LW link
(thezvi.wordpress.com)

Consciousness, Intelligence, and AI – Some Quick Notes [call it a mini-ramble]

Bill Benzon 12 Dec 2024 15:04 UTC
−3 points
0 comments, 4 min read, LW link

The Dissolution of AI Safety

Roko 12 Dec 2024 10:34 UTC
8 points
44 comments, 1 min read, LW link
(www.transhumanaxiology.com)

Is Optimization Correct?

Yoshinori Okamoto 12 Dec 2024 10:27 UTC
−9 points
0 comments, 2 min read, LW link

AXRP Episode 38.3 - Erik Jenner on Learned Look-Ahead

DanielFilan 12 Dec 2024 5:40 UTC
20 points
0 comments, 16 min read, LW link

Public computers can make addictive tools safe

dkl9 11 Dec 2024 19:55 UTC
23 points
0 comments, 1 min read, LW link
(dkl9.net)

Solving Newcomb’s Paradox In Real Life

Alice Wanderland 11 Dec 2024 19:48 UTC
3 points
0 comments, 1 min read, LW link
(open.substack.com)

The “Think It Faster” Exercise

Raemon 11 Dec 2024 19:14 UTC
142 points
35 comments, 13 min read, LW link

Forecast With GiveWell

ChristianWilliams 11 Dec 2024 17:52 UTC
11 points
0 comments, 1 min read, LW link
(www.metaculus.com)

A shortcoming of concrete demonstrations as AGI risk advocacy

Steven Byrnes 11 Dec 2024 16:48 UTC
103 points
27 comments, 2 min read, LW link

Why Isn’t Tesla Level 3?

jefftk 11 Dec 2024 14:50 UTC
22 points
7 comments, 2 min read, LW link
(www.jefftk.com)

Investing in Robust Safety Mechanisms is critical for reducing Systemic Risks

11 Dec 2024 13:37 UTC
4 points
3 comments, 2 min read, LW link

Post-Quantum Investing: Dump Crypto for Index Funds and Real Estate?

G 11 Dec 2024 11:59 UTC
8 points
5 comments, 1 min read, LW link

Low-effort review of “AI For Humanity”

Charlie Steiner 11 Dec 2024 9:54 UTC
13 points
0 comments, 4 min read, LW link

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders

11 Dec 2024 6:30 UTC
82 points
6 comments, 2 min read, LW link
(www.neuronpedia.org)

Zombies! Substance Dualist Zombies?

Ape in the coat 11 Dec 2024 6:10 UTC
15 points
7 comments, 6 min read, LW link

My thoughts on correlation and causation

Victor Porton 11 Dec 2024 5:08 UTC
−13 points
3 comments, 1 min read, LW link

Why empiricists should believe in AI risk

Knight Lee 11 Dec 2024 3:51 UTC
5 points
0 comments, 1 min read, LW link

[Question] fake alignment solutions????

KvmanThinking 11 Dec 2024 3:31 UTC
1 point
6 comments, 1 min read, LW link

Second-Time Free

jefftk 11 Dec 2024 3:30 UTC
24 points
4 comments, 1 min read, LW link
(www.jefftk.com)

Frontier AI systems have surpassed the self-replicating red line

aproteinengine 11 Dec 2024 3:06 UTC
9 points
4 comments, 1 min read, LW link
(github.com)

The Technist Reformation: A Discussion with o1 About The Coming Economic Event Horizon

Yuli_Ban 11 Dec 2024 2:34 UTC
5 points
2 comments, 17 min read, LW link

LessWrong audio: help us choose the new voice

11 Dec 2024 2:24 UTC
23 points
1 comment, 1 min read, LW link

Apply to attend a Global Challenges Project workshop in 2025!

LiamE 11 Dec 2024 0:41 UTC
6 points
0 comments, 2 min read, LW link
(forum.effectivealtruism.org)

The MVO and The MVP

kwang 10 Dec 2024 23:17 UTC
0 points
0 comments, 7 min read, LW link
(kevw.substack.com)

What is Confidence—in Game Theory and Life?

James Stephen Brown 10 Dec 2024 23:06 UTC
3 points
0 comments, 8 min read, LW link
(nonzerosum.games)

Computational functionalism probably can’t explain phenomenal consciousness

EuanMcLean 10 Dec 2024 17:11 UTC
17 points
36 comments, 12 min read, LW link

o1 Turns Pro

Zvi 10 Dec 2024 17:00 UTC
59 points
3 comments, 14 min read, LW link
(thezvi.wordpress.com)