Curriculum of Ascension

andrew sauer · 7 Nov 2024 23:54 UTC
13 points
0 comments · 18 min read · LW link

Analyzing how SAE features evolve across a forward pass

7 Nov 2024 22:07 UTC
47 points
0 comments · 1 min read · LW link
(arxiv.org)

Markets Are Information—Beating the Sportsbooks at Their Own Game

JJXW · 7 Nov 2024 20:58 UTC
9 points
1 comment · 2 min read · LW link
(thehobbyist.substack.com)

Signaling with Small Orange Diamonds

jefftk · 7 Nov 2024 20:20 UTC
39 points
1 comment · 1 min read · LW link
(www.jefftk.com)

Fundamental Uncertainty: Chapter 9 - How do we live with uncertainty?

Gordon Seidoh Worley · 7 Nov 2024 18:15 UTC
11 points
2 comments · 15 min read · LW link

AI #89: Trump Card

Zvi · 7 Nov 2024 16:30 UTC
42 points
12 comments · 42 min read · LW link
(thezvi.wordpress.com)

Quantum Immortality: A Perspective if AI Doomers are Probably Right

7 Nov 2024 16:06 UTC
10 points
55 comments · 14 min read · LW link

On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback

7 Nov 2024 15:39 UTC
50 points
7 comments · 11 min read · LW link

In the Name of All That Needs Saving

pleiotroth · 7 Nov 2024 15:26 UTC
18 points
2 comments · 22 min read · LW link

Agency overhang as a proxy for Sharp left turn

7 Nov 2024 12:14 UTC
6 points
0 comments · 5 min read · LW link

The Case Against Moral Realism

Zero Contradictions · 7 Nov 2024 10:14 UTC
−5 points
10 comments · 1 min read · LW link
(thewaywardaxolotl.blogspot.com)

[Question] What are the primary drivers that caused selection pressure for intelligence in humans?

Towards_Keeperhood · 7 Nov 2024 9:40 UTC
8 points
15 comments · 1 min read · LW link

The Logistics of Distribution of Meaning: Against Epistemic Bureaucratization

Sahil · 7 Nov 2024 5:27 UTC
27 points
1 comment · 12 min read · LW link

SAEs are highly dataset dependent: a case study on the refusal direction

7 Nov 2024 5:22 UTC
66 points
4 comments · 14 min read · LW link

Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

Thomas Kwa · 6 Nov 2024 23:01 UTC
115 points
35 comments · 1 min read · LW link

New Funding Category Open in Foresight’s AI Safety Grants

Allison Duettmann · 6 Nov 2024 22:59 UTC
15 points
0 comments · 1 min read · LW link

Scattered thoughts on what it means for an LLM to believe

TheManxLoiner · 6 Nov 2024 22:10 UTC
5 points
4 comments · 5 min read · LW link

The Bayesian Conspiracy Live Recording

Eneasz · 6 Nov 2024 16:25 UTC
9 points
0 comments · 1 min read · LW link

Anthropic: Three Sketches of ASL-4 Safety Case Components

Zach Stein-Perlman · 6 Nov 2024 16:00 UTC
95 points
33 comments · 1 min read · LW link
(alignment.anthropic.com)

Meme Talking Points

ymeskhout · 6 Nov 2024 15:27 UTC
34 points
0 comments · 3 min read · LW link

Advisors for Smaller Major Donors?

jefftk · 6 Nov 2024 14:30 UTC
18 points
2 comments · 3 min read · LW link
(www.jefftk.com)

Scissors Statements for President?

AnnaSalamon · 6 Nov 2024 10:38 UTC
118 points
32 comments · 1 min read · LW link

[Question] How to cite LessWrong as an academic source?

PhilosophicalSoul · 6 Nov 2024 8:28 UTC
6 points
6 comments · 1 min read · LW link

How to put California and Texas on the campaign trail!

Yair Halberstadt · 6 Nov 2024 6:08 UTC
25 points
4 comments · 1 min read · LW link

LDT (and everything else) can be irrational

Christopher King · 6 Nov 2024 4:05 UTC
11 points
7 comments · 2 min read · LW link

Join my new subscriber chat

sarahconstantin · 6 Nov 2024 2:30 UTC
7 points
0 comments · 1 min read · LW link
(sarahconstantin.substack.com)

Graceful Degradation

Screwtape · 5 Nov 2024 23:57 UTC
79 points
8 comments · 4 min read · LW link

An alternative approach to superbabies

Towards_Keeperhood · 5 Nov 2024 22:56 UTC
48 points
19 comments · 3 min read · LW link

Apply to be a mentor in SPAR!

agucova · 5 Nov 2024 21:32 UTC
5 points
0 comments · 1 min read · LW link

Going Beyond “immaturity”

moisentinel · 5 Nov 2024 20:51 UTC
−3 points
2 comments · 2 min read · LW link

Intent alignment as a stepping-stone to value alignment

Seth Herd · 5 Nov 2024 20:43 UTC
37 points
6 comments · 3 min read · LW link

Why Recursion Pharmaceuticals abandoned cell painting for brightfield imaging

Abhishaike Mahajan · 5 Nov 2024 14:51 UTC
29 points
1 comment · 18 min read · LW link
(www.owlposting.com)

Winning isn’t enough

5 Nov 2024 11:37 UTC
38 points
18 comments · 9 min read · LW link

Anthropic—The case for targeted regulation

anaguma · 5 Nov 2024 7:07 UTC
11 points
0 comments · 2 min read · LW link
(www.anthropic.com)

The Shallow Bench

Karl Faulks · 5 Nov 2024 5:07 UTC
48 points
5 comments · 3 min read · LW link

Using Narrative Prompting to Extract Policy Forecasts from LLMs

Max Ghenis · 5 Nov 2024 4:37 UTC
5 points
0 comments · 1 min read · LW link

ML4Good (AI Safety Bootcamp) - Experience report

JanEbbing · 5 Nov 2024 1:18 UTC
13 points
0 comments · 3 min read · LW link

Catastrophic Cyber Capabilities Benchmark (3CB): Robustly Evaluating LLM Agent Cyber Offense Capabilities

5 Nov 2024 1:01 UTC
8 points
0 comments · 6 min read · LW link
(www.apartresearch.com)

[Question] Could orcas be (trained to be) smarter than humans?

Towards_Keeperhood · 4 Nov 2024 23:29 UTC
58 points
22 comments · 1 min read · LW link

Metastatic Cancer Treatment Since 2010: The Success Stories

sarahconstantin · 4 Nov 2024 22:50 UTC
51 points
2 comments · 6 min read · LW link
(sarahconstantin.substack.com)

Bay Winter Solstice 2024: Speech Auditions

ozymandias · 4 Nov 2024 22:31 UTC
32 points
1 comment · 1 min read · LW link

Empathy/Systemizing Quotient is a poor/biased model for the autism/sex link

tailcalled · 4 Nov 2024 21:11 UTC
35 points
0 comments · 7 min read · LW link

Distributed espionage

margetmagenta · 4 Nov 2024 19:43 UTC
3 points
0 comments · 1 min read · LW link

We can survive

Oxidize · 4 Nov 2024 19:33 UTC
−13 points
7 comments · 2 min read · LW link

GPT-8 may not be ASI

rvzlxax409 · 4 Nov 2024 19:31 UTC
−2 points
1 comment · 3 min read · LW link

AI timelines don’t account for base rate of tech progress

rvzlxax409 · 4 Nov 2024 19:31 UTC
−10 points
2 comments · 1 min read · LW link

Update on the Mysterious Trump Buyers on Polymarket

Annapurna · 4 Nov 2024 19:22 UTC
19 points
9 comments · 1 min read · LW link
(jorgevelez.substack.com)

[Intuitive self-models] 8. Rooting Out Free Will Intuitions

Steven Byrnes · 4 Nov 2024 18:16 UTC
70 points
16 comments · 24 min read · LW link

Option control

Joe Carlsmith · 4 Nov 2024 17:54 UTC
28 points
0 comments · 54 min read · LW link

[Question] Noticing the World

EvolutionByDesign · 4 Nov 2024 16:41 UTC
4 points
1 comment · 1 min read · LW link