Scaling prediction markets with meta-markets

Dentosal · 10 Oct 2024 21:17 UTC
1 point
0 comments · 2 min read · LW link

Startup Success Rates Are So Low Because the Rewards Are So Large

AppliedDivinityStudies · 10 Oct 2024 20:22 UTC
42 points
6 comments · 2 min read · LW link

Can AI Outpredict Humans? Results From Metaculus’s Q3 AI Forecasting Benchmark

ChristianWilliams · 10 Oct 2024 18:58 UTC
50 points
2 comments · 1 min read · LW link
(www.metaculus.com)

Rationality Quotes—Fall 2024

Screwtape · 10 Oct 2024 18:37 UTC
75 points
22 comments · 1 min read · LW link

[Question] why won’t this alignment plan work?

KvmanThinking · 10 Oct 2024 15:44 UTC
6 points
7 comments · 1 min read · LW link

AI #85: AI Wins the Nobel Prize

Zvi · 10 Oct 2024 13:40 UTC
30 points
6 comments · 31 min read · LW link
(thezvi.wordpress.com)

Behavioral red-teaming is unlikely to produce clear, strong evidence that models aren’t scheming

Buck · 10 Oct 2024 13:36 UTC
100 points
4 comments · 13 min read · LW link

Joshua Achiam Public Statement Analysis

Zvi · 10 Oct 2024 12:50 UTC
73 points
14 comments · 21 min read · LW link
(thezvi.wordpress.com)

Do you want to do a debate on youtube? I’m looking for polite, truth-seeking participants.

Nathan Young · 10 Oct 2024 9:32 UTC
12 points
0 comments · 1 min read · LW link

Rationalist Gnosticism

tailcalled · 10 Oct 2024 9:06 UTC
9 points
10 comments · 3 min read · LW link

The deepest atheist: Sam Altman

Trey Edwin · 10 Oct 2024 3:27 UTC
14 points
2 comments · 4 min read · LW link

Values Are Real Like Harry Potter

9 Oct 2024 23:42 UTC
81 points
17 comments · 5 min read · LW link

Momentum of Light in Glass

Ben · 9 Oct 2024 20:19 UTC
141 points
44 comments · 11 min read · LW link

vgillioz’s Shortform

vgillioz · 9 Oct 2024 19:31 UTC
1 point
2 comments · 1 min read · LW link

Hamiltonian Dynamics in AI: A Novel Approach to Optimizing Reasoning in Language Models

Javier Marin Valenzuela · 9 Oct 2024 19:14 UTC
3 points
0 comments · 10 min read · LW link

Triangulating My Interpretation of Methods: Black Boxes by Marco J. Nathan

adamShimi · 9 Oct 2024 19:13 UTC
8 points
0 comments · 6 min read · LW link
(formethods.substack.com)

Scaffolding for “Noticing Metacognition”

Raemon · 9 Oct 2024 17:54 UTC
78 points
4 comments · 17 min read · LW link

Safe Predictive Agents with Joint Scoring Rules

Rubi J. Hudson · 9 Oct 2024 16:38 UTC
55 points
10 comments · 17 min read · LW link

Demis Hassabis and Geoffrey Hinton Awarded Nobel Prizes

Anna Gajdova · 9 Oct 2024 12:56 UTC
47 points
14 comments · 1 min read · LW link

Humans are (mostly) metarational

Yair Halberstadt · 9 Oct 2024 5:51 UTC
14 points
6 comments · 3 min read · LW link

[Job Ad] MATS is hiring!

9 Oct 2024 2:17 UTC
10 points
0 comments · 5 min read · LW link

Palisade is hiring: Exec Assistant, Content Lead, Ops Lead, and Policy Lead

Charlie Rogers-Smith · 9 Oct 2024 0:04 UTC
11 points
0 comments · 4 min read · LW link

AGI & Consciousness—Joscha Bach

Rahul Chand · 8 Oct 2024 22:51 UTC
1 point
0 comments · 10 min read · LW link

Video and transcript of presentation on Otherness and control in the age of AGI

Joe Carlsmith · 8 Oct 2024 22:30 UTC
35 points
1 comment · 27 min read · LW link

From seeded complexity to consciousness—yes, it’s all the same.

eschatail · 8 Oct 2024 21:31 UTC
−23 points
0 comments · 2 min read · LW link

Limits of safe and aligned AI

Shivam · 8 Oct 2024 21:30 UTC
2 points
0 comments · 4 min read · LW link

[Question] What constitutes an infohazard?

K1r4d4rk.v1 · 8 Oct 2024 21:29 UTC
−4 points
8 comments · 1 min read · LW link

[Question] What makes one a “rationalist”?

mathyouf · 8 Oct 2024 20:25 UTC
7 points
5 comments · 3 min read · LW link

[Intuitive self-models] 4. Trance

Steven Byrnes · 8 Oct 2024 13:30 UTC
63 points
7 comments · 24 min read · LW link

Schelling game evaluations for AI control

Olli Järviniemi · 8 Oct 2024 12:01 UTC
65 points
5 comments · 11 min read · LW link

Thinking About a Pedalboard

jefftk · 8 Oct 2024 11:50 UTC
9 points
2 comments · 1 min read · LW link
(www.jefftk.com)

Overview of strong human intelligence amplification methods

TsviBT · 8 Oct 2024 8:37 UTC
264 points
141 comments · 10 min read · LW link

Near-death experiences

Declan Molony · 8 Oct 2024 6:34 UTC
3 points
1 comment · 3 min read · LW link

The unreasonable effectiveness of plasmid sequencing as a service

Abhishaike Mahajan · 8 Oct 2024 2:02 UTC
23 points
2 comments · 13 min read · LW link
(www.owlposting.com)

There is a globe in your LLM

jacob_drori · 8 Oct 2024 0:43 UTC
86 points
4 comments · 1 min read · LW link

MATS AI Safety Strategy Curriculum v2

7 Oct 2024 22:44 UTC
42 points
6 comments · 13 min read · LW link

2025 Color Trends

sarahconstantin · 7 Oct 2024 21:20 UTC
40 points
7 comments · 6 min read · LW link
(sarahconstantin.substack.com)

Clarifying Alignment Fundamentals Through the Lens of Ontology

eternal/ephemera · 7 Oct 2024 20:57 UTC
12 points
4 comments · 24 min read · LW link

Ethics on Cosmic Scale, Outer Space Treaty, Directed Panspermia, Forwards-Contamination, Technology Assessment, Planetary Protection, and Fermi’s Paradox

MrFantastic · 7 Oct 2024 20:56 UTC
−12 points
0 comments · 1 min read · LW link

Domain-specific SAEs

jacob_drori · 7 Oct 2024 20:15 UTC
27 points
0 comments · 5 min read · LW link

Metaculus Is Open Source

ChristianWilliams · 7 Oct 2024 19:55 UTC
13 points
0 comments · 1 min read · LW link
(www.metaculus.com)

Research update: Towards a Law of Iterated Expectations for Heuristic Estimators

Eric Neyman · 7 Oct 2024 19:29 UTC
87 points
2 comments · 22 min read · LW link

AI Model Registries: A Foundational Tool for AI Governance

7 Oct 2024 19:27 UTC
20 points
1 comment · 4 min read · LW link
(www.convergenceanalysis.org)

Evaluating the truth of statements in a world of ambiguous language.

Hastings · 7 Oct 2024 18:08 UTC
48 points
19 comments · 2 min read · LW link

Advice for journalists

Nathan Young · 7 Oct 2024 16:46 UTC
100 points
53 comments · 9 min read · LW link
(nathanpmyoung.substack.com)

Time Efficient Resistance Training

romeostevensit · 7 Oct 2024 15:15 UTC
42 points
8 comments · 3 min read · LW link

A Narrow Path: a plan to deal with AI extinction risk

7 Oct 2024 13:02 UTC
73 points
11 comments · 2 min read · LW link
(www.narrowpath.co)

Toy Models of Feature Absorption in SAEs

7 Oct 2024 9:56 UTC
46 points
7 comments · 10 min read · LW link

An argument that consequentialism is incomplete

cousin_it · 7 Oct 2024 9:45 UTC
32 points
27 comments · 1 min read · LW link

An X-Ray is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation

7 Oct 2024 8:53 UTC
38 points
0 comments · 5 min read · LW link
(arxiv.org)