Graceful Degradation

Screwtape
5 Nov 2024 23:57 UTC
79 points
8 comments
4 min read
LW link

An alternative approach to superbabies

Towards_Keeperhood
5 Nov 2024 22:56 UTC
48 points
19 comments
3 min read
LW link

Apply to be a mentor in SPAR!

agucova
5 Nov 2024 21:32 UTC
5 points
0 comments
1 min read
LW link

Going Beyond “immaturity”

moisentinel
5 Nov 2024 20:51 UTC
−3 points
2 comments
2 min read
LW link

Intent alignment as a stepping-stone to value alignment

Seth Herd
5 Nov 2024 20:43 UTC
35 points
4 comments
3 min read
LW link

Why Recursion Pharmaceuticals abandoned cell painting for brightfield imaging

Abhishaike Mahajan
5 Nov 2024 14:51 UTC
29 points
1 comment
18 min read
LW link
(www.owlposting.com)

Winning isn’t enough

5 Nov 2024 11:37 UTC
28 points
14 comments
9 min read
LW link

An­thropic—The case for tar­geted regulation

anaguma
5 Nov 2024 7:07 UTC
11 points
0 comments
2 min read
LW link
(www.anthropic.com)

The Shallow Bench

Karl Faulks
5 Nov 2024 5:07 UTC
46 points
5 comments
3 min read
LW link

Using Narrative Prompting to Extract Policy Forecasts from LLMs

Max Ghenis
5 Nov 2024 4:37 UTC
5 points
0 comments
1 min read
LW link

ML4Good (AI Safety Bootcamp) - Experience report

JanEbbing
5 Nov 2024 1:18 UTC
12 points
0 comments
3 min read
LW link

Catastrophic Cyber Capabilities Benchmark (3CB): Robustly Evaluating LLM Agent Cyber Offense Capabilities

5 Nov 2024 1:01 UTC
8 points
0 comments
6 min read
LW link
(www.apartresearch.com)

[Question] Could orcas be (trained to be) smarter than humans?

Towards_Keeperhood
4 Nov 2024 23:29 UTC
58 points
20 comments
1 min read
LW link

Metastatic Cancer Treatment Since 2010: The Success Stories

sarahconstantin
4 Nov 2024 22:50 UTC
50 points
2 comments
6 min read
LW link
(sarahconstantin.substack.com)

Bay Winter Solstice 2024: Speech Auditions

ozymandias
4 Nov 2024 22:31 UTC
32 points
1 comment
1 min read
LW link

Empathy/Systemizing Quotient is a poor/biased model for the autism/sex link

tailcalled
4 Nov 2024 21:11 UTC
33 points
0 comments
7 min read
LW link

Distributed espionage

margetmagenta
4 Nov 2024 19:43 UTC
3 points
0 comments
1 min read
LW link

We can survive

Oxidize
4 Nov 2024 19:33 UTC
−13 points
7 comments
2 min read
LW link

GPT-8 may not be ASI

rvzlxax409
4 Nov 2024 19:31 UTC
−2 points
1 comment
3 min read
LW link

AI timelines don’t account for base rate of tech progress

rvzlxax409
4 Nov 2024 19:31 UTC
−10 points
2 comments
1 min read
LW link

Update on the Mysterious Trump Buyers on Polymarket

Annapurna
4 Nov 2024 19:22 UTC
19 points
9 comments
1 min read
LW link
(jorgevelez.substack.com)

[Intuitive self-models] 8. Rooting Out Free Will Intuitions

Steven Byrnes
4 Nov 2024 18:16 UTC
64 points
16 comments
24 min read
LW link

Option control

Joe Carlsmith
4 Nov 2024 17:54 UTC
28 points
0 comments
54 min read
LW link

[Question] Noticing the World

EvolutionByDesign
4 Nov 2024 16:41 UTC
4 points
1 comment
1 min read
LW link

The current state of RSPs

Zach Stein-Perlman
4 Nov 2024 16:00 UTC
23 points
2 comments
9 min read
LW link

[Question] Does the “ancient wisdom” argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?

SpectrumDT
4 Nov 2024 15:20 UTC
18 points
49 comments
1 min read
LW link

A brief history of the automated corporation

owencb
4 Nov 2024 14:35 UTC
26 points
1 comment
5 min read
LW link
(strangecities.substack.com)

Abstractions are not Natural

Alfred Harwood
4 Nov 2024 11:10 UTC
25 points
21 comments
11 min read
LW link

[Linkpost] Building Altruistic and Moral AI Agent with Brain-inspired Affective Empathy Mechanisms

Gunnar_Zarncke
4 Nov 2024 10:15 UTC
13 points
0 comments
1 min read
LW link
(arxiv.org)

Context-dependent consequentialism

4 Nov 2024 9:29 UTC
31 points
6 comments
27 min read
LW link

Survival without dignity

L Rudolf L
4 Nov 2024 2:29 UTC
347 points
29 comments
15 min read
LW link
(nosetgauge.substack.com)

Drug development costs can range over two orders of magnitude

rossry
3 Nov 2024 23:13 UTC
38 points
0 comments
11 min read
LW link

Redefining Tolerance: Beyond Popper’s Paradox

mindprison
3 Nov 2024 22:23 UTC
−1 points
0 comments
3 min read
LW link

Goal: Understand Intelligence

Johannes C. Mayer
3 Nov 2024 21:20 UTC
13 points
19 comments
1 min read
LW link

Current safety training techniques do not fully transfer to the agent setting

3 Nov 2024 19:24 UTC
156 points
8 comments
5 min read
LW link

Why our politicians aren’t Median

Yair Halberstadt
3 Nov 2024 14:03 UTC
60 points
15 comments
3 min read
LW link

Human Biodiversity (Part 4: Astral Codex Ten)

Evan_Gaensbauer
3 Nov 2024 4:20 UTC
−13 points
6 comments
1 min read
LW link
(reflectivealtruism.com)

Understanding incomparability versus incommensurability in relation to RLHF

artemiocobb
2 Nov 2024 22:57 UTC
1 point
1 comment
2 min read
LW link

electric turbofans

bhauth
2 Nov 2024 22:50 UTC
61 points
2 comments
5 min read
LW link
(bhauth.com)

Reality as Category-Theoretic State Machines: A Mathematical Framework

Wenitte Apiou
2 Nov 2024 21:04 UTC
−8 points
0 comments
2 min read
LW link

The Median Researcher Problem

johnswentworth
2 Nov 2024 20:16 UTC
162 points
71 comments
1 min read
LW link

Testing “True” Language Understanding in LLMs: A Simple Proposal

MtryaSam
2 Nov 2024 19:12 UTC
9 points
2 comments
2 min read
LW link

Testing “True” Language Understanding in LLMs: A Simple Proposal

MtryaSam
2 Nov 2024 19:12 UTC
−3 points
0 comments
2 min read
LW link

[Question] Feedback request: what am I missing?

Nathan Helm-Burger
2 Nov 2024 17:38 UTC
35 points
5 comments
1 min read
LW link

Fragile, Robust, and Antifragile Preference Satisfaction

adamShimi
2 Nov 2024 17:25 UTC
19 points
0 comments
5 min read
LW link
(formethods.substack.com)

Higher Order Signs, Hallucination and Schizophrenia

Nicolas Villarreal
2 Nov 2024 16:33 UTC
3 points
0 comments
13 min read
LW link
(nicolasdvillarreal.substack.com)

[Question] Is OpenAI net negative for AI Safety?

Lysandre Terrisse
2 Nov 2024 16:18 UTC
4 points
0 comments
1 min read
LW link

Two arguments against longtermist thought experiments

momom2
2 Nov 2024 10:22 UTC
15 points
5 comments
3 min read
LW link

Both-Side­sism—When Fair & Balanced Goes Wrong

James Stephen Brown
2 Nov 2024 3:04 UTC
3 points
15 comments
6 min read
LW link
(nonzerosum.games)

What can we learn from insecure domains?

Logan Zoellner
1 Nov 2024 23:53 UTC
14 points
21 comments
1 min read
LW link