Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded

garrison 23 Oct 2024 23:40 UTC
118 points
1 comment 7 min read LW link
(garrisonlovely.substack.com)

A metaphor: what “green lights” for AGI would look like

Lorec 23 Oct 2024 23:24 UTC
−1 points
6 comments 2 min read LW link

Motte-and-Bailey: a Short Explanation

Lorec 23 Oct 2024 22:29 UTC
12 points
0 comments 1 min read LW link

Self-prediction acts as an emergent regularizer

23 Oct 2024 22:27 UTC
84 points
4 comments 4 min read LW link

Technical Risks of (Lethal) Autonomous Weapons Systems

Heramb 23 Oct 2024 20:41 UTC
2 points
0 comments 1 min read LW link
(encodejustice.org)

Appealing to the Public

jefftk 23 Oct 2024 19:00 UTC
16 points
0 comments 5 min read LW link
(www.jefftk.com)

Introducing Transluce — A Letter from the Founders

jsteinhardt 23 Oct 2024 18:10 UTC
74 points
2 comments 3 min read LW link
(bounded-regret.ghost.io)

Are we dropping the ball on Recommendation AIs?

Charbel-Raphaël 23 Oct 2024 17:48 UTC
41 points
17 comments 6 min read LW link

A bird’s eye view of ARC’s research

Jacob_Hilton 23 Oct 2024 15:50 UTC
119 points
12 comments 7 min read LW link
(www.alignment.org)

[Question] Artificial V/S Organoid Intelligence

10xyz 23 Oct 2024 14:31 UTC
5 points
0 comments 1 min read LW link

AI safety tax dynamics

owencb 23 Oct 2024 12:18 UTC
22 points
0 comments 6 min read LW link
(strangecities.substack.com)

What is malevolence? On the nature, measurement, and distribution of dark traits

23 Oct 2024 8:41 UTC
76 points
15 comments 1 min read LW link

Join a LessWrong Team for the Unaging System Challenge

Crissman 23 Oct 2024 6:01 UTC
15 points
5 comments 1 min read LW link

Word Spaghetti

Gordon Seidoh Worley 23 Oct 2024 5:39 UTC
18 points
9 comments 3 min read LW link

Monosemanticity & Quantization

Rahul Chand 22 Oct 2024 22:57 UTC
1 point
0 comments 9 min read LW link

[Question] What is the alpha in one bit of evidence?

J Bostock 22 Oct 2024 21:57 UTC
20 points
13 comments 1 min read LW link

Catastrophic sabotage as a major threat model for human-level AI systems

evhub 22 Oct 2024 20:57 UTC
91 points
11 comments 15 min read LW link

Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)

Elizabeth 22 Oct 2024 18:20 UTC
75 points
79 comments 1 min read LW link
(acesounderglass.com)

Decision-Making Under Uncertainty: Lessons From AI

Jonasb 22 Oct 2024 17:54 UTC
−1 points
0 comments 5 min read LW link
(www.denominations.io)

Testing Genetic Engineering Detection with Spike-Ins

jefftk 22 Oct 2024 17:20 UTC
9 points
0 comments 1 min read LW link
(naobservatory.org)

Predictions as Public Works Project — What Metaculus Is Building Next

ChristianWilliams 22 Oct 2024 16:35 UTC
4 points
0 comments 1 min read LW link
(www.metaculus.com)

Gorges of gender on a terrain of traits

dkl9 22 Oct 2024 16:18 UTC
−7 points
1 comment 3 min read LW link
(dkl9.net)

A Defense of Peer Review

22 Oct 2024 16:16 UTC
23 points
1 comment 22 min read LW link
(www.asimov.press)

BIG-Bench Canary Contamination in GPT-4

Jozdien 22 Oct 2024 15:40 UTC
123 points
13 comments 4 min read LW link

[Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF

Leon Lang 22 Oct 2024 13:57 UTC
50 points
1 comment 18 min read LW link
(arxiv.org)

[Intuitive self-models] 6. Awakening / Enlightenment / PNSE

Steven Byrnes 22 Oct 2024 13:23 UTC
62 points
8 comments 21 min read LW link

Resolving von Neumann-Morgenstern Inconsistent Preferences

niplav 22 Oct 2024 11:45 UTC
31 points
5 comments 58 min read LW link

Lenses of Control

WillPetillo 22 Oct 2024 7:51 UTC
14 points
0 comments 9 min read LW link

A Brief Explanation of AI Control

Aaron_Scher 22 Oct 2024 7:00 UTC
7 points
1 comment 6 min read LW link

Longevity, AI, and Cognitive Research Hackathon @ MIT

ekkolápto 22 Oct 2024 6:19 UTC
1 point
0 comments 1 min read LW link

Conversational Signposts—An Antidote to Dull Social Interactions

Declan Molony 22 Oct 2024 5:37 UTC
11 points
6 comments 2 min read LW link

I got dysentery so you don’t have to

eukaryote 22 Oct 2024 4:55 UTC
315 points
4 comments 17 min read LW link
(eukaryotewritesblog.com)

Transformers Explained (Again)

RohanS 22 Oct 2024 4:06 UTC
3 points
0 comments 18 min read LW link

Sleeping on Stage

jefftk 22 Oct 2024 0:50 UTC
26 points
3 comments 1 min read LW link
(www.jefftk.com)

The Mask Comes Off: At What Price?

Zvi 21 Oct 2024 23:50 UTC
71 points
16 comments 8 min read LW link
(thezvi.wordpress.com)

Distinguishing ways AI can be “concentrated”

Matthew Barnett 21 Oct 2024 22:21 UTC
28 points
2 comments 1 min read LW link

Jailbreaking ChatGPT and Claude using Web API Context Injection

Jaehyuk Lim 21 Oct 2024 21:34 UTC
4 points
0 comments 3 min read LW link

How to Teach Your Brain to Hate Procrastination

10xyz 21 Oct 2024 20:12 UTC
3 points
0 comments 2 min read LW link

Pausing for what?

MountainPath 21 Oct 2024 20:12 UTC
0 points
1 comment 1 min read LW link

What is autonomy? Why boundaries are necessary.

Chipmonk 21 Oct 2024 17:56 UTC
8 points
1 comment 1 min read LW link
(chrislakin.blog)

Could randomly choosing people to serve as representatives lead to better government?

John Huang 21 Oct 2024 17:10 UTC
75 points
13 comments 10 min read LW link

There aren’t enough smart people in biology doing something boring

Abhishaike Mahajan 21 Oct 2024 15:52 UTC
27 points
13 comments 10 min read LW link

Automation collapse

21 Oct 2024 14:50 UTC
70 points
9 comments 7 min read LW link

What AI companies should do: Some rough ideas

Zach Stein-Perlman 21 Oct 2024 14:00 UTC
33 points
10 comments 5 min read LW link

[Question] What should OpenAI do that it hasn’t already done, to stop their vacancies from being advertised on the 80k Job Board?

WitheringWeights 21 Oct 2024 13:57 UTC
21 points
0 comments 1 min read LW link

A Rocket–Interpretability Analogy

plex 21 Oct 2024 13:55 UTC
149 points
31 comments 1 min read LW link

Tokyo AI Safety 2025: Call For Papers

Blaine 21 Oct 2024 8:43 UTC
24 points
0 comments 3 min read LW link
(www.tais2025.cc)

OpenAI defected, but we can take honest actions

Remmelt 21 Oct 2024 8:41 UTC
17 points
16 comments 1 min read LW link

Slightly More Than You Wanted To Know: Pregnancy Length Effects

JustisMills 21 Oct 2024 1:26 UTC
62 points
4 comments 5 min read LW link
(justismills.substack.com)

Information vs Assurance

johnswentworth 20 Oct 2024 23:16 UTC
185 points
17 comments 2 min read LW link