[Question] For fun: How long can you hold your breath?

exanova 6 Dec 2023 23:36 UTC
1 point
7 comments 1 min read LW link

Mathematics As Physics

Nox ML 6 Dec 2023 22:27 UTC
−2 points
10 comments 5 min read LW link

The counting argument for scheming (Sections 4.1 and 4.2 of “Scheming AIs”)

Joe Carlsmith 6 Dec 2023 19:28 UTC
10 points
0 comments 10 min read LW link

On Trust

johnswentworth 6 Dec 2023 19:19 UTC
44 points
26 comments 4 min read LW link

Originality vs. Correctness

6 Dec 2023 18:51 UTC
60 points
17 comments 25 min read LW link

Proposal for improving the global online discourse through personalised comment ordering on all websites

Roman Leventov 6 Dec 2023 18:51 UTC
35 points
21 comments 6 min read LW link

Google Gemini Announced

Jacob G-W 6 Dec 2023 16:14 UTC
54 points
22 comments 1 min read LW link
(blog.google)

Based Beff Jezos and the Accelerationists

Zvi 6 Dec 2023 16:00 UTC
89 points
29 comments 12 min read LW link
(thezvi.wordpress.com)

Bucket Brigade: Likely End-of-Life

jefftk 6 Dec 2023 15:30 UTC
16 points
1 comment 1 min read LW link
(www.jefftk.com)

Why Yudkowsky is wrong about “covalently bonded equivalents of biology”

titotal 6 Dec 2023 14:09 UTC
34 points
40 comments 1 min read LW link
(open.substack.com)

Metaculus Launches Chinese AI Chips Tournament, Supporting Institute for AI Policy and Strategy Research

ChristianWilliams 6 Dec 2023 11:26 UTC
10 points
1 comment 1 min read LW link
(www.metaculus.com)

Minimal Viable Paradise: How do we get The Good Future(TM)?

Nathan Young 6 Dec 2023 9:24 UTC
9 points
0 comments 7 min read LW link

Anthropical Paradoxes are Paradoxes of Probability Theory

Ape in the coat 6 Dec 2023 8:16 UTC
52 points
18 comments 5 min read LW link

Digital humans vs merge with AI? Same or different?

6 Dec 2023 4:56 UTC
21 points
11 comments 7 min read LW link

EA Infrastructure Fund’s Plan to Focus on Principles-First EA

Linch 6 Dec 2023 3:24 UTC
27 points
0 comments 1 min read LW link

In defence of Helen Toner, Adam D’Angelo, and Tasha McCauley

mrtreasure 6 Dec 2023 2:02 UTC
25 points
3 comments 9 min read LW link
(pastebin.com)

Some quick thoughts on “AI is easy to control”

Mikhail Samin 6 Dec 2023 0:58 UTC
15 points
10 comments 7 min read LW link

ACX Corvallis, OR

kenakofer 6 Dec 2023 0:23 UTC
1 point
0 comments 1 min read LW link

Multinational corporations as optimizers: a case for reaching across the aisle

sudo-nym 6 Dec 2023 0:14 UTC
9 points
10 comments 1 min read LW link

[Question] How do you feel about LessWrong these days? [Open feedback thread]

jacobjacob 5 Dec 2023 20:54 UTC
107 points
281 comments 1 min read LW link

Critique-a-Thon of AI Alignment Plans

Iknownothing 5 Dec 2023 20:50 UTC
12 points
3 comments 1 min read LW link

Arguments for/against scheming that focus on the path SGD takes (Section 3 of “Scheming AIs”)

Joe Carlsmith 5 Dec 2023 18:48 UTC
10 points
0 comments 23 min read LW link

In defence of Helen Toner, Adam D’Angelo, and Tasha McCauley (OpenAI post)

mrtreasure 5 Dec 2023 18:40 UTC
6 points
2 comments 1 min read LW link
(pastebin.com)

Studying The Alien Mind

5 Dec 2023 17:27 UTC
80 points
10 comments 15 min read LW link

Deep Forgetting & Unlearning for Safely-Scoped LLMs

scasper 5 Dec 2023 16:48 UTC
122 points
29 comments 13 min read LW link

On ‘Responsible Scaling Policies’ (RSPs)

Zvi 5 Dec 2023 16:10 UTC
48 points
3 comments 37 min read LW link
(thezvi.wordpress.com)

We’re all in this together

Tamsin Leake 5 Dec 2023 13:57 UTC
69 points
65 comments 2 min read LW link
(carado.moe)

A Socratic dialogue with my student

lsusr 5 Dec 2023 9:31 UTC
36 points
14 comments 6 min read LW link

Neural uncertainty estimation review article (for alignment)

Charlie Steiner 5 Dec 2023 8:01 UTC
74 points
3 comments 11 min read LW link

Analyzing the Historical Rate of Catastrophes

jsteinhardt 5 Dec 2023 6:30 UTC
15 points
0 comments 16 min read LW link
(bounded-regret.ghost.io)

Some open-source dictionaries and dictionary learning infrastructure

Sam Marks 5 Dec 2023 6:05 UTC
45 points
7 comments 5 min read LW link

The LessWrong 2022 Review

habryka 5 Dec 2023 4:00 UTC
115 points
43 comments 4 min read LW link

Bands And Low-stakes Dances

jefftk 5 Dec 2023 3:50 UTC
20 points
0 comments 1 min read LW link
(www.jefftk.com)

Accelerating science through evolvable institutions

jasoncrawford 4 Dec 2023 23:21 UTC
19 points
9 comments 6 min read LW link
(rootsofprogress.org)

Speaking to Congressional staffers about AI risk

4 Dec 2023 23:08 UTC
297 points
23 comments 16 min read LW link

Open Thread – Winter 2023/2024

habryka 4 Dec 2023 22:59 UTC
35 points
160 comments 1 min read LW link

Interview with Vanessa Kosoy on the Value of Theoretical Research for AI

WillPetillo 4 Dec 2023 22:58 UTC
37 points
0 comments 35 min read LW link

2023 Alignment Research Updates from FAR AI

4 Dec 2023 22:32 UTC
18 points
0 comments 8 min read LW link
(far.ai)

What’s new at FAR AI

4 Dec 2023 21:18 UTC
41 points
0 comments 5 min read LW link
(far.ai)

n of m ring signatures

DanielFilan 4 Dec 2023 20:00 UTC
50 points
7 comments 1 min read LW link
(danielfilan.com)

Mechanistic interpretability through clustering

Alistair Fraser 4 Dec 2023 18:49 UTC
1 point
0 comments 1 min read LW link

Agents which are EU-maximizing as a group are not EU-maximizing individually

Mlxa 4 Dec 2023 18:49 UTC
3 points
2 comments 2 min read LW link

Planning in LLMs: Insights from AlphaGo

jco 4 Dec 2023 18:48 UTC
8 points
10 comments 11 min read LW link

Non-classic stories about scheming (Section 2.3.2 of “Scheming AIs”)

Joe Carlsmith 4 Dec 2023 18:44 UTC
9 points
0 comments 20 min read LW link

6. The Mutable Values Problem in Value Learning and CEV

RogerDearnaley 4 Dec 2023 18:31 UTC
12 points
0 comments 49 min read LW link

Updates to Open Phil’s career development and transition funding program

4 Dec 2023 18:10 UTC
28 points
0 comments 2 min read LW link

[Valence series] 1. Introduction

Steven Byrnes 4 Dec 2023 15:40 UTC
98 points
14 comments 16 min read LW link

South Bay Meetup 12/9

David Friedman 4 Dec 2023 7:32 UTC
2 points
0 comments 1 min read LW link

Hashmarks: Privacy-Preserving Benchmarks for High-Stakes AI Evaluation

Paul Bricman 4 Dec 2023 7:31 UTC
12 points
6 comments 16 min read LW link
(arxiv.org)

A call for a quantitative report card for AI bioterrorism threat models

Juno 4 Dec 2023 6:35 UTC
12 points
0 comments 10 min read LW link