Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible

Dec 12, 2023, 6:14 PM
458 points
206 comments · 33 min read · LW link · 2 reviews

Speaking to Congressional staffers about AI risk

Dec 4, 2023, 11:08 PM
312 points
25 comments · 15 min read · LW link · 1 review

Constellations are Younger than Continents

Jeffrey Heninger · Dec 19, 2023, 6:12 AM
263 points
21 comments · 2 min read · LW link

AI Control: Improving Safety Despite Intentional Subversion

Dec 13, 2023, 3:51 PM
236 points
24 comments · 10 min read · LW link · 4 reviews

Thoughts on “AI is easy to control” by Pope & Belrose

Steven Byrnes · Dec 1, 2023, 5:30 PM
197 points
63 comments · 14 min read · LW link · 1 review

Is being sexy for your homies?

Valentine · Dec 13, 2023, 8:37 PM
193 points
100 comments · 14 min read · LW link · 2 reviews

“Humanity vs. AGI” Will Never Look Like “Humanity vs. AGI” to Humanity

Thane Ruthenis · Dec 16, 2023, 8:08 PM
191 points
34 comments · 5 min read · LW link

Effective Aspersions: How the Nonlinear Investigation Went Wrong

TracingWoodgrains · Dec 19, 2023, 12:00 PM
188 points
172 comments · LW link · 2 reviews

re: Yudkowsky on biological materials

bhauth · Dec 11, 2023, 1:28 PM
182 points
30 comments · 5 min read · LW link

Critical review of Christiano’s disagreements with Yudkowsky

Vanessa Kosoy · Dec 27, 2023, 4:02 PM
176 points
40 comments · 15 min read · LW link

The ‘Neglected Approaches’ Approach: AE Studio’s Alignment Agenda

Dec 18, 2023, 8:35 PM
175 points
22 comments · 12 min read · LW link · 1 review

2023 Unofficial LessWrong Census/Survey

Screwtape · Dec 2, 2023, 4:41 AM
169 points
81 comments · 1 min read · LW link

How useful is mechanistic interpretability?

Dec 1, 2023, 2:54 AM
167 points
54 comments · 25 min read · LW link

The likely first longevity drug is based on sketchy science. This is bad for science and bad for longevity.

BobBurgers · Dec 12, 2023, 2:42 AM
161 points
34 comments · 5 min read · LW link

Most People Don’t Realize We Have No Idea How Our AIs Work

Thane Ruthenis · Dec 21, 2023, 8:02 PM
159 points
42 comments · 1 min read · LW link

Succession

Richard_Ngo · Dec 20, 2023, 7:25 PM
159 points
48 comments · 11 min read · LW link
(www.narrativeark.xyz)

The Plan - 2023 Version

johnswentworth · Dec 29, 2023, 11:34 PM
152 points
40 comments · 31 min read · LW link · 1 review

Discussion: Challenges with Unsupervised LLM Knowledge Discovery

Dec 18, 2023, 11:58 AM
147 points
21 comments · 10 min read · LW link

AI Views Snapshots

Rob Bensinger · Dec 13, 2023, 12:45 AM
142 points
61 comments · 1 min read · LW link

The Dark Arts

Dec 19, 2023, 4:41 AM
134 points
49 comments · 9 min read · LW link

Current AIs Provide Nearly No Data Relevant to AGI Alignment

Thane Ruthenis · Dec 15, 2023, 8:16 PM
131 points
157 comments · 8 min read · LW link · 1 review

Natural Latents: The Math

Dec 27, 2023, 7:03 PM
128 points
40 comments · 12 min read · LW link · 2 reviews

Deep Forgetting & Unlearning for Safely-Scoped LLMs

scasper · Dec 5, 2023, 4:48 PM
126 points
30 comments · 13 min read · LW link

Bayesian Injustice

Kevin Dorst · Dec 14, 2023, 3:44 PM
124 points
10 comments · 6 min read · LW link
(kevindorst.substack.com)

The LessWrong 2022 Review

habryka · Dec 5, 2023, 4:00 AM
115 points
43 comments · 4 min read · LW link

Mapping the semantic void: Strange goings-on in GPT embedding spaces

mwatkins · Dec 14, 2023, 1:10 PM
114 points
31 comments · 14 min read · LW link

What I Would Do If I Were Working On AI Governance

johnswentworth · Dec 8, 2023, 6:43 AM
110 points
32 comments · 10 min read · LW link

“AI Alignment” is a Dangerously Overloaded Term

Roko · Dec 15, 2023, 2:34 PM
108 points
100 comments · 3 min read · LW link

[Question] How do you feel about LessWrong these days? [Open feedback thread]

Bird Concept · Dec 5, 2023, 8:54 PM
107 points
285 comments · 1 min read · LW link

Fact Finding: Attempting to Reverse-Engineer Factual Recall on the Neuron Level (Post 1)

Dec 23, 2023, 2:44 AM
106 points
10 comments · 22 min read · LW link · 2 reviews

On the future of language models

owencb · Dec 20, 2023, 4:58 PM
105 points
17 comments · LW link

The Witness

Richard_Ngo · Dec 3, 2023, 10:27 PM
105 points
5 comments · 14 min read · LW link
(www.narrativeark.xyz)

Nonlinear’s Evidence: Debunking False and Misleading Claims

KatWoods · Dec 12, 2023, 1:16 PM
104 points
171 comments · LW link

[Valence series] 1. Introduction

Steven Byrnes · Dec 4, 2023, 3:40 PM
99 points
16 comments · 16 min read · LW link · 2 reviews

A Crisper Explanation of Simulacrum Levels

Thane Ruthenis · Dec 23, 2023, 10:13 PM
92 points
13 comments · 13 min read · LW link

Nietzsche’s Morality in Plain English

Arjun Panickssery · Dec 4, 2023, 12:57 AM
92 points
14 comments · 4 min read · LW link · 1 review
(arjunpanickssery.substack.com)

Meaning & Agency

abramdemski · Dec 19, 2023, 10:27 PM
91 points
17 comments · 14 min read · LW link

Prediction Markets aren’t Magic

SimonM · Dec 21, 2023, 12:54 PM
90 points
29 comments · 3 min read · LW link

Based Beff Jezos and the Accelerationists

Zvi · Dec 6, 2023, 4:00 PM
90 points
29 comments · 12 min read · LW link
(thezvi.wordpress.com)

[Valence series] 2. Valence & Normativity

Steven Byrnes · Dec 7, 2023, 4:43 PM
88 points
7 comments · 28 min read · LW link · 1 review

Some for-profit AI alignment org ideas

Eric Ho · Dec 14, 2023, 2:23 PM
86 points
19 comments · 9 min read · LW link

A Universal Emergent Decomposition of Retrieval Tasks in Language Models

Dec 19, 2023, 11:52 AM
84 points
3 comments · 10 min read · LW link
(arxiv.org)

Refusal mechanisms: initial experiments with Llama-2-7b-chat

Dec 8, 2023, 5:08 PM
82 points
7 comments · 7 min read · LW link

Studying The Alien Mind

Dec 5, 2023, 5:27 PM
80 points
10 comments · 15 min read · LW link

EU policymakers reach an agreement on the AI Act

tlevin · Dec 15, 2023, 6:02 AM
78 points
7 comments · 7 min read · LW link

OpenAI: Leaks Confirm the Story

Zvi · Dec 12, 2023, 2:00 PM
77 points
9 comments · 16 min read · LW link
(thezvi.wordpress.com)

Send us example gnarly bugs

Dec 10, 2023, 5:23 AM
77 points
10 comments · 2 min read · LW link

The Offense-Defense Balance Rarely Changes

Maxwell Tabarrok · Dec 9, 2023, 3:21 PM
77 points
23 comments · 3 min read · LW link
(maximumprogress.substack.com)

MATS Summer 2023 Retrospective

Dec 1, 2023, 11:29 PM
77 points
34 comments · 26 min read · LW link

[Valence series] 3. Valence & Beliefs

Steven Byrnes · Dec 11, 2023, 8:21 PM
77 points
12 comments · 21 min read · LW link · 1 review