
How AI Takeover Might Happen in 2 Years

joshc
Feb 7, 2025, 5:10 PM
416 points
137 comments
29 min read
(x.com)

Some articles in “International Security” that I enjoyed

Buck
Jan 31, 2025, 4:23 PM
129 points
10 comments
4 min read

“Sharp Left Turn” discourse: An opinionated review

Steven Byrnes
Jan 28, 2025, 6:47 PM
206 points
26 comments
31 min read

The Case Against AI Control Research

johnswentworth
Jan 21, 2025, 4:03 PM
353 points
80 comments
6 min read

The Gentle Romance

Richard_Ngo
Jan 19, 2025, 6:29 PM
240 points
46 comments
15 min read
(www.asimov.press)

Don’t ignore bad vibes you get from people

Kaj_Sotala
Jan 18, 2025, 9:20 AM
150 points
50 comments
2 min read
(kajsotala.fi)

What Is The Alignment Problem?

johnswentworth
Jan 16, 2025, 1:20 AM
178 points
50 comments
25 min read

How will we update about scheming?

ryan_greenblatt
Jan 6, 2025, 8:21 PM
169 points
20 comments
36 min read

Review: Planecrash

L Rudolf L
Dec 27, 2024, 2:18 PM
358 points
45 comments
22 min read
(nosetgauge.substack.com)

A Three-Layer Model of LLM Psychology

Jan_Kulveit
Dec 26, 2024, 4:49 PM
217 points
13 comments
8 min read

What Goes Without Saying

sarahconstantin
Dec 20, 2024, 6:00 PM
333 points
28 comments
5 min read
(sarahconstantin.substack.com)

When Is Insurance Worth It?

kqr
Dec 19, 2024, 7:07 PM
173 points
71 comments
4 min read
(entropicthoughts.com)

Alignment Faking in Large Language Models

ryan_greenblatt, evhub, Carson Denison, Benjamin Wright, Fabien Roger, Monte M, Sam Marks, Johannes Treutlein, Sam Bowman and Buck
Dec 18, 2024, 5:19 PM
483 points
75 comments
10 min read

AIs Will Increasingly Attempt Shenanigans

Zvi
Dec 16, 2024, 3:20 PM
114 points
2 comments
26 min read
(thezvi.wordpress.com)

Biological risk from the mirror world

jasoncrawford
Dec 12, 2024, 7:07 PM
333 points
38 comments
7 min read
(newsletter.rootsofprogress.org)

The “Think It Faster” Exercise

Raemon
Dec 11, 2024, 7:14 PM
144 points
35 comments
13 min read

Subskills of “Listening to Wisdom”

Raemon
Dec 9, 2024, 3:01 AM
154 points
29 comments
42 min read

Understanding Shapley Values with Venn Diagrams

Carson L
Dec 6, 2024, 9:56 PM
214 points
36 comments
(medium.com)

Passages I Highlighted in The Letters of J.R.R. Tolkien

Ivan Vendrov
Nov 25, 2024, 1:47 AM
139 points
38 comments
31 min read

“It’s a 10% chance which I did 10 times, so it should be 100%”

egor.timatkov
Nov 18, 2024, 1:14 AM
154 points
59 comments
2 min read