Interpretability Tools Are an Attack Channel

Thane Ruthenis Aug 17, 2022, 6:47 PM

42 points

14 comments 1 min read LW link

Human Mimicry Mainly Works When We’re Already Close

johnswentworth Aug 17, 2022, 6:41 PM

82 points

16 comments 5 min read LW link

Thoughts on ‘List of Lethalities’

Alex Lawsen Aug 17, 2022, 6:33 PM

27 points

0 comments 10 min read LW link

The longest training run

Jsevillamol, Tamay, Owen D and anson.ho

Aug 17, 2022, 5:18 PM

71 points

12 comments 9 min read LW link

(epochai.org)

Spoiler-Free Review: Across the Obelisk

Zvi Aug 17, 2022, 2:30 PM

17 points

0 comments 6 min read LW link

(thezvi.wordpress.com)

Autonomy as taking responsibility for reference maintenance

Ramana Kumar Aug 17, 2022, 12:50 PM

61 points

3 comments 5 min read LW link

Duplicating Raspberry Pi Images

jefftk Aug 17, 2022, 12:10 PM

9 points

4 comments 4 min read LW link

(www.jefftk.com)

ACX Meetup—Amsterdam

Pierre Vandenberghe Aug 17, 2022, 9:56 AM

2 points

1 comment 1 min read LW link

Insufficient awareness of how everything sucks

Flaglandbase Aug 17, 2022, 8:01 AM

−13 points

5 comments 1 min read LW link

Mesa-optimization for goals defined only within a training environment is dangerous

Rubi J. Hudson Aug 17, 2022, 3:56 AM

6 points

2 comments 4 min read LW link

ACX / SSC Meetup Singapore

DG Aug 17, 2022, 2:08 AM

2 points

1 comment 1 min read LW link

That-time-of-year Astral Codex Ten Meetup

Ben Smith Aug 17, 2022, 12:02 AM

3 points

2 comments 1 min read LW link

SSC Reno Meetup

Steven Aug 16, 2022, 11:37 PM

1 point

3 comments 1 min read LW link

My thoughts on direct work (and joining LessWrong)

RobertM Aug 16, 2022, 6:53 PM

58 points

4 comments 6 min read LW link

We can make the future a million years from now go better [video]

Writer Aug 16, 2022, 1:03 PM

7 points

1 comment 6 min read LW link

(youtu.be)

The Open Society and Its Enemies: Summary and Thoughts

matto Aug 16, 2022, 11:44 AM

12 points

4 comments 17 min read LW link

An introduction to signalling theory

Mvolz Aug 16, 2022, 9:37 AM

17 points

1 comment 5 min read LW link

Understanding differences between humans and intelligence-in-general to build safe AGI

Florian_Dietz Aug 16, 2022, 8:27 AM

7 points

8 comments 1 min read LW link

Against population ethics

jasoncrawford Aug 16, 2022, 5:19 AM

29 points

39 comments 3 min read LW link

Deception as the optimal: mesa-optimizers and inner alignment

Eleni Angelou Aug 16, 2022, 4:49 AM

11 points

0 comments 5 min read LW link

Crowdsourcing Anki Decks

Arden Aug 16, 2022, 2:53 AM

1 point

0 comments 1 min read LW link

What Makes an Idea Understandable? On Architecturally and Culturally Natural Ideas.

NickyP, Peter S. Park and Stephen Fowler

Aug 16, 2022, 2:09 AM

21 points

2 comments 16 min read LW link

Dwarves & D.Sci: Data Fortress Evaluation & Ruleset

aphyer Aug 16, 2022, 12:15 AM

26 points

10 comments 8 min read LW link

I’m mildly skeptical that blindness prevents schizophrenia

Steven Byrnes Aug 15, 2022, 11:36 PM

83 points

9 comments 4 min read LW link

What’s General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems?

johnswentworth Aug 15, 2022, 10:48 PM

156 points

18 comments 10 min read LW link

“What Mistakes Are You Making Right Now?”

David Udell Aug 15, 2022, 9:19 PM

13 points

2 comments 1 min read LW link

On Preference Manipulation in Reward Learning Processes

Felix Hofstätter Aug 15, 2022, 7:32 PM

8 points

0 comments 4 min read LW link

Cambist Booking: Discussing What We Value

Screwtape Aug 15, 2022, 6:24 PM

5 points

1 comment 1 min read LW link

Capital and inequality

NathanBarnard Aug 15, 2022, 5:23 PM

7 points

2 comments 5 min read LW link

[Question] Are there practical exercises for developing the Scout mindset?

ChristianKl Aug 15, 2022, 5:23 PM

15 points

2 comments 1 min read LW link

[Question] How do you get a job as a software developer?

lsusr Aug 15, 2022, 2:45 PM

22 points

24 comments 1 min read LW link

The Parable of the Boy Who Cried 5% Chance of Wolf

KatWoods Aug 15, 2022, 2:33 PM

140 points

24 comments 2 min read LW link

And the Revenues Are So Small

Zvi Aug 15, 2022, 1:00 PM

19 points

5 comments 11 min read LW link

(thezvi.wordpress.com)

Extreme Security

lc Aug 15, 2022, 12:11 PM

38 points

6 comments 5 min read LW link

No shortcuts to knowledge: Why AI needs to ease up on scaling and learn how to code

Yldedly Aug 15, 2022, 8:42 AM

5 points

0 comments 1 min read LW link

(deoxyribose.github.io)

Seeking Interns/RAs for Mechanistic Interpretability Projects

Neel Nanda Aug 15, 2022, 7:11 AM

61 points

0 comments 2 min read LW link

A Mechanistic Interpretability Analysis of Grokking

Neel Nanda and Tom Lieberum

Aug 15, 2022, 2:41 AM

373 points

48 comments 36 min read LW link 1 review

(colab.research.google.com)

[Question] If a nuke is coming towards SF Bay can people bunker in BART tunnels?

Pee Doom Aug 15, 2022, 1:56 AM

15 points

2 comments 1 min read LW link

[Question] What is the probability that a superintelligent, sentient AGI is actually infeasible?

Nathan1123 Aug 14, 2022, 10:41 PM

−3 points

6 comments 1 min read LW link

Dealing With Delusions

adrusi Aug 14, 2022, 9:11 PM

9 points

2 comments 1 min read LW link

All the posts I will never write

Alexander Gietelink Oldenziel Aug 14, 2022, 6:29 PM

54 points

8 comments 8 min read LW link

Brain-like AGI project “aintelope”

Gunnar_Zarncke Aug 14, 2022, 4:33 PM

54 points

2 comments 1 min read LW link

AI Transparency: Why it’s critical and how to obtain it.

Zohar Jackson Aug 14, 2022, 10:31 AM

6 points

1 comment 5 min read LW link

A brief note on Simplicity Bias

carboniferous_umbraculum Aug 14, 2022, 2:05 AM

20 points

0 comments 4 min read LW link

Evolution is a bad analogy for AGI: inner alignment

Quintin Pope Aug 13, 2022, 10:15 PM

81 points

15 comments 8 min read LW link

An Uncanny Prison

Nathan1123 Aug 13, 2022, 9:40 PM

3 points

3 comments 2 min read LW link

Florida Elections

Double Aug 13, 2022, 8:10 PM

−3 points

8 comments 1 min read LW link

Cultivating Valiance

Shoshannah Tekofsky Aug 13, 2022, 6:47 PM

35 points

4 comments 4 min read LW link

An extended rocket alignment analogy

remember Aug 13, 2022, 6:22 PM

28 points

3 comments 4 min read LW link

[Question] The OpenAI playground for GPT-3 is a terrible interface. Is there any great local (or web) app for exploring/learning with language models?

aviv Aug 13, 2022, 4:34 PM

3 points

1 comment 1 min read LW link