Corrigibility’s Desirability is Timing-Sensitive

RobertM

26 Dec 2024 22:24 UTC

26 points

4 comments · 3 min read · LW link

PCR retrospective

bhauth

26 Dec 2024 21:20 UTC

22 points

0 comments · 8 min read · LW link

(bhauth.com)

AI #96: o3 But Not Yet For Thee

Zvi

26 Dec 2024 20:30 UTC

58 points

8 comments · 36 min read · LW link

(thezvi.wordpress.com)

Super human AI is a very low hanging fruit!

Hzn

26 Dec 2024 19:00 UTC

3 points

0 comments · 5 min read · LW link

The Field of AI Alignment: A Postmortem, and What To Do About It

johnswentworth

26 Dec 2024 18:48 UTC

266 points

139 comments · 8 min read · LW link

ReSolsticed vol I: “We’re Not Going Quietly”

Raemon

26 Dec 2024 17:52 UTC

55 points

4 comments · 19 min read · LW link

[Question] Are Sparse Autoencoders a good idea for AI control?

Gerard Boxo

26 Dec 2024 17:34 UTC

3 points

2 comments · 1 min read · LW link

A Three-Layer Model of LLM Psychology

Jan_Kulveit

26 Dec 2024 16:49 UTC

97 points

7 comments · 8 min read · LW link

Human, All Too Human—Superintelligence requires learning things we can’t teach

Ben Turtel

26 Dec 2024 16:26 UTC

−6 points

4 comments · 1 min read · LW link

(bturtel.substack.com)

[Question] Why don’t we currently have AI agents?

ChristianKl

26 Dec 2024 15:26 UTC

7 points

10 comments · 1 min read · LW link

[Question] What would be the IQ and other benchmarks of o3 that uses $1 million worth of compute resources to answer one question?

avturchin

26 Dec 2024 11:08 UTC

16 points

2 comments · 1 min read · LW link

The Economics & Practicality of Starting Mars Colonization

Zero Contradictions

26 Dec 2024 10:56 UTC

2 points

1 comment · 1 min read · LW link

(zerocontradictions.net)

Terminal goal vs Intelligence

Donatas Lučiūnas

26 Dec 2024 8:10 UTC

−12 points

24 comments · 1 min read · LW link

Streamlining my voice note process

Vlad Sitalo

26 Dec 2024 6:04 UTC

6 points

1 comment · 7 min read · LW link

(vlad.roam.garden)

Whistleblowing Twitter Bot

Mckiev

26 Dec 2024 4:09 UTC

19 points

5 comments · 2 min read · LW link

Open Thread Winter 2024/2025

habryka

25 Dec 2024 21:02 UTC

17 points

6 comments · 1 min read · LW link

Exploring Cooperation: The Path to Utopia

Davidmanheim

25 Dec 2024 18:31 UTC

10 points

0 comments · 1 min read · LW link

(exploringcooperation.substack.com)

Living with Rats in College

lsusr

25 Dec 2024 10:44 UTC

25 points

0 comments · 1 min read · LW link

[Question] What Have Been Your Most Valuable Casual Conversations At Conferences?

johnswentworth

25 Dec 2024 5:49 UTC

54 points

20 comments · 1 min read · LW link

The Opening Salvo: 1. An Ontological Consciousness Metric: Resistance to Behavioral Modification as a Measure of Recursive Awareness

Peterpiper

25 Dec 2024 2:29 UTC

−3 points

0 comments · 5 min read · LW link

The Deep Lore of LightHaven, with Oliver Habryka (TBC episode 228)

Eneasz and habryka

24 Dec 2024 22:45 UTC

45 points

4 comments · 91 min read · LW link

(thebayesianconspiracy.substack.com)

Acknowledging Background Information with P(Q|I)

JenniferRM

24 Dec 2024 18:50 UTC

29 points

8 comments · 14 min read · LW link

Game Theory and Behavioral Economics in The Stock Market

Jaiveer Singh

24 Dec 2024 18:15 UTC

1 point

0 comments · 3 min read · LW link

[Question] What are the main arguments against AGI?

Edy Nastase

24 Dec 2024 15:49 UTC

1 point

6 comments · 1 min read · LW link

[Question] Recommendations on communities that discuss AI applications in society

Annapurna

24 Dec 2024 13:37 UTC

7 points

2 comments · 1 min read · LW link

AIs Will Increasingly Fake Alignment

Zvi

24 Dec 2024 13:00 UTC

89 points

0 comments · 52 min read · LW link

(thezvi.wordpress.com)

Apply to the 2025 PIBBSS Summer Research Fellowship

DusanDNesic and Lucas Teixeira

24 Dec 2024 10:25 UTC

15 points

0 comments · 2 min read · LW link

Human-AI Complementarity: A Goal for Amplified Oversight

rishubjain

24 Dec 2024 9:57 UTC

21 points

1 comment · 1 min read · LW link

(deepmindsafetyresearch.medium.com)

Preliminary Thoughts on Flirting Theory

la .alis.

24 Dec 2024 7:37 UTC

12 points

6 comments · 3 min read · LW link

[Question] Why is neuron count of human brain relevant to AI timelines?

xpostah

24 Dec 2024 5:15 UTC

6 points

7 comments · 1 min read · LW link

How Much to Give is a Pragmatic Question

jefftk

24 Dec 2024 4:20 UTC

12 points

1 comment · 2 min read · LW link

(www.jefftk.com)

Do you need a better map of your myriad of maps to the territory?

CstineSublime

24 Dec 2024 2:00 UTC

11 points

2 comments · 5 min read · LW link

Panology

JenniferRM

23 Dec 2024 21:40 UTC

11 points

8 comments · 5 min read · LW link

Aristotle, Aquinas, and the Evolution of Teleology: From Purpose to Meaning.

Spiritus Dei

23 Dec 2024 19:37 UTC

−7 points

0 comments · 6 min read · LW link

People aren’t properly calibrated on FrontierMath

cakubilo

23 Dec 2024 19:35 UTC

30 points

4 comments · 3 min read · LW link

Near- and medium-term AI Control Safety Cases

Martín Soto

23 Dec 2024 17:37 UTC

9 points

0 comments · 6 min read · LW link

[Rationality Malaysia] 2024 year-end meetup!

Doris Liew

23 Dec 2024 16:02 UTC

1 point

0 comments · 1 min read · LW link

Printable book of some rationalist creative writing (from Scott A. & Eliezer)

CounterBlunder

23 Dec 2024 15:44 UTC

5 points

0 comments · 1 min read · LW link

Monthly Roundup #25: December 2024

Zvi

23 Dec 2024 14:20 UTC

18 points

3 comments · 26 min read · LW link

(thezvi.wordpress.com)

Exploring the petertodd / Leilan duality in GPT-2 and GPT-J

mwatkins

23 Dec 2024 13:17 UTC

10 points

0 comments · 17 min read · LW link

[Question] What are the strongest arguments for very short timelines?

Kaj_Sotala

23 Dec 2024 9:38 UTC

94 points

73 comments · 1 min read · LW link

Reduce AI Self-Allegiance by saying “he” instead of “I”

Knight Lee

23 Dec 2024 9:32 UTC

6 points

4 comments · 2 min read · LW link

Funding Case: AI Safety Camp 11

Remmelt, Robert Kralisch and Linda Linsefors

23 Dec 2024 8:51 UTC

23 points

0 comments · 6 min read · LW link

(manifund.org)

What is compute governance?

Vishakha

23 Dec 2024 6:32 UTC

6 points

0 comments · 2 min read · LW link

(aisafety.info)

Stop Making Sense

JenniferRM

23 Dec 2024 5:16 UTC

15 points

0 comments · 3 min read · LW link

Hire (or Become) a Thinking Assistant

Raemon

23 Dec 2024 3:58 UTC

119 points

42 comments · 8 min read · LW link

Non-Obvious Benefits of Insurance

jefftk

23 Dec 2024 3:40 UTC

21 points

5 comments · 2 min read · LW link

(www.jefftk.com)

Vision of a positive Singularity

RussellThor

23 Dec 2024 2:19 UTC

4 points

0 comments · 4 min read · LW link

Ideologies are slow and necessary, for now

Gabriel Alfour

23 Dec 2024 1:57 UTC

9 points

1 comment · 1 min read · LW link

(cognition.cafe)

Propaganda Is Everywhere—LLM Models Are No Exception

Yanling Guo

23 Dec 2024 1:39 UTC

−13 points

0 comments · 3 min read · LW link