Situational awareness (Section 2.1 of “Scheming AIs”)

Joe Carlsmith 26 Nov 2023 23:00 UTC

10 points

5 comments 8 min read LW link

AXRP Episode 26 - AI Governance with Elizabeth Seger

DanielFilan 26 Nov 2023 23:00 UTC

14 points

0 comments 66 min read LW link

Solving Two-Sided Adverse Selection with Prediction Market Matchmaking

Saul Munn 26 Nov 2023 20:10 UTC

16 points

7 comments 4 min read LW link

(www.brasstacks.blog)

Wikipedia is not so great, and what can be done about it.

euserx 26 Nov 2023 19:13 UTC

0 points

27 comments 16 min read LW link

(forum.effectivealtruism.org)

[Question] Help me solve this problem: The basilisk isn’t real, but people are

canary_itm 26 Nov 2023 17:44 UTC

−19 points

4 comments 1 min read LW link

Twin Cities ACX Meetup—December 2023

Timothy M. 26 Nov 2023 17:32 UTC

1 point

1 comment 1 min read LW link

Spaced repetition for teaching two-year olds how to read (Interview)

Chipmonk 26 Nov 2023 16:52 UTC

48 points

9 comments 5 min read LW link

(chipmonk.substack.com)

Paper out now on creatine and cognitive performance

Fabienne 26 Nov 2023 10:58 UTC

58 points

2 comments 1 min read LW link

Why Q*, if real, might be a game changer

Shmi 26 Nov 2023 6:12 UTC

5 points

6 comments 1 min read LW link

Moral Reality Check (a short story)

jessicata 26 Nov 2023 5:03 UTC

148 points

45 comments 21 min read LW link 1 review

(unstableontology.com)

Accounting for Foregone Pay

jefftk 26 Nov 2023 3:30 UTC

11 points

0 comments 2 min read LW link

(www.jefftk.com)

Corrigibility or DWIM is an attractive primary goal for AGI

Seth Herd 25 Nov 2023 19:37 UTC

16 points

4 comments 1 min read LW link

On “slack” in training (Section 1.5 of “Scheming AIs”)

Joe Carlsmith 25 Nov 2023 17:51 UTC

1 point

0 comments 5 min read LW link

Announcing New Beginner-friendly Book on AI Safety and Risk

Darren McKee 25 Nov 2023 15:57 UTC

64 points

2 comments 1 min read LW link

Fertility as Metascience

Maxwell Tabarrok 25 Nov 2023 15:42 UTC

20 points

1 comment 3 min read LW link

(maximumprogress.substack.com)

Reaction to “Empowerment is (almost) All We Need” : an open-ended alternative

Ryo 25 Nov 2023 15:35 UTC

9 points

3 comments 5 min read LW link

How Microsoft’s ruthless employee evaluation system annihilated team collaboration.

positivesum 25 Nov 2023 13:25 UTC

3 points

2 comments 1 min read LW link

(tryingtruly.substack.com)

What are the results of more parental supervision and less outdoor play?

juliawise 25 Nov 2023 12:52 UTC

226 points

31 comments 5 min read LW link

A simple treacherous turn demonstration

Nikola Jurkovic 25 Nov 2023 4:51 UTC

22 points

5 comments 3 min read LW link

The two paragraph argument for AI risk

CronoDAS 25 Nov 2023 2:01 UTC

19 points

8 comments 1 min read LW link

Goodhart’s Law Example: Training Verifiers to Solve Math Word Problems

Chris_Leong 25 Nov 2023 0:53 UTC

27 points

2 comments 1 min read LW link

(arxiv.org)

Some thoughts on CBDC

PixelatedPenguin 25 Nov 2023 0:32 UTC

−1 points

1 comment 1 min read LW link

Testing for consequence-blindness in LLMs using the HI-ADS unit test.

David Scott Krueger (formerly: capybaralet) 24 Nov 2023 23:35 UTC

25 points

2 comments 2 min read LW link

Epoch is hiring an ML Distributed Systems Senior Researcher

merilalama and Jaime Sevilla Molina

24 Nov 2023 22:33 UTC

2 points

0 comments 4 min read LW link

(careers.rethinkpriorities.org)

Article Discussion And Free Pizza—St Paul

25Hour 24 Nov 2023 21:02 UTC

1 point

0 comments 1 min read LW link

Why focus on schemers in particular (Sections 1.3 and 1.4 of “Scheming AIs”)

Joe Carlsmith 24 Nov 2023 19:18 UTC

8 points

0 comments 22 min read LW link

Surviving and Shaping Long-Term Competitions: Lessons from Net Assessment

Gentzel and ihavenoahidea

24 Nov 2023 18:18 UTC

5 points

0 comments 13 min read LW link

Ability to solve long-horizon tasks correlates with wanting things in the behaviorist sense

So8res 24 Nov 2023 17:37 UTC

195 points

84 comments 5 min read LW link 1 review

The Limitations of GPT-4

p.b. 24 Nov 2023 15:30 UTC

27 points

12 comments 4 min read LW link

Progress links digest, 2023-11-24: Bottlenecks of aging, Starship launches, and much more

jasoncrawford 24 Nov 2023 15:25 UTC

40 points

1 comment 14 min read LW link

(rootsofprogress.org)

[Question] What’s the evidence that LLMs will scale up efficiently beyond GPT4? i.e. couldn’t GPT5, etc., be very inefficient?

M. Y. Zuo 24 Nov 2023 15:22 UTC

9 points

6 comments 1 min read LW link

Sapience, understanding, and “AGI”

Seth Herd 24 Nov 2023 15:13 UTC

15 points

3 comments 6 min read LW link

Insulate your ideas

Logan Kieller 24 Nov 2023 14:08 UTC

18 points

5 comments 2 min read LW link

(logankieller.substack.com)

Bordeaux, Gironde, France – irregular ACX Meetup 2023-12-09

vi21maobk9vp 24 Nov 2023 11:17 UTC

5 points

1 comment 1 min read LW link

[Question] A Question For People Who Believe In God

yanni kyriacos 24 Nov 2023 5:22 UTC

3 points

38 comments 1 min read LW link

[Question] First and Last Questions for GPT-5*

Mitchell_Porter 24 Nov 2023 5:03 UTC

15 points

5 comments 1 min read LW link

4. A Moral Case for Evolved-Sapience-Chauvinism

RogerDearnaley 24 Nov 2023 4:56 UTC

10 points

0 comments 4 min read LW link

Detecting What’s Been Seen

jefftk 24 Nov 2023 3:30 UTC

23 points

0 comments 2 min read LW link

(www.jefftk.com)

[Question] Help to find a blog I don’t remember the name of

JavierCC 23 Nov 2023 22:49 UTC

3 points

2 comments 1 min read LW link

[Question] What did you change your mind about in the last year?

mike_hawke 23 Nov 2023 20:53 UTC

41 points

16 comments 1 min read LW link

A few Superhuman examples of Superaligned Superintelligence from Google Bard (Thanksgiving 2023)

bionicles and bionalexhoward

23 Nov 2023 19:06 UTC

−9 points

1 comment 17 min read LW link

Prepsgiving, A Convergently Instrumental Human Practice

JenniferRM 23 Nov 2023 17:24 UTC

39 points

0 comments 7 min read LW link

AI #39: The Week of OpenAI

Zvi 23 Nov 2023 15:10 UTC

67 points

8 comments 28 min read LW link

(thezvi.wordpress.com)

3. Uploading

RogerDearnaley 23 Nov 2023 7:39 UTC

21 points

5 comments 8 min read LW link

2. AIs as Economic Agents

RogerDearnaley 23 Nov 2023 7:07 UTC

9 points

2 comments 6 min read LW link

Thomas Kwa’s research journal

Thomas Kwa and Adrià Garriga-alonso

23 Nov 2023 5:11 UTC

79 points

1 comment 6 min read LW link

Never Drop A Ball

Screwtape 23 Nov 2023 4:15 UTC

63 points

1 comment 6 min read LW link

Possible OpenAI’s Q* breakthrough and DeepMind’s AlphaGo-type systems plus LLMs

Burny 23 Nov 2023 3:16 UTC

37 points

25 comments 2 min read LW link

Boston Secular Solstice: Call for Singers and Musicans

jefftk 23 Nov 2023 2:40 UTC

16 points

2 comments 1 min read LW link

(www.jefftk.com)

My Mental Model of Infohazards

MadHatter 23 Nov 2023 2:37 UTC

8 points

34 comments 2 min read LW link 1 review