Corrigibility or DWIM is an attractive primary goal for AGI

Seth Herd 25 Nov 2023 19:37 UTC

16 points

4 comments 1 min read LW link

On “slack” in training (Section 1.5 of “Scheming AIs”)

Joe Carlsmith 25 Nov 2023 17:51 UTC

1 point

0 comments 5 min read LW link

Announcing New Beginner-friendly Book on AI Safety and Risk

Darren McKee 25 Nov 2023 15:57 UTC

64 points

2 comments 1 min read LW link

Fertility as Metascience

Maxwell Tabarrok 25 Nov 2023 15:42 UTC

20 points

1 comment 3 min read LW link

(maximumprogress.substack.com)

Reaction to “Empowerment is (almost) All We Need” : an open-ended alternative

Ryo 25 Nov 2023 15:35 UTC

9 points

3 comments 5 min read LW link

How Microsoft’s ruthless employee evaluation system annihilated team collaboration.

positivesum 25 Nov 2023 13:25 UTC

3 points

2 comments 1 min read LW link

(tryingtruly.substack.com)

What are the results of more parental supervision and less outdoor play?

juliawise 25 Nov 2023 12:52 UTC

226 points

31 comments 5 min read LW link

A simple treacherous turn demonstration

nikola 25 Nov 2023 4:51 UTC

22 points

5 comments 3 min read LW link

The two paragraph argument for AI risk

CronoDAS 25 Nov 2023 2:01 UTC

19 points

8 comments 1 min read LW link

Goodhart’s Law Example: Training Verifiers to Solve Math Word Problems

Chris_Leong 25 Nov 2023 0:53 UTC

27 points

2 comments 1 min read LW link

(arxiv.org)

Some thoughts on CBDC

PixelatedPenguin 25 Nov 2023 0:32 UTC

−1 points

1 comment 1 min read LW link

Testing for consequence-blindness in LLMs using the HI-ADS unit test.

David Scott Krueger (formerly: capybaralet) 24 Nov 2023 23:35 UTC

25 points

2 comments 2 min read LW link

Epoch is hiring an ML Distributed Systems Senior Researcher

merilalama and Jaime Sevilla Molina

24 Nov 2023 22:33 UTC

2 points

0 comments 4 min read LW link

(careers.rethinkpriorities.org)

Article Discussion And Free Pizza—St Paul

25Hour 24 Nov 2023 21:02 UTC

1 point

0 comments 1 min read LW link

Why focus on schemers in particular (Sections 1.3 and 1.4 of “Scheming AIs”)

Joe Carlsmith 24 Nov 2023 19:18 UTC

8 points

0 comments 22 min read LW link

Surviving and Shaping Long-Term Competitions: Lessons from Net Assessment

Gentzel and ihavenoahidea

24 Nov 2023 18:18 UTC

5 points

0 comments 13 min read LW link

Ability to solve long-horizon tasks correlates with wanting things in the behaviorist sense

So8res 24 Nov 2023 17:37 UTC

195 points

84 comments 5 min read LW link 1 review

The Limitations of GPT-4

p.b. 24 Nov 2023 15:30 UTC

27 points

12 comments 4 min read LW link

Progress links digest, 2023-11-24: Bottlenecks of aging, Starship launches, and much more

jasoncrawford 24 Nov 2023 15:25 UTC

40 points

1 comment 14 min read LW link

(rootsofprogress.org)

[Question] What’s the evidence that LLMs will scale up efficiently beyond GPT4? i.e. couldn’t GPT5, etc., be very inefficient?

M. Y. Zuo 24 Nov 2023 15:22 UTC

9 points

6 comments 1 min read LW link

Sapience, understanding, and “AGI”

Seth Herd 24 Nov 2023 15:13 UTC

15 points

3 comments 6 min read LW link

Insulate your ideas

Logan Kieller 24 Nov 2023 14:08 UTC

18 points

5 comments 2 min read LW link

(logankieller.substack.com)

Bordeaux, Gironde, France – irregular ACX Meetup 2023-12-09

vi21maobk9vp 24 Nov 2023 11:17 UTC

5 points

1 comment 1 min read LW link

[Question] A Question For People Who Believe In God

yanni kyriacos 24 Nov 2023 5:22 UTC

3 points

38 comments 1 min read LW link

[Question] First and Last Questions for GPT-5*

Mitchell_Porter 24 Nov 2023 5:03 UTC

15 points

5 comments 1 min read LW link

4. A Moral Case for Evolved-Sapience-Chauvinism

RogerDearnaley 24 Nov 2023 4:56 UTC

10 points

0 comments 4 min read LW link

Detecting What’s Been Seen

jefftk 24 Nov 2023 3:30 UTC

23 points

0 comments 2 min read LW link

(www.jefftk.com)

[Question] Help to find a blog I don’t remember the name of

JavierCC 23 Nov 2023 22:49 UTC

3 points

2 comments 1 min read LW link

[Question] What did you change your mind about in the last year?

mike_hawke 23 Nov 2023 20:53 UTC

41 points

16 comments 1 min read LW link

A few Superhuman examples of Superaligned Superintelligence from Google Bard (Thanksgiving 2023)

bionicles and bionalexhoward

23 Nov 2023 19:06 UTC

−9 points

1 comment 17 min read LW link

Prepsgiving, A Convergently Instrumental Human Practice

JenniferRM 23 Nov 2023 17:24 UTC

39 points

0 comments 7 min read LW link

AI #39: The Week of OpenAI

Zvi 23 Nov 2023 15:10 UTC

67 points

8 comments 28 min read LW link

(thezvi.wordpress.com)

3. Uploading

RogerDearnaley 23 Nov 2023 7:39 UTC

21 points

5 comments 8 min read LW link

2. AIs as Economic Agents

RogerDearnaley 23 Nov 2023 7:07 UTC

9 points

2 comments 6 min read LW link

Thomas Kwa’s research journal

Thomas Kwa and Adrià Garriga-alonso

23 Nov 2023 5:11 UTC

79 points

1 comment 6 min read LW link

Never Drop A Ball

Screwtape 23 Nov 2023 4:15 UTC

62 points

1 comment 6 min read LW link

Possible OpenAI’s Q* breakthrough and DeepMind’s AlphaGo-type systems plus LLMs

Burny 23 Nov 2023 3:16 UTC

37 points

25 comments 2 min read LW link

Boston Secular Solstice: Call for Singers and Musicans

jefftk 23 Nov 2023 2:40 UTC

16 points

2 comments 1 min read LW link

(www.jefftk.com)

My Mental Model of Infohazards

MadHatter 23 Nov 2023 2:37 UTC

8 points

34 comments 2 min read LW link 1 review

Saturating the Difficulty Levels of Alignment

Johannes C. Mayer 23 Nov 2023 0:39 UTC

6 points

0 comments 2 min read LW link

Sacramento LW/ACX Meetup

mcint 22 Nov 2023 23:52 UTC

1 point

0 comments 1 min read LW link

Sam Altman’s ouster at OpenAI was precipitated by letter to board about AI breakthrough—Reuters

Jonathan Yan 22 Nov 2023 23:17 UTC

18 points

11 comments 1 min read LW link

(www.reuters.com)

Foresight Institute: 2023 Progress & 2024 Plans for funding beneficial technology development

Allison Duettmann 22 Nov 2023 22:09 UTC

24 points

1 comment 6 min read LW link

AISC project: TinyEvals

Jett Janiak 22 Nov 2023 20:47 UTC

22 points

0 comments 4 min read LW link

The proposal to add a “Last Judge” to an AI, does not remove the urgency, of making progress on the “what alignment target should be aimed at?” question.

ThomasCederborg 22 Nov 2023 18:59 UTC

1 point

0 comments 18 min read LW link

Neither Copernicus, Galileo, nor Kepler had proof

Meow P 22 Nov 2023 18:41 UTC

4 points

10 comments 1 min read LW link

(www.cricetuscricetus.co.uk)

OpenAI: The Battle of the Board

Zvi 22 Nov 2023 17:30 UTC

281 points

83 comments 11 min read LW link

(thezvi.wordpress.com)

Altman returns as OpenAI CEO with new board

Seth Herd 22 Nov 2023 16:04 UTC

6 points

3 comments 1 min read LW link

A taxonomy of non-schemer models (Section 1.2 of “Scheming AIs”)

Joe Carlsmith 22 Nov 2023 15:24 UTC

13 points

0 comments 13 min read LW link

AI debate: test yourself against chess ‘AIs’

Richard Willis 22 Nov 2023 14:58 UTC

26 points

35 comments 4 min read LW link