Agents vs. Predictors: Concrete differentiating factors

evhub 24 Feb 2023 23:50 UTC

37 points

3 comments 4 min read LW link

Christiano (ARC) and GA (Conjecture) Discuss Alignment Cruxes

Andrea_Miotti, paulfchristiano, Gabriel Alfour and OliviaJ

24 Feb 2023 23:03 UTC

61 points

7 comments 47 min read LW link

Retrospective on the 2022 Conjecture AI Discussions

Andrea_Miotti 24 Feb 2023 22:41 UTC

90 points

5 comments 2 min read LW link

How popular is ChatGPT? Part 1: more popular than Taylor Swift

Harlan 24 Feb 2023 22:30 UTC

56 points

0 comments 2 min read LW link

(aiimpacts.org)

Are you stably aligned?

Seth Herd 24 Feb 2023 22:08 UTC

13 points

0 comments 2 min read LW link

Puzzle Cycles

Screwtape 24 Feb 2023 21:35 UTC

8 points

2 comments 4 min read LW link

Sam Altman: “Planning for AGI and beyond”

LawrenceC 24 Feb 2023 20:28 UTC

104 points

54 comments 6 min read LW link

(openai.com)

A Proposed Test to Determine the Extent to Which Large Language Models Understand the Real World

Bruce G 24 Feb 2023 20:20 UTC

4 points

7 comments 8 min read LW link

Meta “open sources” LMs competitive with Chinchilla, PaLM, and code-davinci-002 (Paper)

LawrenceC 24 Feb 2023 19:57 UTC

38 points

19 comments 1 min read LW link

(research.facebook.com)

Relationship Orientations

DaystarEld 24 Feb 2023 19:43 UTC

37 points

1 comment 3 min read LW link

(daystareld.com)

The alien simulation meme doesn’t make sense

FTPickle 24 Feb 2023 19:27 UTC

4 points

1 comment 1 min read LW link

Exit Duty Generator by Matti Häyry

Oldphan 24 Feb 2023 18:35 UTC

−2 points

0 comments 1 min read LW link

(www.cambridge.org)

2023 Stanford Existential Risks Conference

elizabethcooper 24 Feb 2023 18:35 UTC

7 points

0 comments 1 min read LW link

How major governments can help with the most important century

HoldenKarnofsky 24 Feb 2023 18:20 UTC

29 points

0 comments 4 min read LW link

(www.cold-takes.com)

Consent Isn’t Always Enough

jefftk 24 Feb 2023 15:40 UTC

57 points

16 comments 3 min read LW link

(www.jefftk.com)

[Question] Training for corrigability: obvious problems?

Ben Amitay 24 Feb 2023 14:02 UTC

4 points

6 comments 1 min read LW link

Death and Desperation

Ustice 24 Feb 2023 12:43 UTC

1 point

3 comments 1 min read LW link

[Question] Are there rationality techniques similar to staring at the wall for 4 hours?

trevor 24 Feb 2023 11:48 UTC

31 points

8 comments 1 min read LW link

The fast takeoff motte/bailey

lc 24 Feb 2023 7:11 UTC

0 points

7 comments 1 min read LW link

AGI systems & humans will both need to solve the alignment problem

Jeffrey Ladish 24 Feb 2023 3:29 UTC

59 points

14 comments 4 min read LW link

A poor but certain attempt to philosophically undermine the orthogonality of intelligence and aims

Jay95 24 Feb 2023 3:03 UTC

−2 points

1 comment 1 min read LW link

I wanna Gandalf here

Igor Timofeev 24 Feb 2023 1:22 UTC

5 points

4 comments 1 min read LW link

[Link] A community alert about Ziz

DanielFilan 24 Feb 2023 0:06 UTC

169 points

131 comments 2 min read LW link 3 reviews

(medium.com)

Teleosemantics!

abramdemski 23 Feb 2023 23:26 UTC

82 points

27 comments 6 min read LW link 1 review

AI that shouldn’t work, yet kind of does

Donald Hobson 23 Feb 2023 23:18 UTC

27 points

8 comments 3 min read LW link

The AGI Optimist’s Dilemma

kaputmi 23 Feb 2023 20:20 UTC

−6 points

1 comment 1 min read LW link

Searching for a model’s concepts by their shape – a theoretical framework

Kaarel, gekaklam, Walter Laurito, Kay Kozaronek, AlexMennen and June Ku

23 Feb 2023 20:14 UTC

51 points

0 comments 19 min read LW link

Why I’m Skeptical of De-Extinction

Niko_McCarty 23 Feb 2023 19:42 UTC

16 points

1 comment 11 min read LW link

(cell.substack.com)

[Question] What causes randomness?

lotsofquestions 23 Feb 2023 18:50 UTC

1 point

12 comments 1 min read LW link

Somerville Roads Getting More Dangerous?

jefftk 23 Feb 2023 18:20 UTC

15 points

1 comment 1 min read LW link

(www.jefftk.com)

EIS XII: Summary

scasper 23 Feb 2023 17:45 UTC

18 points

0 comments 6 min read LW link

How to survive in an AGI cataclysm

RomanS 23 Feb 2023 14:34 UTC

−4 points

3 comments 4 min read LW link

Covid 2/23/23: Your Best Possible Situation

Zvi 23 Feb 2023 13:10 UTC

92 points

9 comments 5 min read LW link

(thezvi.wordpress.com)

Full Transcript: Eliezer Yudkowsky on the Bankless podcast

remember and Andrea_Miotti

23 Feb 2023 12:34 UTC

138 points

89 comments 75 min read LW link

Automated Sandwiching & Quantifying Human-LLM Cooperation: ScaleOversight hackathon results

Esben Kran, Fazl, Sabrina Zaki, gabrielrecc and rz2383

23 Feb 2023 10:48 UTC

8 points

0 comments 6 min read LW link

[Question] How to estimate a pre-aligned value for a common discussion ground?

EL_File4138 23 Feb 2023 10:38 UTC

−4 points

12 comments 1 min read LW link

Interpersonal alignment intuitions

TekhneMakre 23 Feb 2023 9:37 UTC

29 points

18 comments 2 min read LW link

Big Mac Subsidy?

jefftk 23 Feb 2023 4:00 UTC

157 points

25 comments 2 min read LW link

(www.jefftk.com)

[Question] What moral systems (e.g utilitarianism) are common among LessWrong users?

hollowing 23 Feb 2023 3:33 UTC

1 point

9 comments 1 min read LW link

AGI is likely to be cautious

PonPonPon 23 Feb 2023 1:16 UTC

9 points

14 comments 3 min read LW link

Short Notes on Research Process

Shoshannah Tekofsky 22 Feb 2023 23:41 UTC

21 points

0 comments 2 min read LW link

Video/animation: Neel Nanda explains what mechanistic interpretability is

DanielFilan 22 Feb 2023 22:42 UTC

24 points

7 comments 1 min read LW link

(youtu.be)

A Telepathic Exam about AI and Consequentialism

alkexr 22 Feb 2023 21:00 UTC

4 points

4 comments 4 min read LW link

[Question] Injecting noise to GPT to get multiple answers

bipolo 22 Feb 2023 20:02 UTC

1 point

1 comment 1 min read LW link

EIS XI: Moving Forward

scasper 22 Feb 2023 19:05 UTC

19 points

2 comments 9 min read LW link

Building and Entertaining Couples

Jacob Falkovich 22 Feb 2023 19:02 UTC

85 points

11 comments 4 min read LW link

Can submarines swim?

jasoncrawford 22 Feb 2023 18:48 UTC

18 points

14 comments 13 min read LW link

(rootsofprogress.org)

Is there a ML agent that abandons it’s utility function out-of-distribution without losing capabilities?

Christopher King 22 Feb 2023 16:49 UTC

1 point

7 comments 1 min read LW link

The male AI alignment solution

TekhneMakre 22 Feb 2023 16:34 UTC

−25 points

24 comments 1 min read LW link

Progress links and tweets, 2023-02-22

jasoncrawford 22 Feb 2023 16:23 UTC

13 points

0 comments 1 min read LW link

(rootsofprogress.org)