Investigating Alternative Futures: Human and Superintelligence Interaction Scenarios

Hiroshi Yamakawa · 3 Jan 2024 23:46 UTC

1 point

0 comments · 17 min read · LW link

“Attitudes Toward Artificial General Intelligence: Results from American Adults 2021 and 2023”—call for reviewers (Seeds of Science)

rogersbacon · 3 Jan 2024 20:11 UTC

4 points

0 comments · 1 min read · LW link

What’s up with LLMs representing XORs of arbitrary features?

Sam Marks · 3 Jan 2024 19:44 UTC

157 points

61 comments · 16 min read · LW link

Spirit Airlines Merger Play

sapphire · 3 Jan 2024 19:25 UTC

5 points

12 comments · 1 min read · LW link

$300 for the best sci-fi prompt: the results

RomanS · 3 Jan 2024 19:10 UTC

16 points

19 comments · 7 min read · LW link

Agent membranes/boundaries and formalizing “safety”

Chipmonk · 3 Jan 2024 17:55 UTC

26 points

46 comments · 3 min read · LW link

Safety First: safety before full alignment. The deontic sufficiency hypothesis.

Chipmonk · 3 Jan 2024 17:55 UTC

48 points

3 comments · 3 min read · LW link

Practically A Book Review: Appendix to “Nonlinear’s Evidence: Debunking False and Misleading Claims” (ThingOfThings)

tailcalled · 3 Jan 2024 17:07 UTC

111 points

25 comments · 2 min read · LW link

(thingofthings.substack.com)

Trivial Mathematics as a Path Forward

ACrackedPot · 3 Jan 2024 16:41 UTC

−4 points

2 comments · 2 min read · LW link

Copyright Confrontation #1

Zvi · 3 Jan 2024 15:50 UTC

34 points

7 comments · 18 min read · LW link

(thezvi.wordpress.com)

[Question] Theoretically, could we balance the budget painlessly?

Logan Zoellner · 3 Jan 2024 14:46 UTC

4 points

12 comments · 1 min read · LW link

Johannes’ Biography

Johannes C. Mayer · 3 Jan 2024 13:27 UTC

24 points

0 comments · 10 min read · LW link

What Helped Me—Kale, Blood, CPAP, X-tiamine, Methylphenidate

Johannes C. Mayer · 3 Jan 2024 13:22 UTC

35 points

12 comments · 2 min read · LW link

[Question] Does LessWrong make a difference when it comes to AI alignment?

PhilosophicalSoul · 3 Jan 2024 12:21 UTC

18 points

13 comments · 1 min read · LW link

[Question] Terminology: <something>-ware for ML?

Oliver Sourbut · 3 Jan 2024 11:42 UTC

17 points

27 comments · 1 min read · LW link

Trading off Lives

jefftk · 3 Jan 2024 3:40 UTC

53 points

12 comments · 2 min read · LW link

(www.jefftk.com)

MonoPoly Restricted Trust

ymeskhout · 2 Jan 2024 23:02 UTC

42 points

37 comments · 9 min read · LW link

Agent membranes and causal distance

Chipmonk · 2 Jan 2024 22:43 UTC

20 points

3 comments · 3 min read · LW link

Focusing on Mal-Alignment

John Fisher · 2 Jan 2024 19:51 UTC

1 point

0 comments · 1 min read · LW link

Gentleness and the artificial Other

Joe Carlsmith · 2 Jan 2024 18:21 UTC

292 points

33 comments · 11 min read · LW link

Otherness and control in the age of AGI

Joe Carlsmith · 2 Jan 2024 18:15 UTC

43 points

0 comments · 7 min read · LW link

Apologizing is a Core Rationalist Skill

johnswentworth · 2 Jan 2024 17:47 UTC

153 points

42 comments · 5 min read · LW link

Cortés, AI Risk, and the Dynamics of Competing Conquerors

James_Miller · 2 Jan 2024 16:37 UTC

14 points

2 comments · 3 min read · LW link

OpenAI’s Preparedness Framework: Praise & Recommendations

Akash · 2 Jan 2024 16:20 UTC

66 points

1 comment · 7 min read · LW link

Dating Roundup #2: If At First You Don’t Succeed

Zvi · 2 Jan 2024 16:00 UTC

54 points

29 comments · 47 min read · LW link

(thezvi.wordpress.com)

Looking for Reading Recommendations: Content Moderation, Power & Censorship

Joerg Weiss · 2 Jan 2024 11:37 UTC

2 points

7 comments · 1 min read · LW link

AI Is Not Software

Davidmanheim · 2 Jan 2024 7:58 UTC

56 points

29 comments · 5 min read · LW link

Are Metaculus AI Timelines Inconsistent?

Chris_Leong · 2 Jan 2024 6:47 UTC

16 points

7 comments · 2 min read · LW link

Boston Solstice 2023 Retrospective

jefftk · 2 Jan 2024 3:10 UTC

33 points

0 comments · 6 min read · LW link

(www.jefftk.com)

Steering Llama-2 with contrastive activation additions

Nina Panickssery, Wuschel Schulz, NickGabs, Meg, evhub and TurnTrout

2 Jan 2024 0:47 UTC

124 points

29 comments · 8 min read · LW link

(arxiv.org)

Twin Cities ACX Meetup—January 2024

Timothy M. · 1 Jan 2024 21:13 UTC

1 point

2 comments · 1 min read · LW link

San Francisco ACX Meetup “First Saturday”

guenael · 1 Jan 2024 20:58 UTC

1 point

1 comment · 1 min read · LW link

Mech Interp Challenge: January—Deciphering the Caesar Cipher Model

CallumMcDougall · 1 Jan 2024 18:03 UTC

17 points

0 comments · 3 min read · LW link

Aldix and the Book of Life

ville · 1 Jan 2024 17:23 UTC

1 point

0 comments · 4 min read · LW link

(medium.com)

Metaculus Hosts ACX 2024 Prediction Contest

ChristianWilliams · 1 Jan 2024 16:38 UTC

4 points

0 comments · 1 min read · LW link

(www.metaculus.com)

The Act Itself: Exceptionless Moral Norms

SebastianG · 1 Jan 2024 16:06 UTC

5 points

11 comments · 6 min read · LW link

Deception Chess

Chris Land · 1 Jan 2024 15:40 UTC

7 points

2 comments · 4 min read · LW link

Stop talking about p(doom)

Isaac King · 1 Jan 2024 10:57 UTC

39 points

22 comments · 3 min read · LW link

[Question] What should a non-genius do in the face of rapid progress in GAI to ensure a decent life?

kaler · 1 Jan 2024 8:22 UTC

11 points

16 comments · 1 min read · LW link

A hermeneutic net for agency

TsviBT · 1 Jan 2024 8:06 UTC

58 points

4 comments · 30 min read · LW link

Research Jan/Feb 2024

Stephen Fowler · 1 Jan 2024 6:02 UTC

9 points

0 comments · 2 min read · LW link

2023 in AI predictions

jessicata · 1 Jan 2024 5:23 UTC

107 points

35 comments · 5 min read · LW link

Rhythm Stage Setup Components

jefftk · 1 Jan 2024 3:10 UTC

10 points

4 comments · 2 min read · LW link

(www.jefftk.com)

Bayesian updating in real life is mostly about understanding your hypotheses

Max H · 1 Jan 2024 0:10 UTC

63 points

4 comments · 11 min read · LW link

Dark Art: Inception

Abu Ibrahim · 31 Dec 2023 21:09 UTC

10 points

0 comments · 3 min read · LW link

A case for AI alignment being difficult

jessicata · 31 Dec 2023 19:55 UTC

105 points

58 comments · 15 min read · LW link · 1 review

(unstableontology.com)

The Roots of Progress 2023 in review

jasoncrawford · 31 Dec 2023 18:16 UTC

22 points

0 comments · 11 min read · LW link

(rootsofprogress.org)

Extended Navel-Gazing On My 2023 Donations

jenn · 31 Dec 2023 18:10 UTC

8 points

0 comments · 1 min read · LW link

(jenn.site)

aisafety.info, the Table of Content

Charbel-Raphaël · 31 Dec 2023 13:57 UTC

23 points

1 comment · 11 min read · LW link

AIOS

samhealy · 31 Dec 2023 13:23 UTC

−3 points

5 comments · 6 min read · LW link