2022 Less Wrong Census/Survey: Request for Comments

Screwtape

25 Jan 2023 20:57 UTC

5 points

29 comments, 1 min read, LW link

Next steps after AGISF at UMich

JakubK

25 Jan 2023 20:57 UTC

10 points

0 comments, 5 min read, LW link

(docs.google.com)

AGI will have learnt utility functions

beren

25 Jan 2023 19:42 UTC

36 points

3 comments, 13 min read, LW link

[RFC] Possible ways to expand on “Discovering Latent Knowledge in Language Models Without Supervision”.

gekaklam, Walter Laurito, Kaarel and Kay Kozaronek

25 Jan 2023 19:03 UTC

48 points

6 comments, 12 min read, LW link

Spreading messages to help with the most important century

HoldenKarnofsky

25 Jan 2023 18:20 UTC

75 points

4 comments, 18 min read, LW link

(www.cold-takes.com)

My Model Of EA Burnout

LoganStrohl

25 Jan 2023 17:52 UTC

255 points

50 comments, 5 min read, LW link, 1 review

Thoughts on the impact of RLHF research

paulfchristiano

25 Jan 2023 17:23 UTC

250 points

102 comments, 9 min read, LW link

[Question] Could AI be used to engineer a sociopolitical situation where humans can solve the problems surrounding AGI?

hollowing

25 Jan 2023 17:17 UTC

1 point

6 comments, 1 min read, LW link

Progress links and tweets, 2023-01-25

jasoncrawford

25 Jan 2023 16:12 UTC

8 points

0 comments, 1 min read, LW link

(rootsofprogress.org)

Visualisation of Probability Mass

brook

25 Jan 2023 15:09 UTC

7 points

0 comments, 1 min read, LW link

When Did EA Start?

jefftk

25 Jan 2023 14:30 UTC

37 points

2 comments, 2 min read, LW link

(www.jefftk.com)

Some Thoughts on AI Art

abramdemski

25 Jan 2023 14:18 UTC

74 points

20 comments, 7 min read, LW link

Quick thoughts on “scalable oversight” / “super-human feedback” research

David Scott Krueger (formerly: capybaralet)

25 Jan 2023 12:55 UTC

27 points

9 comments, 2 min read, LW link

Sapir-Whorf for Rationalists

Duncan Sabien (Deactivated)

25 Jan 2023 7:58 UTC

154 points

49 comments, 19 min read, LW link

ChatGPT vs the 2-4-6 Task

cwillu

25 Jan 2023 6:59 UTC

20 points

4 comments, 3 min read, LW link

Pessimistic Shard Theory

Garrett Baker

25 Jan 2023 0:59 UTC

72 points

13 comments, 3 min read, LW link

Thatcher’s Axiom

Edward P. Könings

24 Jan 2023 22:35 UTC

10 points

22 comments, 4 min read, LW link

[Question] Some questions about free will compatibilism

Asking Questions

24 Jan 2023 21:54 UTC

3 points

21 comments, 6 min read, LW link

Alexander and Yudkowsky on AGI goals

Scott Alexander and Eliezer Yudkowsky

24 Jan 2023 21:09 UTC

177 points

53 comments, 26 min read, LW link, 1 review

[Question] Is _The Age of AI: And Our Human Future_ worth reading

jmh

24 Jan 2023 21:05 UTC

4 points

0 comments, 1 min read, LW link

Inverse Scaling Prize: Second Round Winners

Ian McKenzie, Sam Bowman and Ethan Perez

24 Jan 2023 20:12 UTC

58 points

17 comments, 15 min read, LW link

ChatGPT intimates a tantalizing future; its core LLM is organized on multiple levels; and it has broken the idea of thinking.

Bill Benzon

24 Jan 2023 19:05 UTC

5 points

0 comments, 5 min read, LW link

How-to Transformer Mechanistic Interpretability—in 50 lines of code or less!

StefanHex

24 Jan 2023 18:45 UTC

47 points

5 comments, 13 min read, LW link

The Cabinet of Wikipedian Curiosities

Sam Enright

24 Jan 2023 18:22 UTC

36 points

5 comments, 6 min read, LW link

(samenright.com)

Explanatory Parsimony, Explanatory Superfluousness and Uselessness of Newton’s First Law

Jimdrix_Hendri

24 Jan 2023 17:21 UTC

−2 points

7 comments, 2 min read, LW link

Guesstimate: Why and how to use it

brook and chanamessinger

24 Jan 2023 16:24 UTC

8 points

0 comments, 3 min read, LW link

(forum.effectivealtruism.org)

GWWC Pledge History

jefftk

24 Jan 2023 15:50 UTC

15 points

0 comments, 3 min read, LW link

(www.jefftk.com)

Gradient hacking is extremely difficult

beren

24 Jan 2023 15:45 UTC

162 points

22 comments, 5 min read, LW link

[Question] What sci-fi books are most relevant to a future with transformative AI?

sid

24 Jan 2023 15:30 UTC

2 points

9 comments, 1 min read, LW link

Grant-making in EA should consider peer-reviewing grant applications along the public-sector model

Ben Smith

24 Jan 2023 15:01 UTC

0 points

3 comments, 1 min read, LW link

“Endgame safety” for AGI

Steven Byrnes

24 Jan 2023 14:15 UTC

85 points

10 comments, 6 min read, LW link

Thoughts on hardware / compute requirements for AGI

Steven Byrnes

24 Jan 2023 14:03 UTC

59 points

30 comments, 24 min read, LW link

Parameter Scaling Comes for RL, Maybe

1a3orn

24 Jan 2023 13:55 UTC

100 points

3 comments, 14 min read, LW link

How to find cool things in a new place

Sam F. Brown

24 Jan 2023 11:20 UTC

12 points

0 comments, 1 min read, LW link

[Crosspost] ACX 2022 Prediction Contest Results

Scott Alexander, Eric Neyman and Sam Marks

24 Jan 2023 6:56 UTC

46 points

6 comments, 8 min read, LW link

The Human-AI Reflective Equilibrium

Allison Duettmann

24 Jan 2023 1:32 UTC

22 points

1 comment, 24 min read, LW link

“Status” can be corrosive; here’s how I handle it

Akash

24 Jan 2023 1:25 UTC

71 points

8 comments, 6 min read, LW link

[Question] What area of the digital domain seems safe from AI in the next 5-10 years?

Adrien Chauvet

24 Jan 2023 1:16 UTC

11 points

14 comments, 1 min read, LW link

Some of my disagreements with List of Lethalities

TurnTrout

24 Jan 2023 0:25 UTC

70 points

7 comments, 10 min read, LW link

Rounding Someone Off

David Udell

24 Jan 2023 0:03 UTC

25 points

0 comments, 5 min read, LW link

Life Has a Cruel Symmetry

philh

23 Jan 2023 23:40 UTC

21 points

5 comments, 11 min read, LW link

(reasonableapproximation.net)

Highlights and Prizes from the 2021 Review Phase

Raemon

23 Jan 2023 21:41 UTC

38 points

14 comments, 21 min read, LW link

[Question] AI safety milestones?

Zach Stein-Perlman

23 Jan 2023 21:00 UTC

7 points

5 comments, 1 min read, LW link

[Question] A post-quantum theory of classical gravity?

Logan Zoellner

23 Jan 2023 20:39 UTC

13 points

5 comments, 1 min read, LW link

Meals For Unclear Dietary Restrictions

jefftk

23 Jan 2023 20:00 UTC

17 points

3 comments, 2 min read, LW link

(www.jefftk.com)

It’s ok

stratospher

23 Jan 2023 18:11 UTC

1 point

0 comments, 2 min read, LW link

Experimenting with beta.character.ai

svemirski

23 Jan 2023 17:31 UTC

−3 points

5 comments, 1 min read, LW link

This week in fashion

Jan

23 Jan 2023 17:23 UTC

29 points

7 comments, 7 min read, LW link

(universalprior.substack.com)

Movie Review: Megan

Zvi

23 Jan 2023 12:50 UTC

60 points

19 comments, 24 min read, LW link

(thezvi.wordpress.com)

[Question] Has private AGI research made independent safety research ineffective already? What should we do about this?

Roman Leventov

23 Jan 2023 7:36 UTC

43 points

5 comments, 5 min read, LW link