[Question] Seriously, what goes wrong with “reward the agent when it makes you smile”?

TurnTrout

Aug 11, 2022, 10:22 PM

87 points

43 comments · 2 min read · LW link

Encultured AI Pre-planning, Part 2: Providing a Service

Andrew_Critch and Nick Hay

Aug 11, 2022, 8:11 PM

33 points

4 comments · 3 min read · LW link

My summary of the alignment problem

Peter Hroššo

Aug 11, 2022, 7:42 PM

15 points

3 comments · 2 min read · LW link

(threadreaderapp.com)

Language models seem to be much better than humans at next-token prediction

Buck, Fabien Roger and LawrenceC

Aug 11, 2022, 5:45 PM

182 points

60 comments · 13 min read · LW link · 1 review

Introducing Pastcasting: A tool for forecasting practice

Sage Future

Aug 11, 2022, 5:38 PM

95 points

10 comments · 2 min read · LW link · 2 reviews

Pendulums, Policy-Level Decisionmaking, Saving State

CFAR!Duncan

Aug 11, 2022, 4:47 PM

30 points

3 comments · 8 min read · LW link

Covid 8/11/22: The End Is Never The End

Zvi

Aug 11, 2022, 4:20 PM

28 points

11 comments · 16 min read · LW link

(thezvi.wordpress.com)

Singapore—Small casual dinner in Chinatown #4

Joe Rocca

Aug 11, 2022, 12:30 PM

3 points

3 comments · 1 min read · LW link

Thoughts on the good regulator theorem

JonasMoss

Aug 11, 2022, 12:08 PM

12 points

0 comments · 4 min read · LW link

How and why to turn everything into audio

KatWoods and AmberDawn

Aug 11, 2022, 8:55 AM

55 points

20 comments · 5 min read · LW link

Shard Theory: An Overview

David Udell

Aug 11, 2022, 5:44 AM

166 points

34 comments · 10 min read · LW link

[Question] Do advancements in Decision Theory point towards moral absolutism?

Nathan1123

Aug 11, 2022, 12:59 AM

0 points

4 comments · 4 min read · LW link

The alignment problem from a deep learning perspective

Richard_Ngo

Aug 10, 2022, 10:46 PM

107 points

15 comments · 27 min read · LW link · 1 review

How much alignment data will we need in the long run?

Jacob_Hilton

Aug 10, 2022, 9:39 PM

37 points

15 comments · 4 min read · LW link

On Ego, Reincarnation, Consciousness and The Universe

qmaury

Aug 10, 2022, 8:21 PM

−3 points

6 comments · 5 min read · LW link

Formalizing Alignment

Marv K

Aug 10, 2022, 6:50 PM

4 points

0 comments · 2 min read · LW link

How Do We Align an AGI Without Getting Socially Engineered? (Hint: Box It)

Peter S. Park, NickyP and Stephen Fowler

Aug 10, 2022, 6:14 PM

28 points

30 comments · 11 min read · LW link

Emergent Abilities of Large Language Models [Linkpost]

aog

Aug 10, 2022, 6:02 PM

25 points

2 comments · 1 min read · LW link

(arxiv.org)

How To Go From Interpretability To Alignment: Just Retarget The Search

johnswentworth

Aug 10, 2022, 4:08 PM

209 points

34 comments · 3 min read · LW link · 1 review

Using GPT-3 to augment human intelligence

Henrik Karlsson

Aug 10, 2022, 3:54 PM

52 points

8 comments · 18 min read · LW link

(escapingflatland.substack.com)

ACX meetup [August]

sallatik

Aug 10, 2022, 9:54 AM

1 point

1 comment · 1 min read · LW link

Dissent Collusion

Screwtape

Aug 10, 2022, 2:43 AM

30 points

7 comments · 3 min read · LW link

The Medium Is The Bandage

party girl

Aug 10, 2022, 1:45 AM

11 points

0 comments · 10 min read · LW link

[Question] Why is increasing public awareness of AI safety not a priority?

FinalFormal2

Aug 10, 2022, 1:28 AM

−5 points

14 comments · 1 min read · LW link

Manifold x CSPI $25k Forecasting Tournament

David Chee

Aug 9, 2022, 9:13 PM

5 points

0 comments · 1 min read · LW link

(www.cspicenter.com)

Proposal: Consider not using distance-direction-dimension words in abstract discussions

moridinamael

Aug 9, 2022, 8:44 PM

46 points

18 comments · 5 min read · LW link

[Question] How would two superintelligent AIs interact, if they are unaligned with each other?

Nathan1123

Aug 9, 2022, 6:58 PM

4 points

6 comments · 1 min read · LW link

Disagreements about Alignment: Why, and how, we should try to solve them

ojorgensen

Aug 9, 2022, 6:49 PM

11 points

2 comments · 16 min read · LW link

Progress links and tweets, 2022-08-09

jasoncrawford

Aug 9, 2022, 5:35 PM

11 points

3 comments · 1 min read · LW link

(rootsofprogress.org)

[Question] Is it possible to find venture capital for AI research org with strong safety focus?

AnonResearch

Aug 9, 2022, 4:12 PM

6 points

1 comment · 1 min read · LW link

[Question] Many Gods refutation and Instrumental Goals. (Proper one)

aditya malik

Aug 9, 2022, 11:59 AM

0 points

15 comments · 1 min read · LW link

Content generation. Where do we draw the line?

Q Home

Aug 9, 2022, 10:51 AM

6 points

7 comments · 2 min read · LW link

[Question] What are some alternatives to Shapley values which drop additivity?

eapi

Aug 9, 2022, 9:16 AM

11 points

6 comments · 1 min read · LW link

(math.stackexchange.com)

Radio Bostrom: Audio narrations of papers by Nick Bostrom

PeterH

Aug 9, 2022, 8:56 AM

12 points

0 comments · 2 min read · LW link

(forum.effectivealtruism.org)

Team Shard Status Report

David Udell

Aug 9, 2022, 5:33 AM

38 points

8 comments · 3 min read · LW link

Announcing: Mechanism Design for AI Safety—Reading Group

Rubi J. Hudson

Aug 9, 2022, 4:21 AM

18 points

3 comments · 4 min read · LW link

[Question] What are some Works that might be useful but are difficult, so forgotten?

TekhneMakre

Aug 9, 2022, 2:22 AM

10 points

5 comments · 1 min read · LW link

Project proposal: Testing the IBP definition of agent

Jeremy Gillen, Thomas Larsen and JamesH

Aug 9, 2022, 1:09 AM

21 points

4 comments · 2 min read · LW link

How (not) to choose a research project

Garrett Baker, CatGoddess and Johannes C. Mayer

Aug 9, 2022, 12:26 AM

79 points

11 comments · 7 min read · LW link

[Question] Are ya winning, son?

Nathan1123

Aug 9, 2022, 12:06 AM

14 points

13 comments · 2 min read · LW link

General alignment properties

TurnTrout

Aug 8, 2022, 11:40 PM

51 points

2 comments · 1 min read · LW link

Experiment: Be my math tutor?

sudo

Aug 8, 2022, 10:50 PM

12 points

5 comments · 1 min read · LW link

Encultured AI, Part 1 Appendix: Relevant Research Examples

Andrew_Critch and Nick Hay

Aug 8, 2022, 10:44 PM

11 points

1 comment · 7 min read · LW link

Encultured AI Pre-planning, Part 1: Enabling New Benchmarks

Andrew_Critch and Nick Hay

Aug 8, 2022, 10:44 PM

63 points

2 comments · 6 min read · LW link

Broad Basins and Data Compression

Jeremy Gillen, Stephen Fowler and Thomas Larsen

Aug 8, 2022, 8:33 PM

33 points

6 comments · 7 min read · LW link

Interpretability/Tool-ness/Alignment/Corrigibility are not Composable

johnswentworth

Aug 8, 2022, 6:05 PM

143 points

13 comments · 3 min read · LW link

LW Meetup @ DEFCON (Las Vegas) − 5-7pm Thu. Aug. 11 at Forum Food Court (Caesars)

jchan

Aug 8, 2022, 2:57 PM

6 points

0 comments · 1 min read · LW link

A sufficiently paranoid paperclip maximizer

RomanS

Aug 8, 2022, 11:17 AM

18 points

10 comments · 2 min read · LW link

[Question] Instrumental Goals and Many Gods Refutation

aditya malik

Aug 8, 2022, 10:46 AM

−10 points

4 comments · 1 min read · LW link

Area under the curve, Eat Dirt, Broccoli Errors, Copernicus & Chaos

CFAR!Duncan

Aug 8, 2022, 8:17 AM

41 points

0 comments · 7 min read · LW link
