A few thoughts on my self-study for alignment research

Thomas Kehrenberg · Dec 30, 2022, 10:05 PM
6 points
0 comments · 2 min read · LW link

Christmas Microscopy

jefftk · Dec 30, 2022, 9:10 PM
27 points
0 comments · 1 min read · LW link
(www.jefftk.com)

What “upside” of AI?

False Name · Dec 30, 2022, 8:58 PM
0 points
5 comments · 4 min read · LW link

Evidence on recursive self-improvement from current ML

beren · Dec 30, 2022, 8:53 PM
31 points
12 comments · 6 min read · LW link

[Question] Is ChatGPT TAI?

Amal · Dec 30, 2022, 7:44 PM
14 points
5 comments · 1 min read · LW link

My thoughts on OpenAI’s alignment plan

Orpheus16 · Dec 30, 2022, 7:33 PM
55 points
3 comments · 20 min read · LW link

Beyond Rewards and Values: A Non-dualistic Approach to Universal Intelligence

Akira Pyinya · Dec 30, 2022, 7:05 PM
10 points
4 comments · 14 min read · LW link

10 Years of LessWrong

SebastianG · Dec 30, 2022, 5:15 PM
73 points
2 comments · 4 min read · LW link

Chatbots as a Publication Format

derek shiller · Dec 30, 2022, 2:11 PM
6 points
6 comments · 4 min read · LW link

Human sexuality as an interesting case study of alignment

beren · Dec 30, 2022, 1:37 PM
39 points
26 comments · 3 min read · LW link

The Twitter Files: Covid Edition

Zvi · Dec 30, 2022, 1:30 PM
32 points
2 comments · 10 min read · LW link
(thezvi.wordpress.com)

Worldly Positions archive, briefly with private drafts

KatjaGrace · Dec 30, 2022, 12:20 PM
11 points
0 comments · 1 min read · LW link
(worldspiritsockpuppet.com)

Models Don’t “Get Reward”

Sam Ringer · Dec 30, 2022, 10:37 AM
316 points
62 comments · 5 min read · LW link · 1 review

The hyperfinite timeline

Alok Singh · Dec 30, 2022, 9:30 AM
3 points
6 comments · 1 min read · LW link
(alok.github.io)

Reactive devaluation: Bias in Evaluating AGI X-Risks

Dec 30, 2022, 9:02 AM
−15 points
9 comments · 1 min read · LW link

Things I carry almost every day, as of late December 2022

DanielFilan · Dec 30, 2022, 7:40 AM
38 points
9 comments · 5 min read · LW link
(danielfilan.com)

More ways to spot abysses

KatjaGrace · Dec 30, 2022, 6:30 AM
21 points
1 comment · 1 min read · LW link
(worldspiritsockpuppet.com)

Language models are nearly AGIs but we don’t notice it because we keep shifting the bar

philosophybear · Dec 30, 2022, 5:15 AM
105 points
13 comments · 7 min read · LW link

Progress links and tweets, 2022-12-29

jasoncrawford · Dec 30, 2022, 4:54 AM
12 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

Announcing The Filan Cabinet

DanielFilan · Dec 30, 2022, 3:10 AM
21 points
2 comments · 1 min read · LW link
(danielfilan.com)

[Question] Effective Evil Causes?

Ulisse Mini · Dec 30, 2022, 2:56 AM
−12 points
2 comments · 1 min read · LW link

But is it really in Rome? An investigation of the ROME model editing technique

jacquesthibs · Dec 30, 2022, 2:40 AM
104 points
2 comments · 18 min read · LW link

A Year of AI Increasing AI Progress

TW123 · Dec 30, 2022, 2:09 AM
148 points
3 comments · 2 min read · LW link

Why not spend more time looking at human alignment?

ajc586 · Dec 30, 2022, 12:22 AM
11 points
3 comments · 1 min read · LW link

Why and how to write things on the Internet

benkuhn · Dec 29, 2022, 10:40 PM
20 points
2 comments · 15 min read · LW link
(www.benkuhn.net)

Friendly and Unfriendly AGI are Indistinguishable

ErgoEcho · Dec 29, 2022, 10:13 PM
−4 points
4 comments · 4 min read · LW link
(neologos.co)

200 COP in MI: Looking for Circuits in the Wild

Neel Nanda · Dec 29, 2022, 8:59 PM
16 points
5 comments · 13 min read · LW link

Thoughts on the implications of GPT-3, two years ago and NOW [here be dragons, we’re swimming, flying and talking with them]

Bill Benzon · Dec 29, 2022, 8:05 PM
0 points
0 comments · 5 min read · LW link

Covid 12/29/22: Next Up is XBB.1.5

Zvi · Dec 29, 2022, 6:20 PM
33 points
4 comments · 10 min read · LW link
(thezvi.wordpress.com)

Entrepreneurship ETG Might Be Better Than 80k Thought

Xodarap · Dec 29, 2022, 5:51 PM
33 points
0 comments · LW link

Internal Interfaces Are a High-Priority Interpretability Target

Thane Ruthenis · Dec 29, 2022, 5:49 PM
26 points
6 comments · 7 min read · LW link

CFP for Rebellion and Disobedience in AI workshop

Ram Rachum · Dec 29, 2022, 4:08 PM
15 points
0 comments · 1 min read · LW link

My scorched-earth policy on New Year’s resolutions

PatrickDFarley · Dec 29, 2022, 2:45 PM
29 points
2 comments · 4 min read · LW link

Don’t feed the void. She is fat enough!

Johannes C. Mayer · Dec 29, 2022, 2:18 PM
11 points
0 comments · 1 min read · LW link

[Question] Is there any unified resource on Eliezer’s fatigue?

Johannes C. Mayer · Dec 29, 2022, 2:04 PM
9 points
2 comments · 1 min read · LW link

Logical Probability of Goldbach’s Conjecture: Provable Rule or Coincidence?

avturchin · Dec 29, 2022, 1:37 PM
5 points
15 comments · 8 min read · LW link

Where do you get your capabilities from?

tailcalled · Dec 29, 2022, 11:39 AM
37 points
27 comments · 6 min read · LW link

The commercial incentive to intentionally train AI to deceive us

Derek M. Jones · Dec 29, 2022, 11:30 AM
5 points
1 comment · 4 min read · LW link
(shape-of-code.com)

Infinite necklace: the line as a circle

Alok Singh · Dec 29, 2022, 10:41 AM
5 points
2 comments · 1 min read · LW link

Privacy Tradeoffs

jefftk · Dec 29, 2022, 3:40 AM
13 points
1 comment · 2 min read · LW link
(www.jefftk.com)

Against John Searle, Gary Marcus, the Chinese Room thought experiment and its world

philosophybear · Dec 29, 2022, 3:26 AM
21 points
43 comments · 8 min read · LW link

Large Language Models Suggest a Path to Ems

anithite · Dec 29, 2022, 2:20 AM
17 points
2 comments · 5 min read · LW link

[Question] Book recommendations for the history of ML?

Eleni Angelou · Dec 28, 2022, 11:50 PM
2 points
2 comments · 1 min read · LW link

Rock-Paper-Scissors Can Be Weird

winwonce · Dec 28, 2022, 11:12 PM
14 points
3 comments · 1 min read · LW link

200 COP in MI: The Case for Analysing Toy Language Models

Neel Nanda · Dec 28, 2022, 9:07 PM
40 points
3 comments · 7 min read · LW link

200 Concrete Open Problems in Mechanistic Interpretability: Introduction

Neel Nanda · Dec 28, 2022, 9:06 PM
106 points
0 comments · 10 min read · LW link

Effective ways to find love?

anonymoususer · Dec 28, 2022, 8:46 PM
9 points
6 comments · 1 min read · LW link

Classical logic based on propositions-as-subsingleton-types

Thomas Kehrenberg · Dec 28, 2022, 8:16 PM
5 points
0 comments · 16 min read · LW link

In Defense of Wrapper-Minds

Thane Ruthenis · Dec 28, 2022, 6:28 PM
24 points
38 comments · 3 min read · LW link

[Question] What is the best way to approach Expected Value calculations when payoffs are highly skewed?

jmh · Dec 28, 2022, 2:42 PM
8 points
16 comments · 1 min read · LW link