Takeaways from a survey on AI alignment resources

DanielFilan · Nov 5, 2022, 11:40 PM
73 points
10 comments · 6 min read · LW link · 1 review
(danielfilan.com)

Unpricable Information and Certificate Hell

eva_ · Nov 5, 2022, 10:56 PM
13 points
2 comments · 6 min read · LW link

Recommend HAIST resources for assessing the value of RLHF-related alignment research

Nov 5, 2022, 8:58 PM
26 points
9 comments · 3 min read · LW link

Instead of technical research, more people should focus on buying time

Nov 5, 2022, 8:43 PM
100 points
45 comments · 14 min read · LW link

Provably Honest—A First Step

Srijanak De · Nov 5, 2022, 7:18 PM
10 points
2 comments · 8 min read · LW link

Should AI focus on problem-solving or strategic planning? Why not both?

Oliver Siegel · Nov 5, 2022, 7:17 PM
−12 points
3 comments · 1 min read · LW link

How to store human values on a computer

Oliver Siegel · Nov 5, 2022, 7:17 PM
−12 points
17 comments · 1 min read · LW link

The Slippery Slope from DALLE-2 to Deepfake Anarchy

scasper · Nov 5, 2022, 2:53 PM
17 points
9 comments · 11 min read · LW link

When can a mimic surprise you? Why generative models handle seemingly ill-posed problems

David Johnston · Nov 5, 2022, 1:19 PM
8 points
4 comments · 16 min read · LW link

My summary of “Pragmatic AI Safety”

Eleni Angelou · Nov 5, 2022, 12:54 PM
3 points
0 comments · 5 min read · LW link

Review of the Challenge

SD Marlow · Nov 5, 2022, 6:38 AM
−14 points
5 comments · 2 min read · LW link

Spectrum of Independence

jefftk · Nov 5, 2022, 2:40 AM
43 points
7 comments · 1 min read · LW link
(www.jefftk.com)

[paper link] Interpreting systems as solving POMDPs: a step towards a formal understanding of agency

the gears to ascension · Nov 5, 2022, 1:06 AM
13 points
2 comments · 1 min read · LW link
(www.semanticscholar.org)

Metaculus is seeking Software Engineers

dschwarz · Nov 5, 2022, 12:42 AM
18 points
0 comments · 1 min read · LW link
(apply.workable.com)

Should we “go against nature”?

jasoncrawford · Nov 4, 2022, 10:14 PM
10 points
3 comments · 2 min read · LW link
(rootsofprogress.org)

How much should we care about non-human animals?

bokov · Nov 4, 2022, 9:36 PM
16 points
8 comments · 2 min read · LW link
(www.lesswrong.com)

For ELK truth is mostly a distraction

c.trout · Nov 4, 2022, 9:14 PM
44 points
0 comments · 21 min read · LW link

Toy Models and Tegum Products

Adam Jermyn · Nov 4, 2022, 6:51 PM
28 points
7 comments · 5 min read · LW link

Ethan Caballero on Broken Neural Scaling Laws, Deception, and Recursive Self Improvement

Nov 4, 2022, 6:09 PM
16 points
11 comments · 10 min read · LW link
(theinsideview.ai)

Follow up to medical miracle

Elizabeth · Nov 4, 2022, 6:00 PM
75 points
5 comments · 6 min read · LW link
(acesounderglass.com)

Cross-Void Optimization

pneumynym · Nov 4, 2022, 5:47 PM
1 point
1 comment · 8 min read · LW link

Monthly Shorts 10/22

Celer · Nov 4, 2022, 4:30 PM
12 points
0 comments · 6 min read · LW link
(keller.substack.com)

Weekly Roundup #4

Zvi · Nov 4, 2022, 3:00 PM
42 points
1 comment · 6 min read · LW link
(thezvi.wordpress.com)

A new place to discuss cognitive science, ethics and human alignment

Daniel_Friedrich · Nov 4, 2022, 2:34 PM
3 points
4 comments · 1 min read · LW link

A newcomer’s guide to the technical AI safety field

zeshen · Nov 4, 2022, 2:29 PM
42 points
3 comments · 10 min read · LW link

[Question] Are alignment researchers devoting enough time to improving their research capacity?

Carson Jones · Nov 4, 2022, 12:58 AM
13 points
3 comments · 3 min read · LW link

[Question] Don’t you think RLHF solves outer alignment?

Charbel-Raphaël · Nov 4, 2022, 12:36 AM
9 points
23 comments · 1 min read · LW link

Mechanistic Interpretability as Reverse Engineering (follow-up to “cars and elephants”)

David Scott Krueger (formerly: capybaralet) · Nov 3, 2022, 11:19 PM
28 points
3 comments · 1 min read · LW link

[Question] Could a Supreme Court suit work to solve NEPA problems?

ChristianKl · Nov 3, 2022, 9:10 PM
15 points
0 comments · 1 min read · LW link

[Video] How having Fast Fourier Transforms sooner could have helped with Nuclear Disarmament—Veritaserum

mako yass · Nov 3, 2022, 9:04 PM
17 points
1 comment · 1 min read · LW link

Further considerations on the Evidentialist’s Wager

Martín Soto · Nov 3, 2022, 8:06 PM
3 points
9 comments · 8 min read · LW link

AI as a Civilizational Risk Part 6/6: What can be done

PashaKamyshev · Nov 3, 2022, 7:48 PM
2 points
4 comments · 4 min read · LW link

A Mystery About High Dimensional Concept Encoding

Fabien Roger · Nov 3, 2022, 5:05 PM
46 points
13 comments · 7 min read · LW link

Why do we post our AI safety plans on the Internet?

Peter S. Park · Nov 3, 2022, 4:02 PM
4 points
4 comments · 11 min read · LW link

Multiple Deploy-Key Repos

jefftk · Nov 3, 2022, 3:10 PM
15 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Covid 11/3/22: Asking Forgiveness

Zvi · Nov 3, 2022, 1:50 PM
23 points
3 comments · 6 min read · LW link
(thezvi.wordpress.com)

Adversarial Policies Beat Professional-Level Go AIs

sanxiyn · Nov 3, 2022, 1:27 PM
31 points
35 comments · 1 min read · LW link
(goattack.alignmentfund.org)

K-types vs T-types — what priors do you have?

Cleo Nardo · Nov 3, 2022, 11:29 AM
74 points
25 comments · 7 min read · LW link

Information Markets 2: Optimally Shaped Reward Bets

eva_ · Nov 3, 2022, 11:08 AM
9 points
0 comments · 3 min read · LW link

The Rational Utilitarian Love Movement (A Historical Retrospective)

CBiddulph · Nov 3, 2022, 7:11 AM
3 points
0 comments · 1 min read · LW link

The Mirror Chamber: A short story exploring the anthropic measure function and why it can matter

mako yass · Nov 3, 2022, 6:47 AM
30 points
13 comments · 10 min read · LW link

Open Letter Against Reckless Nuclear Escalation and Use

Max Tegmark · Nov 3, 2022, 5:34 AM
27 points
25 comments · 1 min read · LW link

Lazy Python Argument Parsing

jefftk · Nov 3, 2022, 2:20 AM
20 points
3 comments · 1 min read · LW link
(www.jefftk.com)

AI as a Civilizational Risk Part 5/6: Relationship between C-risk and X-risk

PashaKamyshev · Nov 3, 2022, 2:19 AM
2 points
0 comments · 7 min read · LW link

[Question] Is there a good way to award a fixed prize in a prediction contest?

jchan · Nov 2, 2022, 9:37 PM
18 points
5 comments · 1 min read · LW link

“Are Experiments Possible?” Seeds of Science call for reviewers

rogersbacon · Nov 2, 2022, 8:05 PM
8 points
0 comments · 1 min read · LW link

Humans do acausal coordination all the time

Adam Jermyn · Nov 2, 2022, 2:40 PM
57 points
35 comments · 3 min read · LW link

Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm)

Davidmanheim · Nov 2, 2022, 12:57 PM
71 points
27 comments · 4 min read · LW link
(twitter.com)

Housing and Transit Thoughts #1

Zvi · Nov 2, 2022, 12:10 PM
35 points
5 comments · 16 min read · LW link
(thezvi.wordpress.com)

Mind is uncountable

Filip Sondej · Nov 2, 2022, 11:51 AM
18 points
22 comments · 1 min read · LW link