Rationalism and social rationalism

philosophybear · 10 Mar 2023 23:20 UTC
17 points
5 comments · 10 min read · LW link
(philosophybear.substack.com)

Meetup Tip: Nametags

Screwtape · 10 Mar 2023 21:00 UTC
16 points
2 comments · 3 min read · LW link

[Question] Is ChatGPT (or other LLMs) more ‘sentient’/​’conscious’/etc. than a baby without a brain?

M. Y. Zuo · 10 Mar 2023 19:00 UTC
−4 points
2 comments · 1 min read · LW link

Humanity’s biggest mistake

RomanS · 10 Mar 2023 16:30 UTC
0 points
1 comment · 2 min read · LW link

Operationalizing timelines

Zach Stein-Perlman · 10 Mar 2023 16:30 UTC
7 points
1 comment · 3 min read · LW link

[Question] What do you think is wrong with rationalist culture?

tailcalled · 10 Mar 2023 13:17 UTC
16 points
77 comments · 1 min read · LW link

Dice Decision Making

Bart Bussmann · 10 Mar 2023 13:01 UTC
20 points
14 comments · 3 min read · LW link

Stop calling it “jailbreaking” ChatGPT

Templarrr · 10 Mar 2023 11:41 UTC
7 points
9 comments · 2 min read · LW link

Long-term memory for LLM via self-replicating prompt

avturchin · 10 Mar 2023 10:28 UTC
20 points
3 comments · 2 min read · LW link

Thoughts on the OpenAI alignment plan: will AI research assistants be net-positive for AI existential risk?

Jeffrey Ladish · 10 Mar 2023 8:21 UTC
58 points
3 comments · 9 min read · LW link

Reflections On The Feasibility Of Scalable-Oversight

Felix Hofstätter · 10 Mar 2023 7:54 UTC
11 points
0 comments · 12 min read · LW link

Japan AI Alignment Conference

10 Mar 2023 6:56 UTC
64 points
7 comments · 1 min read · LW link
(www.conjecture.dev)

Everything’s normal until it’s not

Eleni Angelou · 10 Mar 2023 2:02 UTC
7 points
0 comments · 3 min read · LW link

Acolytes, reformers, and atheists

lc · 10 Mar 2023 0:48 UTC
9 points
0 comments · 4 min read · LW link

The hot mess theory of AI misalignment: More intelligent agents behave less coherently

Jonathan Yan · 10 Mar 2023 0:20 UTC
47 points
21 comments · 1 min read · LW link
(sohl-dickstein.github.io)

Why Not Just Outsource Alignment Research To An AI?

johnswentworth · 9 Mar 2023 21:49 UTC
139 points
49 comments · 9 min read · LW link

What’s Not Our Problem

Jacob Falkovich · 9 Mar 2023 20:07 UTC
22 points
6 comments · 9 min read · LW link

Questions about Conjecture’s CoEm proposal

9 Mar 2023 19:32 UTC
51 points
4 comments · 2 min read · LW link

What Jason has been reading, March 2023

jasoncrawford · 9 Mar 2023 18:46 UTC
12 points
0 comments · 6 min read · LW link
(rootsofprogress.org)

[Question] “Provide C++ code for a function that outputs a Fibonacci sequence of n terms, where n is provided as a parameter to the function”

Thembeka99 · 9 Mar 2023 18:37 UTC
−21 points
2 comments · 1 min read · LW link

Anthropic: Core Views on AI Safety: When, Why, What, and How

jonmenaster · 9 Mar 2023 17:34 UTC
17 points
1 comment · 22 min read · LW link
(www.anthropic.com)

Why do we assume there is a “real” shoggoth behind the LLM? Why not masks all the way down?

Robert_AIZI · 9 Mar 2023 17:28 UTC
63 points
48 comments · 2 min read · LW link

Anthropic’s Core Views on AI Safety

Zac Hatfield-Dodds · 9 Mar 2023 16:55 UTC
172 points
39 comments · 2 min read · LW link
(www.anthropic.com)

Some ML-Related Math I Now Understand Better

Fabien Roger · 9 Mar 2023 16:35 UTC
45 points
4 comments · 4 min read · LW link

The Translucent Thoughts Hypotheses and Their Implications

Fabien Roger · 9 Mar 2023 16:30 UTC
142 points
7 comments · 19 min read · LW link

IRL in General Environments

michaelcohen · 9 Mar 2023 13:32 UTC
8 points
20 comments · 1 min read · LW link

Utility uncertainty vs. expected information gain

michaelcohen · 9 Mar 2023 13:32 UTC
13 points
9 comments · 1 min read · LW link

Value Learning is only Asymptotically Safe

michaelcohen · 9 Mar 2023 13:32 UTC
5 points
19 comments · 1 min read · LW link

Impact Measure Testing with Honey Pots and Myopia

michaelcohen · 9 Mar 2023 13:32 UTC
13 points
9 comments · 1 min read · LW link

Just Imitate Humans?

michaelcohen · 9 Mar 2023 13:31 UTC
11 points
72 comments · 1 min read · LW link

Build a Causal Decision Theorist

michaelcohen · 9 Mar 2023 13:31 UTC
−2 points
14 comments · 4 min read · LW link

ChatGPT explores the semantic differential

Bill Benzon · 9 Mar 2023 13:09 UTC
7 points
2 comments · 7 min read · LW link

AI #3

Zvi · 9 Mar 2023 12:20 UTC
55 points
12 comments · 62 min read · LW link
(thezvi.wordpress.com)

The Scientific Approach To Anything and Everything

Rami Rustom · 9 Mar 2023 11:27 UTC
5 points
5 comments · 16 min read · LW link

Paper Summary: The Effectiveness of AI Existential Risk Communication to the American and Dutch Public

otto.barten · 9 Mar 2023 10:47 UTC
14 points
6 comments · 4 min read · LW link

Speed running everyone through the bad alignment bingo. $5k bounty for a LW conversational agent

ArthurB · 9 Mar 2023 9:26 UTC
140 points
33 comments · 2 min read · LW link

Chomsky on ChatGPT (link)

mukashi · 9 Mar 2023 7:00 UTC
2 points
6 comments · 1 min read · LW link

How bad a future do ML researchers expect?

KatjaGrace · 9 Mar 2023 4:50 UTC
122 points
8 comments · 2 min read · LW link
(aiimpacts.org)

Challenge: construct a Gradient Hacker

9 Mar 2023 2:38 UTC
39 points
10 comments · 1 min read · LW link

Basic Facts Beanbag

Screwtape · 9 Mar 2023 0:05 UTC
6 points
0 comments · 4 min read · LW link

A ranking scale for how severe the side effects of solutions to AI x-risk are

Christopher King · 8 Mar 2023 22:53 UTC
3 points
0 comments · 2 min read · LW link

Progress links and tweets, 2023-03-08

jasoncrawford · 8 Mar 2023 20:37 UTC
16 points
0 comments · 1 min read · LW link
(rootsofprogress.org)

Project “MIRI as a Service”

RomanS · 8 Mar 2023 19:22 UTC
42 points
4 comments · 1 min read · LW link

2022 Survey Results

Screwtape · 8 Mar 2023 19:16 UTC
48 points
8 comments · 20 min read · LW link

Use the NATO Alphabet

Cedar · 8 Mar 2023 19:14 UTC
6 points
10 comments · 1 min read · LW link

LessWrong needs a sage mechanic

lc · 8 Mar 2023 18:57 UTC
34 points
5 comments · 1 min read · LW link

[Question] Mathematical models of Ethics

Victors · 8 Mar 2023 17:40 UTC
4 points
2 comments · 1 min read · LW link

Against LLM Reductionism

Erich_Grunewald · 8 Mar 2023 15:52 UTC
140 points
17 comments · 18 min read · LW link
(www.erichgrunewald.com)

Agency, LLMs and AI Safety—A First Pass

Giulio · 8 Mar 2023 15:42 UTC
2 points
0 comments · 4 min read · LW link
(www.giuliostarace.com)

Why Uncontrollable AI Looks More Likely Than Ever

8 Mar 2023 15:41 UTC
18 points
0 comments · 4 min read · LW link
(time.com)