Why Not Just Outsource Alignment Research To An AI?

johnswentworth · 9 Mar 2023 21:49 UTC

139 points

49 comments · 9 min read · LW link

What’s Not Our Problem

Jacob Falkovich · 9 Mar 2023 20:07 UTC

22 points

6 comments · 9 min read · LW link

Questions about Conjecture’s CoEm proposal

Akash and NicholasKees

9 Mar 2023 19:32 UTC

51 points

4 comments · 2 min read · LW link

What Jason has been reading, March 2023

jasoncrawford · 9 Mar 2023 18:46 UTC

12 points

0 comments · 6 min read · LW link

(rootsofprogress.org)

[Question] “Provide C++ code for a function that outputs a Fibonacci sequence of n terms, where n is provided as a parameter to the function

Thembeka99 · 9 Mar 2023 18:37 UTC

−21 points

2 comments · 1 min read · LW link

Anthropic: Core Views on AI Safety: When, Why, What, and How

jonmenaster · 9 Mar 2023 17:34 UTC

17 points

1 comment · 22 min read · LW link

(www.anthropic.com)

Why do we assume there is a “real” shoggoth behind the LLM? Why not masks all the way down?

Robert_AIZI · 9 Mar 2023 17:28 UTC

63 points

48 comments · 2 min read · LW link

Anthropic’s Core Views on AI Safety

Zac Hatfield-Dodds · 9 Mar 2023 16:55 UTC

172 points

39 comments · 2 min read · LW link

(www.anthropic.com)

Some ML-Related Math I Now Understand Better

Fabien Roger · 9 Mar 2023 16:35 UTC

45 points

4 comments · 4 min read · LW link

The Translucent Thoughts Hypotheses and Their Implications

Fabien Roger · 9 Mar 2023 16:30 UTC

142 points

7 comments · 19 min read · LW link

IRL in General Environments

michaelcohen · 9 Mar 2023 13:32 UTC

8 points

20 comments · 1 min read · LW link

Utility uncertainty vs. expected information gain

michaelcohen · 9 Mar 2023 13:32 UTC

13 points

9 comments · 1 min read · LW link

Value Learning is only Asymptotically Safe

michaelcohen · 9 Mar 2023 13:32 UTC

5 points

19 comments · 1 min read · LW link

Impact Measure Testing with Honey Pots and Myopia

michaelcohen · 9 Mar 2023 13:32 UTC

13 points

9 comments · 1 min read · LW link

Just Imitate Humans?

michaelcohen · 9 Mar 2023 13:31 UTC

11 points

72 comments · 1 min read · LW link

Build a Causal Decision Theorist

michaelcohen · 9 Mar 2023 13:31 UTC

−2 points

14 comments · 4 min read · LW link

ChatGPT explores the semantic differential

Bill Benzon · 9 Mar 2023 13:09 UTC

7 points

2 comments · 7 min read · LW link

AI #3

Zvi · 9 Mar 2023 12:20 UTC

55 points

12 comments · 62 min read · LW link

(thezvi.wordpress.com)

The Scientific Approach To Anything and Everything

Rami Rustom · 9 Mar 2023 11:27 UTC

5 points

5 comments · 16 min read · LW link

Paper Summary: The Effectiveness of AI Existential Risk Communication to the American and Dutch Public

otto.barten · 9 Mar 2023 10:47 UTC

14 points

6 comments · 4 min read · LW link

Speed running everyone through the bad alignment bingo. $5k bounty for a LW conversational agent

ArthurB · 9 Mar 2023 9:26 UTC

140 points

33 comments · 2 min read · LW link

Chomsky on ChatGPT (link)

mukashi · 9 Mar 2023 7:00 UTC

2 points

6 comments · 1 min read · LW link

How bad a future do ML researchers expect?

KatjaGrace · 9 Mar 2023 4:50 UTC

122 points

8 comments · 2 min read · LW link

(aiimpacts.org)

Challenge: construct a Gradient Hacker

Thomas Larsen and Thomas Kwa

9 Mar 2023 2:38 UTC

39 points

10 comments · 1 min read · LW link

Basic Facts Beanbag

Screwtape · 9 Mar 2023 0:05 UTC

6 points

0 comments · 4 min read · LW link

A ranking scale for how severe the side effects of solutions to AI x-risk are

Christopher King · 8 Mar 2023 22:53 UTC

3 points

0 comments · 2 min read · LW link

Progress links and tweets, 2023-03-08

jasoncrawford · 8 Mar 2023 20:37 UTC

16 points

0 comments · 1 min read · LW link

(rootsofprogress.org)

Project “MIRI as a Service”

RomanS · 8 Mar 2023 19:22 UTC

42 points

4 comments · 1 min read · LW link

2022 Survey Results

Screwtape · 8 Mar 2023 19:16 UTC

48 points

8 comments · 20 min read · LW link

Use the Nato Alphabet

Cedar · 8 Mar 2023 19:14 UTC

6 points

10 comments · 1 min read · LW link

LessWrong needs a sage mechanic

lc · 8 Mar 2023 18:57 UTC

34 points

5 comments · 1 min read · LW link

[Question] Mathematical models of Ethics

Victors · 8 Mar 2023 17:40 UTC

4 points

2 comments · 1 min read · LW link

Against LLM Reductionism

Erich_Grunewald · 8 Mar 2023 15:52 UTC

140 points

17 comments · 18 min read · LW link

(www.erichgrunewald.com)

Agency, LLMs and AI Safety—A First Pass

Giulio · 8 Mar 2023 15:42 UTC

2 points

0 comments · 4 min read · LW link

(www.giuliostarace.com)

Why Uncontrollable AI Looks More Likely Than Ever

otto.barten and Roman_Yampolskiy

8 Mar 2023 15:41 UTC

18 points

0 comments · 4 min read · LW link

(time.com)

Universal Modelers

George3d6 · 8 Mar 2023 15:39 UTC

6 points

4 comments · 20 min read · LW link

(epistem.ink)

The Kids are Not Okay

Zvi · 8 Mar 2023 13:30 UTC

85 points

43 comments · 32 min read · LW link

(thezvi.wordpress.com)

Alignment Targets and The Natural Abstraction Hypothesis

Stephen Fowler · 8 Mar 2023 11:45 UTC

10 points

0 comments · 3 min read · LW link

Computer Input Sucks—A Brain Dump

Johannes C. Mayer · 8 Mar 2023 11:06 UTC

14 points

11 comments · 3 min read · LW link

Under-Appreciated Ways to Use Flashcards—Part II

Florence Hinder · 8 Mar 2023 9:54 UTC

25 points

6 comments · 4 min read · LW link

(blog.thoughtsaver.com)

Squeezing foundations research assistance out of formal logic narrow AI.

Donald Hobson · 8 Mar 2023 9:38 UTC

16 points

1 comment · 2 min read · LW link

Monthly Shorts 1&2/23

Celer · 8 Mar 2023 7:10 UTC

9 points

0 comments · 2 min read · LW link

(keller.substack.com)

Chapter 1: Pursuing Understanding

Xavier Shrier · 8 Mar 2023 6:40 UTC

2 points

0 comments · 10 min read · LW link

[Question] Is religion locally correct for consequentialists in some instances?

Robert Feinstein · 8 Mar 2023 4:02 UTC

4 points

8 comments · 1 min read · LW link

A Polemic

Wofsen · 8 Mar 2023 3:51 UTC

−15 points

1 comment · 1 min read · LW link

AI Safety in a World of Vulnerable Machine Learning Systems

AdamGleave and EuanMcLean

8 Mar 2023 2:40 UTC

70 points

28 comments · 29 min read · LW link

(far.ai)

[Question] Educating people about rationality: where are we?

plurple · 8 Mar 2023 1:59 UTC

5 points

3 comments · 1 min read · LW link

[Question] What are MIRI’s big achievements in AI alignment?

tailcalled · 7 Mar 2023 21:30 UTC

29 points

7 comments · 1 min read · LW link

A Brief Defense of Athleticism

Wofsen · 7 Mar 2023 20:48 UTC

46 points

5 comments · 1 min read · LW link

[Question] How “grifty” is the Foresight Institute? Are they making button soup?

Cedar · 7 Mar 2023 19:43 UTC

7 points

3 comments · 1 min read · LW link