27 Jun 2023 23:23 UTC

24 points

1 comment 13 min read LW link

(arxiv.org)

Catastrophic Risks from AI #5: Rogue AIs

Dan H, Mantas Mazeika and TW123

27 Jun 2023 22:06 UTC

15 points

0 comments 22 min read LW link

(arxiv.org)

AISN #12: Policy Proposals from NTIA’s Request for Comment and Reconsidering Instrumental Convergence

Dan H

27 Jun 2023 17:20 UTC

6 points

0 comments 1 min read LW link

The Weight of the Future (Why The Apocalypse Can Be A Relief)

Sable

27 Jun 2023 17:18 UTC

18 points

14 comments 3 min read LW link

(affablyevil.substack.com)

Aligning AI by optimizing for “wisdom”

JustinShovelain and Elliot Mckernon

27 Jun 2023 15:20 UTC

27 points

8 comments 12 min read LW link

Freedom under Naturalistic Dualism

Arturo Macias

27 Jun 2023 14:34 UTC

1 point

36 comments 1 min read LW link

(www.jneurophilosophy.com)

Munk AI debate: confusions and possible cruxes

Steven Byrnes

27 Jun 2023 14:18 UTC

244 points

21 comments 8 min read LW link

Ateliers: Motivation

Stephen Fowler

27 Jun 2023 13:07 UTC

7 points

0 comments 2 min read LW link

Self-Blinded Caffeine RCT

niplav

27 Jun 2023 12:38 UTC

44 points

9 comments 8 min read LW link

An overview of the points system

Iknownothing

27 Jun 2023 9:09 UTC

3 points

4 comments 1 min read LW link

(ai-plans.com)

AISC team report: Soft-optimization, Bayes and Goodhart

Simon Fischer, benjaminko, jazcarretao, DFNaiff and Jeremy Gillen

27 Jun 2023 6:05 UTC

37 points

2 comments 15 min read LW link

Epistemic spot checking one claim in The Precipice

Isaac King

27 Jun 2023 1:03 UTC

33 points

3 comments 1 min read LW link

nuclear costs are inflation

bhauth

26 Jun 2023 22:30 UTC

8 points

42 comments 5 min read LW link

(www.bhauth.com)

Man in the Arena

Richard_Ngo

26 Jun 2023 21:57 UTC

62 points

6 comments 8 min read LW link

Catastrophic Risks from AI #4: Organizational Risks

Dan H, Mantas Mazeika and TW123

26 Jun 2023 19:36 UTC

23 points

0 comments 21 min read LW link

(arxiv.org)

The fraught voyage of aligned novelty

TsviBT

26 Jun 2023 19:10 UTC

13 points

0 comments 17 min read LW link

[Question] Deceptive AI vs. shifting instrumental incentives

Aryeh Englander

26 Jun 2023 18:09 UTC

7 points

2 comments 3 min read LW link

On the Cost of Thriving Index

Zvi

26 Jun 2023 15:30 UTC

33 points

6 comments 9 min read LW link

(thezvi.wordpress.com)

“Safety Culture for AI” is important, but isn’t going to be easy

Davidmanheim

26 Jun 2023 12:52 UTC

47 points

2 comments 2 min read LW link

(forum.effectivealtruism.org)

Direct Preference Optimization in One Minute

lukemarks

26 Jun 2023 11:52 UTC

22 points

3 comments 2 min read LW link

Self-experiment: A supraphysiological dosage of testosterone.

shapeshifter

26 Jun 2023 10:26 UTC

8 points

3 comments 1 min read LW link

Confused Attractiveness

Vlad Loweren

26 Jun 2023 9:33 UTC

8 points

5 comments 6 min read LW link

60+ Possible Futures

Bart Bussmann

26 Jun 2023 9:16 UTC

93 points

18 comments 11 min read LW link

Bounded surprise exam paradox

cousin_it

26 Jun 2023 8:37 UTC

29 points

5 comments 2 min read LW link

Model, Care, Execution

Ricki Heicklen and AvitalM

26 Jun 2023 4:05 UTC

111 points

10 comments 12 min read LW link 1 review

(bayesshammai.substack.com)

The Fall of Rationality—The Senate of Admins

Ace Delgado

26 Jun 2023 1:49 UTC

−10 points

0 comments 4 min read LW link

Another medical miracle

Dentin

25 Jun 2023 20:43 UTC

191 points

48 comments 3 min read LW link

Did Bengio and Tegmark lose a debate about AI x-risk against LeCun and Mitchell?

Karl von Wendt

25 Jun 2023 16:59 UTC

106 points

53 comments 7 min read LW link

AI-Plans.com—a contributable compendium

Iknownothing

25 Jun 2023 14:40 UTC

39 points

7 comments 4 min read LW link

(ai-plans.com)

Map of maps of interesting fields

MaxG

25 Jun 2023 14:02 UTC

24 points

0 comments 1 min read LW link

(glozematrix.substack.com)

Why am I Me?

dadadarren

25 Jun 2023 12:07 UTC

45 points

46 comments 3 min read LW link

Will the growing deer prion epidemic spread to humans? Why not?

eukaryote

25 Jun 2023 4:31 UTC

170 points

33 comments 13 min read LW link

(eukaryotewritesblog.com)

Crystal Healing — or the Origins of Expected Utility Maximizers

Alexander Gietelink Oldenziel, RP and Kaarel

25 Jun 2023 3:18 UTC

54 points

11 comments 5 min read LW link

What’s in it for AI?

archeon

25 Jun 2023 1:17 UTC

−20 points

0 comments 1 min read LW link

Lessons Learned: Properly Publicizing a Regional Meetup Event (also, last call to apply!)

Willa

25 Jun 2023 0:58 UTC

9 points

2 comments 4 min read LW link

San Francisco ACX Meetup “First Saturday” July 1, 1 pm

guenael

24 Jun 2023 22:40 UTC

2 points

0 comments 1 min read LW link

Correctly Calibrated Trust

habryka

24 Jun 2023 19:48 UTC

36 points

3 comments 11 min read LW link

(forum.effectivealtruism.org)

Democratic AI Constitution: Round-Robin Debate and Synthesis

scottviteri

24 Jun 2023 19:31 UTC

10 points

4 comments 5 min read LW link

(scottviteri.com)

DSLT 4. Phase Transitions in Neural Networks

Liam Carroll

24 Jun 2023 17:22 UTC

30 points

3 comments 16 min read LW link

[Question] Donate Now vs Donate Later—Relative Value of Donations to AI Alignment

AlignmentOptimizer

24 Jun 2023 17:20 UTC

4 points

4 comments 1 min read LW link

ACX/EA Meetup Bremen

RasmusHB

24 Jun 2023 16:23 UTC

3 points

0 comments 1 min read LW link

How to prevent Re-Traumatization on Meditation Retreats

EternallyBlissful

24 Jun 2023 14:16 UTC

20 points

1 comment 5 min read LW link

[Question] Can you prevent negative long-term effects of bad trips with sleep deprivation?

EternallyBlissful

24 Jun 2023 14:05 UTC

15 points

5 comments 1 min read LW link

We ran a reading group on The Scout Mindset

Neil Crawford and andreamurillo

24 Jun 2023 10:10 UTC

7 points

0 comments 2 min read LW link

Crisis Boot Camp: lessons learned and implications for EA

Nicole Ross

24 Jun 2023 6:28 UTC

26 points

0 comments 1 min read LW link

I just watched don’t look up.

ATheCoder

23 Jun 2023 21:22 UTC

0 points

5 comments 2 min read LW link

Automatic Rate Limiting on LessWrong

Raemon

23 Jun 2023 20:19 UTC

77 points

34 comments 7 min read LW link

Catastrophic Risks from AI #3: AI Race

Dan H, Mantas Mazeika and TW123

23 Jun 2023 19:21 UTC

18 points

9 comments 29 min read LW link

(arxiv.org)

Write the Worst Post on LessWrong!

Johannes C. Mayer

23 Jun 2023 19:17 UTC

−10 points

5 comments 4 min read LW link

Slaying the Hydra: toward a new game board for AI

Prometheus

23 Jun 2023 17:04 UTC

0 points

5 comments 6 min read LW link