Why it’s necessary to shoot yourself in the foot

Jacob G-W

11 Jul 2023 21:17 UTC

39 points

7 comments · 2 min read · LW link

(g-w1.github.io)

How do low level hypotheses constrain high level ones? The mystery of the disappearing diamond.

Christopher King

11 Jul 2023 19:27 UTC

17 points

11 comments · 2 min read · LW link

[Question] Do we automatically accept propositions?

Aaron Graifman

11 Jul 2023 17:45 UTC

10 points

5 comments · 1 min read · LW link

fMRI LIKE APPROACH TO AI ALIGNMENT / DECEPTIVE BEHAVIOUR

Escaque 66

11 Jul 2023 17:17 UTC

−1 points

3 comments · 2 min read · LW link

Introducing Fatebook: the fastest way to make and track predictions

Adam B and Sage Future

11 Jul 2023 15:28 UTC

128 points

36 comments · 1 min read · LW link

(fatebook.io)

My Weirdest Experience

Bridgett Kay

11 Jul 2023 14:44 UTC

37 points

19 comments · 1 min read · LW link

(dxmrevealed.wordpress.com)

Announcing The Roots of Progress Blog-Building Intensive

jasoncrawford

11 Jul 2023 14:04 UTC

10 points

0 comments · 1 min read · LW link

(rootsofprogress.org)

OpenAI Launches Superalignment Taskforce

Zvi

11 Jul 2023 13:00 UTC

149 points

40 comments · 49 min read · LW link

(thezvi.wordpress.com)

Critiquing Risks From Learned Optimization, and Avoiding Cached Theories

ProofBySonnet

11 Jul 2023 11:38 UTC

1 point

0 comments · 6 min read · LW link

[UPDATE: deadline extended to July 24!] New wind in rationality’s sails: Applications for Epistea Residency 2023 are now open

Jana Meixnerová and Irena Kotíková

11 Jul 2023 11:02 UTC

80 points

7 comments · 3 min read · LW link

Two Hot Takes about Quine

Charlie Steiner

11 Jul 2023 6:42 UTC

15 points

0 comments · 2 min read · LW link

Disincentivizing deception in mesa optimizers with Model Tampering

martinkunev

11 Jul 2023 0:44 UTC

3 points

0 comments · 2 min read · LW link

Drawn Out: a story

Richard_Ngo

11 Jul 2023 0:08 UTC

77 points

2 comments · 8 min read · LW link

Definitions are about efficiency and consistency with common language.

Nacruno96

10 Jul 2023 23:46 UTC

1 point

0 comments · 4 min read · LW link

Reframing Evolution—An information wavefront traveling through time

Joshua Clancy

10 Jul 2023 22:36 UTC

1 point

0 comments · 5 min read · LW link

(midflip.org)

GPT-7: The Tale of the Big Computer (An Experimental Story)

Justin Bullock

10 Jul 2023 20:22 UTC

4 points

4 comments · 5 min read · LW link

Cost-effectiveness of professional field-building programs for AI safety research

Dan H

10 Jul 2023 18:28 UTC

8 points

5 comments · 1 min read · LW link

Cost-effectiveness of student programs for AI safety research

Dan H

10 Jul 2023 18:28 UTC

15 points

2 comments · 1 min read · LW link

Modeling the impact of AI safety field-building programs

Dan H

10 Jul 2023 18:27 UTC

21 points

0 comments · 1 min read · LW link

I think Michael Bailey’s dismissal of my autogynephilia questions for Scott Alexander and Aella makes very little sense

tailcalled

10 Jul 2023 17:39 UTC

45 points

45 comments · 2 min read · LW link

Incentives from a causal perspective

tom4everitt, James Fox, RyanCarey, mattmacdermott, sbenthall and Jonathan Richens

10 Jul 2023 17:16 UTC

27 points

0 comments · 6 min read · LW link

Is the Endowment Effect Due to Incomparability?

Kevin Dorst

10 Jul 2023 16:26 UTC

21 points

10 comments · 7 min read · LW link

(kevindorst.substack.com)

Frontier AI Regulation

Zach Stein-Perlman

10 Jul 2023 14:30 UTC

21 points

4 comments · 8 min read · LW link

(arxiv.org)

Why is it so hard to change people’s minds? Well, imagine if it wasn’t...

Celarix

10 Jul 2023 13:55 UTC

6 points

9 comments · 6 min read · LW link

Consider Joining the UK Foundation Model Taskforce

Zvi

10 Jul 2023 13:50 UTC

105 points

12 comments · 1 min read · LW link

(thezvi.wordpress.com)

“Reframing Superintelligence” + LLMs + 4 years

Eric Drexler

10 Jul 2023 13:42 UTC

117 points

9 comments · 12 min read · LW link

Open-minded updatelessness

Nicolas Macé, JesseClifton and SMK

10 Jul 2023 11:08 UTC

65 points

21 comments · 12 min read · LW link

Consciousness as a conflationary alliance term for intrinsically valued internal experiences

Andrew_Critch

10 Jul 2023 8:09 UTC

201 points

51 comments · 11 min read · LW link

The world where LLMs are possible

Ape in the coat

10 Jul 2023 8:00 UTC

20 points

10 comments · 3 min read · LW link

The virtue of determination

Richard_Ngo

10 Jul 2023 5:11 UTC

59 points

5 comments · 4 min read · LW link

Some reasons to not say “Doomer”

Ruby

9 Jul 2023 21:05 UTC

46 points

18 comments · 4 min read · LW link

The Seeker’s Game – Vignettes from the Bay

Yulia

9 Jul 2023 19:32 UTC

137 points

19 comments · 16 min read · LW link

[Question] Why have exposure notification apps been (mostly) discontinued?

VipulNaik

9 Jul 2023 19:07 UTC

10 points

5 comments · 2 min read · LW link

[Question] The Necessity of Privacy: A Condition for Social Change and Experimentation?

Blake

9 Jul 2023 18:42 UTC

−8 points

1 comment · 1 min read · LW link

Attempting to Deconstruct “Real”

herschel

9 Jul 2023 16:40 UTC

21 points

23 comments · 2 min read · LW link

Quick proposal: Decision market regrantor using manifund (please improve)

Nathan Young

9 Jul 2023 12:49 UTC

10 points

5 comments · 5 min read · LW link

[Question] Where are the people building AGI in the non-dumb way?

Johannes C. Mayer

9 Jul 2023 11:39 UTC

10 points

19 comments · 2 min read · LW link

[Question] What to read on the “informal multi-world model”?

mishka

9 Jul 2023 4:48 UTC

13 points

23 comments · 1 min read · LW link

Whether LLMs “understand” anything is mostly a terminological dispute

RobertM

9 Jul 2023 3:31 UTC

10 points

1 comment · 1 min read · LW link

Taboo Truth

Tomás B.

8 Jul 2023 23:23 UTC

36 points

16 comments · 2 min read · LW link

“View”

herschel

8 Jul 2023 23:19 UTC

6 points

0 comments · 2 min read · LW link

[Question] H5N1. Just how bad is the situation?

Q Home

8 Jul 2023 22:09 UTC

16 points

8 comments · 1 min read · LW link

A Two-Part System for Practical Self-Care

Jonathan Moregård

8 Jul 2023 21:23 UTC

11 points

0 comments · 3 min read · LW link

(honestliving.substack.com)

Really Strong Features Found in Residual Stream

Logan Riggs

8 Jul 2023 19:40 UTC

69 points

6 comments · 2 min read · LW link

Eight Strategies for Tackling the Hard Part of the Alignment Problem

scasper

8 Jul 2023 18:55 UTC

42 points

11 comments · 7 min read · LW link

“Concepts of Agency in Biology” (Okasha, 2023) - Brief Paper Summary

Nora_Ammann

8 Jul 2023 18:22 UTC

40 points

3 comments · 7 min read · LW link

Blanchard’s Dangerous Idea and the Plight of the Lucid Crossdreamer

Zack_M_Davis

8 Jul 2023 18:03 UTC

38 points

135 comments · 72 min read · LW link

(unremediatedgender.space)

Continuous Adversarial Quality Assurance: Extending RLHF and Constitutional AI

Benaya Koren

8 Jul 2023 17:32 UTC

6 points

0 comments · 9 min read · LW link

Commentless downvoting is not a good way to fight infohazards

DirectedEvolution

8 Jul 2023 17:29 UTC

6 points

9 comments · 3 min read · LW link

[Question] Why does anxiety (?) make me dumb?

TeaTieAndHat

8 Jul 2023 16:13 UTC

18 points

14 comments · 3 min read · LW link