Value/Utility: A History

Lorec

19 Nov 2024 23:01 UTC

9 points

0 comments · 10 min read · LW link

Why Don’t We Just… Shoggoth+Face+Paraphraser?

Daniel Kokotajlo and abramdemski

19 Nov 2024 20:53 UTC

121 points

51 comments · 14 min read · LW link

Every niche event should also be a meetup

DMMF

19 Nov 2024 20:47 UTC

16 points

0 comments · 3 min read · LW link

(danfrank.ca)

Root node of my posts

AtillaYasar

19 Nov 2024 20:09 UTC

2 points

0 comments · 2 min read · LW link

U.S.-China Economic and Security Review Commission pushes Manhattan Project-style AI initiative

Phib

19 Nov 2024 18:42 UTC

56 points

7 comments · 1 min read · LW link

Intrinsic Power-Seeking: AI Might Seek Power for Power’s Sake

TurnTrout

19 Nov 2024 18:36 UTC

40 points

5 comments · 1 min read · LW link

(turntrout.com)

Evolution’s selection target depends on your weighting

tailcalled

19 Nov 2024 18:24 UTC

23 points

22 comments · 1 min read · LW link

AISN #44: The Trump Circle on AI Safety Plus, Chinese researchers used Llama to create a military tool for the PLA, a Google AI system discovered a zero-day cybersecurity vulnerability, and Complex Systems

Corin Katzke, Julius, andrewz and Dan H

19 Nov 2024 16:36 UTC

9 points

0 comments · 5 min read · LW link

(newsletter.safe.ai)

Jakarta ACX December 2024 Meetup

Aud

19 Nov 2024 15:01 UTC

1 point

0 comments · 1 min read · LW link

Visualizing small Attention-only Transformers

WCargo

19 Nov 2024 9:37 UTC

4 points

0 comments · 8 min read · LW link

Americans are fat and sick—and it’s their fault…right?

Declan Molony

19 Nov 2024 6:41 UTC

6 points

3 comments · 7 min read · LW link

Announcing the CLR Foundations Course and CLR S-Risk Seminars

JamesFaville

19 Nov 2024 1:18 UTC

18 points

0 comments · 1 min read · LW link

No Electricity in Manchuria

winstonBosan

19 Nov 2024 1:11 UTC

25 points

0 comments · 5 min read · LW link

Looking back on the Future of Humanity Institute—Asterisk

jakeeaton

19 Nov 2024 0:44 UTC

48 points

0 comments · 1 min read · LW link

Don’t Dismiss on Epistemics

ggex

19 Nov 2024 0:44 UTC

9 points

3 comments · 2 min read · LW link

Training AI agents to solve hard problems could lead to Scheming

Marius Hobbhahn and AlexMeinke

19 Nov 2024 0:10 UTC

61 points

12 comments · 28 min read · LW link

Proactive ‘If-Then’ Safety Cases

Nathan Helm-Burger

18 Nov 2024 21:16 UTC

8 points

0 comments · 4 min read · LW link

[Question] Will Orion/Gemini 2/Llama-4 outperform o1

LuigiPagani

18 Nov 2024 21:15 UTC

1 point

3 comments · 1 min read · LW link

How to use bright light to improve your life.

Nat Martin

18 Nov 2024 19:32 UTC

40 points

10 comments · 10 min read · LW link

Social events with plausible deniability

Chipmonk

18 Nov 2024 18:25 UTC

25 points

24 comments · 1 min read · LW link

(chrislakin.blog)

How likely is brain preservation to work?

Andy_McKenzie

18 Nov 2024 16:58 UTC

25 points

3 comments · 6 min read · LW link

Why imperfect adversarial robustness doesn’t doom AI control

Buck and Claude+

18 Nov 2024 16:05 UTC

61 points

26 comments · 2 min read · LW link

Ethical Implications of the Quantum Multiverse

Jonah Wilberg

18 Nov 2024 16:00 UTC

7 points

22 comments · 6 min read · LW link

Reducing x-risk might be actively harmful

MountainPath

18 Nov 2024 14:25 UTC

3 points

5 comments · 1 min read · LW link

Monthly Roundup #24: November 2024

Zvi

18 Nov 2024 13:20 UTC

43 points

14 comments · 50 min read · LW link

(thezvi.wordpress.com)

A Straightforward Explanation of the Good Regulator Theorem

Alfred Harwood

18 Nov 2024 12:45 UTC

24 points

3 comments · 14 min read · LW link

The Choice Transition

owencb and Raymond D

18 Nov 2024 12:30 UTC

44 points

4 comments · 15 min read · LW link

(strangecities.substack.com)

Chat Bankman-Fried: an Exploration of LLM Alignment in Finance

claudia.biancotti

18 Nov 2024 9:38 UTC

26 points

4 comments · 1 min read · LW link

Proposal to increase fertility: University parent clubs

Fluffnutt

18 Nov 2024 4:21 UTC

17 points

3 comments · 1 min read · LW link

A small improvement to Wikipedia page on Pareto Efficiency

ektimo

18 Nov 2024 2:13 UTC

7 points

0 comments · 1 min read · LW link

[Question] Why is Gemini telling the user to die?

Burny

18 Nov 2024 1:44 UTC

13 points

1 comment · 1 min read · LW link

“It’s a 10% chance which I did 10 times, so it should be 100%”

egor.timatkov

18 Nov 2024 1:14 UTC

149 points

57 comments · 2 min read · LW link

The Catastrophe of Shiny Objects

mindprison

18 Nov 2024 0:24 UTC

−12 points

0 comments · 3 min read · LW link

Do Deep Neural Networks Have Brain-like Representations?: A Summary of Disagreements

Joseph Emerson

18 Nov 2024 0:07 UTC

9 points

0 comments · 26 min read · LW link

Truth Terminal: A reconstruction of events

crvr.fr and MTorrents

17 Nov 2024 23:51 UTC

1 point

1 comment · 7 min read · LW link

Which AI Safety Benchmark Do We Need Most in 2025?

Loïc Cabannes and William Ludington

17 Nov 2024 23:50 UTC

2 points

2 comments · 8 min read · LW link

“The Solomonoff Prior is Malign” is a special case of a simpler argument

David Matolcsi

17 Nov 2024 21:32 UTC

124 points

44 comments · 12 min read · LW link

Chess As The Model Game

criticalpoints

17 Nov 2024 19:45 UTC

19 points

0 comments · 8 min read · LW link

(eregis.github.io)

The grass is always greener in the environment that shaped your values

Karl Faulks

17 Nov 2024 18:00 UTC

8 points

0 comments · 3 min read · LW link

Announcing turntrout.com, my new digital home

TurnTrout

17 Nov 2024 17:42 UTC

107 points

24 comments · 1 min read · LW link

(turntrout.com)

Secular Solstice Songbook Update

jefftk

17 Nov 2024 17:30 UTC

14 points

2 comments · 1 min read · LW link

(www.jefftk.com)

Germany-wide ACX Meetup

Fernand0

17 Nov 2024 10:08 UTC

4 points

0 comments · 1 min read · LW link

Project Adequate: Seeking Cofounders/Funders

Lorec

17 Nov 2024 3:12 UTC

5 points

7 comments · 8 min read · LW link

Trying Bluesky

jefftk

17 Nov 2024 2:50 UTC

26 points

17 comments · 1 min read · LW link

(www.jefftk.com)

AXRP Episode 38.1 - Alan Chan on Agent Infrastructure

DanielFilan

16 Nov 2024 23:30 UTC

12 points

0 comments · 14 min read · LW link

Cross-context abduction: LLMs make inferences about procedural training data leveraging declarative facts in earlier training data

Sohaib Imran

16 Nov 2024 23:22 UTC

36 points

11 comments · 14 min read · LW link

Why We Wouldn’t Build Aligned AI Even If We Could

Snowyiu

16 Nov 2024 20:19 UTC

10 points

7 comments · 10 min read · LW link

[Question] What (if anything) made your p(doom) go down in 2024?

Satron

16 Nov 2024 16:46 UTC

4 points

6 comments · 1 min read · LW link

Gwerns

Tomás B.

16 Nov 2024 14:31 UTC

20 points

2 comments · 1 min read · LW link

Which evals resources would be good?

Marius Hobbhahn

16 Nov 2024 14:24 UTC

47 points

4 comments · 5 min read · LW link