Ice: The Penultimate Frontier

Roko

13 Jul 2024 23:44 UTC

62 points

56 comments, 1 min read, LW link

(transhumanaxiology.substack.com)

Trust as a bottleneck to growing teams quickly

benkuhn

13 Jul 2024 18:00 UTC

42 points

3 comments, 5 min read, LW link

(www.benkuhn.net)

Stitching SAEs of different sizes

Bart Bussmann, Patrick Leask, Joseph Bloom, Curt Tigges and Neel Nanda

13 Jul 2024 17:19 UTC

39 points

12 comments, 12 min read, LW link

Kinds of Motivation

Sable

13 Jul 2024 15:52 UTC

7 points

2 comments, 7 min read, LW link

(affablyevil.substack.com)

A simple case for extreme inner misalignment

Richard_Ngo

13 Jul 2024 15:40 UTC

85 points

41 comments, 7 min read, LW link

Reality Testing

Ben Turtel

13 Jul 2024 15:20 UTC

−2 points

1 comment, 6 min read, LW link

(bturtel.substack.com)

The world is awful. The world is much better. The world can be much better: The Animation.

Writer

13 Jul 2024 14:03 UTC

10 points

0 comments, 1 min read, LW link

(youtu.be)

The Modern Problems with Conformity

Zero Contradictions

13 Jul 2024 8:20 UTC

0 points

5 comments, 1 min read, LW link

(expandingrationality.substack.com)

Designing Artificial Wisdom: GitWise and AlphaWise

Jordan Arel

13 Jul 2024 6:46 UTC

2 points

0 comments, 7 min read, LW link

OpenAI’s Intelligence Levels

infinibot27

13 Jul 2024 6:25 UTC

1 point

0 comments, 1 min read, LW link

(www.bloomberg.com)

Some desirable properties of automated wisdom

Marius Adrian Nicoară

13 Jul 2024 6:05 UTC

3 points

2 comments, 6 min read, LW link

Thought Experiments Website

minmi_drover

13 Jul 2024 4:47 UTC

10 points

11 comments, 1 min read, LW link

A Second Wetsuit Summer

jefftk

13 Jul 2024 2:00 UTC

19 points

2 comments, 1 min read, LW link

(www.jefftk.com)

Timaeus is hiring!

Jesse Hoogland, Stan van Wingerden, Alexander Gietelink Oldenziel and Daniel Murfet

12 Jul 2024 23:42 UTC

67 points

6 comments, 2 min read, LW link

Consider attending the AI Security Forum ’24, a 1-day pre-DEFCON event

Charlie Rogers-Smith

12 Jul 2024 23:01 UTC

21 points

0 comments, 1 min read, LW link

Memorising molecular structures

dkl9

12 Jul 2024 22:40 UTC

6 points

0 comments, 2 min read, LW link

(dkl9.net)

Robin Hanson AI X-Risk Debate — Highlights and Analysis

Liron

12 Jul 2024 21:31 UTC

46 points

7 comments, 45 min read, LW link

(www.youtube.com)

Designing Artificial Wisdom: The Wise Workflow Research Organization

Jordan Arel

12 Jul 2024 19:18 UTC

2 points

0 comments, 8 min read, LW link

Whiteboard Pen Magazines are Useful

Johannes C. Mayer

12 Jul 2024 17:15 UTC

40 points

8 comments, 1 min read, LW link

Alignment: “Do what I would have wanted you to do”

Oleg Trott

12 Jul 2024 16:47 UTC

11 points

48 comments, 1 min read, LW link

Virtue taxation

Dentosal

12 Jul 2024 14:56 UTC

9 points

1 comment, 2 min read, LW link

Most smart and skilled people are outside of the EA/rationalist community: an analysis

titotal

12 Jul 2024 12:13 UTC

107 points

36 comments, 1 min read, LW link

(open.substack.com)

2024 Freedom Communities Events

Tudor Iliescu

12 Jul 2024 8:04 UTC

−6 points

1 comment, 1 min read, LW link

Faithful vs Interpretable Sparse Autoencoder Evals

Louka Ewington-Pitsos

12 Jul 2024 5:37 UTC

2 points

0 comments, 12 min read, LW link

Moving away from physical continuity

ProgramCrafter

12 Jul 2024 5:05 UTC

2 points

1 comment, 1 min read, LW link

Transformer Circuit Faithfulness Metrics Are Not Robust

Joseph Miller, bilalchughtai and William_S

12 Jul 2024 3:47 UTC

104 points

5 comments, 7 min read, LW link

(arxiv.org)

On Artificial Wisdom

Jordan Arel

12 Jul 2024 0:20 UTC

3 points

0 comments, 14 min read, LW link

Yoshua Bengio: Reasoning through arguments against taking AI safety seriously

Judd Rosenblatt

11 Jul 2024 23:53 UTC

70 points

3 comments, 1 min read, LW link

(yoshuabengio.org)

Podcast: “How the Smart Money teaches trading with Ricki Heicklen” (Patrick McKenzie interviewing)

rossry

11 Jul 2024 22:49 UTC

20 points

2 comments, 1 min read, LW link

(www.complexsystemspodcast.com)

Superbabies: Putting The Pieces Together

sarahconstantin

11 Jul 2024 20:40 UTC

215 points

37 comments, 10 min read, LW link

(sarahconstantin.substack.com)

Sherlockian Abduction Master List

Cole Wyeth

11 Jul 2024 20:27 UTC

50 points

63 comments, 33 min read, LW link

Thoughts to niplav on lie-detection, truthfwl mechanisms, and wealth-inequality

Emrik and niplav

11 Jul 2024 18:55 UTC

7 points

8 comments, 11 min read, LW link

Games for AI Control

charlie_griffin and Buck

11 Jul 2024 18:40 UTC

43 points

0 comments, 5 min read, LW link

Video Intro to Guaranteed Safe AI

Mike Vaiana, Diogo de Lucena and AE Studio

11 Jul 2024 17:53 UTC

27 points

0 comments, 1 min read, LW link

(youtu.be)

Effective Empathy

Thac0

11 Jul 2024 15:14 UTC

4 points

1 comment, 1 min read, LW link

AI #72: Denying the Future

Zvi

11 Jul 2024 15:00 UTC

45 points

8 comments, 41 min read, LW link

(thezvi.wordpress.com)

The Best Bits From Build, Baby, Build

Maxwell Tabarrok

11 Jul 2024 14:09 UTC

13 points

0 comments, 4 min read, LW link

(www.maximum-progress.com)

[Question] What Other Lines of Work are Safe from AI Automation?

RogerDearnaley

11 Jul 2024 10:01 UTC

29 points

35 comments, 5 min read, LW link

Decomposing Agency — capabilities without desires

owencb and Raymond D

11 Jul 2024 9:38 UTC

146 points

32 comments, 12 min read, LW link

(strangecities.substack.com)

Reliable Sources: The Story of David Gerard

TracingWoodgrains

10 Jul 2024 19:50 UTC

381 points

53 comments, 43 min read, LW link

Managing Emotional Potential Energy

adamShimi

10 Jul 2024 18:20 UTC

23 points

4 comments, 4 min read, LW link

(epistemologicalfascinations.substack.com)

[EAForum xpost] A breakdown of OpenAI’s revenue

dschwarz and Lawrence Phillips

10 Jul 2024 18:09 UTC

57 points

5 comments, 1 min read, LW link

(forum.effectivealtruism.org)

Solving Pascal’s Wager using dynamic programming

Paul Wilczewski

10 Jul 2024 18:09 UTC

1 point

0 comments, 5 min read, LW link

Fluent, Cruxy Predictions

Raemon

10 Jul 2024 18:00 UTC

85 points

14 comments, 14 min read, LW link

Antitrust as Controlled Creative Destruction

Martin Sustrik

10 Jul 2024 16:40 UTC

14 points

2 comments, 2 min read, LW link

(250bpm.substack.com)

New page: Integrity

Zach Stein-Perlman

10 Jul 2024 15:00 UTC

91 points

3 comments, 1 min read, LW link

AirBnB Baking

jefftk

10 Jul 2024 12:50 UTC

7 points

1 comment, 1 min read, LW link

(www.jefftk.com)

DIY RLHF: A simple implementation for hands on experience

Mike Vaiana and AE Studio

10 Jul 2024 12:07 UTC

28 points

0 comments, 6 min read, LW link

Usefulness grounds truth

invertedpassion

10 Jul 2024 7:58 UTC

0 points

0 comments, 4 min read, LW link

On passing Complete and Honest Ideological Turing Tests (CHITTs)

Aryeh Englander

10 Jul 2024 4:01 UTC

11 points

2 comments, 1 min read, LW link