How Often Does Taking Away Options Help?

niplav

21 Sep 2024 21:52 UTC

20 points

6 comments · 2 min read · LW link

A Rational Company—Seeking Advisors

AlignmentOptimizer

21 Sep 2024 19:51 UTC

0 points

1 comment · 1 min read · LW link

Seeking mentorship

Kevin Afachao

21 Sep 2024 16:54 UTC

5 points

0 comments · 1 min read · LW link

Applications of Chaos: Saying No (with Hastings Greer)

Elizabeth

21 Sep 2024 16:30 UTC

50 points

16 comments · 2 min read · LW link

(acesounderglass.com)

Investigating an insurance-for-AI startup

L Rudolf L and Florence Hinder

21 Sep 2024 15:29 UTC

69 points

0 comments · 16 min read · LW link

(www.strataoftheworld.com)

An Unmeasured Song of Measurement

jan Sijan

21 Sep 2024 15:08 UTC

−3 points

0 comments · 4 min read · LW link

Should Sports Betting Be Banned?

Maxwell Tabarrok

21 Sep 2024 14:13 UTC

18 points

2 comments · 4 min read · LW link

(www.maximum-progress.com)

Work with me on agent foundations: independent fellowship

Alex_Altair

21 Sep 2024 13:59 UTC

49 points

5 comments · 3 min read · LW link

Electric Mandola

jefftk

21 Sep 2024 13:40 UTC

9 points

0 comments · 1 min read · LW link

(www.jefftk.com)

Glitch Token Catalog - (Almost) a Full Clear

Lao Mein

21 Sep 2024 12:22 UTC

38 points

3 comments · 37 min read · LW link

The Other Existential Crisis

James Stephen Brown

21 Sep 2024 1:16 UTC

9 points

24 comments · 2 min read · LW link

Apply to MATS 7.0!

Ryan Kidd and K Richards

21 Sep 2024 0:23 UTC

31 points

0 comments · 5 min read · LW link

Moscow – ACX Meetups Everywhere Fall 2024

red-hara

20 Sep 2024 23:03 UTC

−1 points

0 comments · 1 min read · LW link

Validating / finding alignment-relevant concepts using neural data

Bogdan Ionut Cirstea

20 Sep 2024 21:12 UTC

7 points

0 comments · 1 min read · LW link

(docs.google.com)

Augmenting Statistical Models with Natural Language Parameters

jsteinhardt

20 Sep 2024 18:30 UTC

34 points

0 comments · 8 min read · LW link

(bounded-regret.ghost.io)

Fun With The Tabula Muris (Senis)

sarahconstantin

20 Sep 2024 18:20 UTC

25 points

0 comments · 8 min read · LW link

(sarahconstantin.substack.com)

My Critique of Effective Altruism

Dylan Price

20 Sep 2024 17:41 UTC

−10 points

7 comments · 4 min read · LW link

[Question] Why be moral if we can’t measure how moral we are? Is it even possible to measure morality?

OKlogic

20 Sep 2024 17:40 UTC

−2 points

0 comments · 3 min read · LW link

On Measuring Intellectual Performance—personal experience and several thoughts

Alexander Gufan

20 Sep 2024 17:21 UTC

3 points

2 comments · 8 min read · LW link

Introduction to Super Powers (for kids!)

Shoshannah Tekofsky

20 Sep 2024 17:17 UTC

25 points

0 comments · 3 min read · LW link

(kidquest.substack.com)

Collapsing “Collapsing the Belief/Knowledge Distinction”

Jeremias

20 Sep 2024 16:11 UTC

3 points

0 comments · 4 min read · LW link

A New Class of Glitch Tokens—BPE Subtoken Artifacts (BSA)

Lao Mein

20 Sep 2024 13:13 UTC

37 points

7 comments · 5 min read · LW link

o1-preview is pretty good at doing ML on an unknown dataset

Håvard Tveit Ihle

20 Sep 2024 8:39 UTC

67 points

1 comment · 2 min read · LW link

Moral Trade, Impact Distributions and Large Worlds

Larks

20 Sep 2024 3:45 UTC

7 points

0 comments · 1 min read · LW link

Keyboard Gremlins

jefftk

20 Sep 2024 2:30 UTC

10 points

0 comments · 2 min read · LW link

(www.jefftk.com)

The case for more Alignment Target Analysis (ATA)

Chi Nguyen and ThomasCederborg

20 Sep 2024 1:14 UTC

25 points

13 comments · 17 min read · LW link

Piling bounded arguments

momom2

19 Sep 2024 22:27 UTC

7 points

0 comments · 4 min read · LW link

We Don’t Know Our Own Values, but Reward Bridges The Is-Ought Gap

johnswentworth and David Lorell

19 Sep 2024 22:22 UTC

47 points

47 comments · 5 min read · LW link

Interested in Cognitive Bootcamp?

Raemon

19 Sep 2024 22:12 UTC

48 points

0 comments · 2 min read · LW link

Just How Good Are Modern Chess Computers?

nem

19 Sep 2024 18:57 UTC

10 points

1 comment · 6 min read · LW link

RLHF is the worst possible thing done when facing the alignment problem

tailcalled

19 Sep 2024 18:56 UTC

32 points

10 comments · 6 min read · LW link

AISafety.info: What are Inductive Biases?

Algon

19 Sep 2024 17:26 UTC

11 points

4 comments · 2 min read · LW link

(aisafety.info)

Physics of Language models (part 2.1)

Nathan Helm-Burger

19 Sep 2024 16:48 UTC

9 points

2 comments · 1 min read · LW link

(youtu.be)

Why good things often don’t lead to better outcomes

DMMF

19 Sep 2024 16:37 UTC

16 points

1 comment · 4 min read · LW link

(danfrank.ca)

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Bogdan Ionut Cirstea

19 Sep 2024 16:13 UTC

21 points

1 comment · 1 min read · LW link

(arxiv.org)

Laziness death spirals

PatrickDFarley

19 Sep 2024 15:58 UTC

247 points

35 comments · 8 min read · LW link

[Intuitive self-models] 1. Preliminaries

Steven Byrnes

19 Sep 2024 13:45 UTC

88 points

20 comments · 15 min read · LW link

AI #82: The Governor Ponders

Zvi

19 Sep 2024 13:30 UTC

50 points

8 comments · 27 min read · LW link

(thezvi.wordpress.com)

Slave Morality: A place for every man and every man in his place

Martin Sustrik

19 Sep 2024 4:20 UTC

16 points

7 comments · 2 min read · LW link

(250bpm.substack.com)

Which LessWrong/Alignment topics would you like to be tutored in? [Poll]

Ruby

19 Sep 2024 1:35 UTC

43 points

12 comments · 1 min read · LW link

The Obliqueness Thesis

jessicata

19 Sep 2024 0:26 UTC

77 points

17 comments · 17 min read · LW link

How to choose what to work on

jasoncrawford

18 Sep 2024 20:39 UTC

22 points

6 comments · 4 min read · LW link

(blog.rootsofprogress.org)

Intention-to-Treat (Re: How harmful is music, really?)

kqr

18 Sep 2024 18:44 UTC

11 points

0 comments · 5 min read · LW link

(entropicthoughts.com)

The case for a negative alignment tax

Cameron Berg, Judd Rosenblatt, Diogo de Lucena and AE Studio

18 Sep 2024 18:33 UTC

74 points

20 comments · 7 min read · LW link

Endogenous Growth and Human Intelligence

Nicholas D.

18 Sep 2024 14:05 UTC

3 points

0 comments · 2 min read · LW link

Inquisitive vs. adversarial rationality

gb

18 Sep 2024 13:50 UTC

6 points

9 comments · 2 min read · LW link

Pronouns are Annoying

ymeskhout

18 Sep 2024 13:30 UTC

15 points

21 comments · 4 min read · LW link

(www.ymeskhout.com)

Is “superhuman” AI forecasting BS? Some experiments on the “539” bot from the Centre for AI Safety

titotal

18 Sep 2024 13:07 UTC

78 points

3 comments · 1 min read · LW link

(open.substack.com)

Knowledge’s practicability

Ted Nguyễn

18 Sep 2024 2:31 UTC

−5 points

0 comments · 7 min read · LW link

(tednguyen.substack.com)

Skills from a year of Purposeful Rationality Practice

Raemon

18 Sep 2024 2:05 UTC

185 points

18 comments · 7 min read · LW link