FixDT

abramdemski 30 Nov 2023 21:57 UTC

56 points

14 comments, 14 min read, LW link

Generalization, from thermodynamics to statistical physics

Jesse Hoogland 30 Nov 2023 21:28 UTC

63 points

9 comments, 28 min read, LW link

What’s next for the field of Agent Foundations?

Nora_Ammann, Alexander Gietelink Oldenziel and mattmacdermott

30 Nov 2023 17:55 UTC

59 points

23 comments, 10 min read, LW link

A Proposed Cure for Alzheimer’s Disease???

MadHatter 30 Nov 2023 17:37 UTC

4 points

30 comments, 2 min read, LW link

AI #40: A Vision from Vitalik

Zvi 30 Nov 2023 17:30 UTC

53 points

12 comments, 42 min read, LW link

(thezvi.wordpress.com)

Is scheming more likely in models trained to have long-term goals? (Sections 2.2.4.1-2.2.4.2 of “Scheming AIs”)

Joe Carlsmith 30 Nov 2023 16:43 UTC

8 points

0 comments, 6 min read, LW link

A Formula for Violence (and Its Antidote)

MadHatter 30 Nov 2023 16:04 UTC

−22 points

6 comments, 1 min read, LW link

(blog.simpleheart.org)

Enkrateia: a safe model-based reinforcement learning algorithm

MadHatter 30 Nov 2023 15:51 UTC

−15 points

4 comments, 2 min read, LW link

(github.com)

Normative Ethics vs Utilitarianism

Logan Zoellner 30 Nov 2023 15:36 UTC

6 points

0 comments, 2 min read, LW link

(midwitalignment.substack.com)

Information-Theoretic Boxing of Superintelligences

JustinShovelain and Elliot Mckernon

30 Nov 2023 14:31 UTC

30 points

0 comments, 7 min read, LW link

OpenAI: Altman Returns

Zvi 30 Nov 2023 14:10 UTC

66 points

12 comments, 11 min read, LW link

(thezvi.wordpress.com)

[Linkpost] Remarks on the Convergence in Distribution of Random Neural Networks to Gaussian Processes in the Infinite Width Limit

carboniferous_umbraculum 30 Nov 2023 14:01 UTC

9 points

0 comments, 1 min read, LW link

(drive.google.com)

[Question] Buy Nothing Day is a great idea with a terrible app— why has nobody built a killer app for crowdsourced ‘effective communism’ yet?

lillybaeum 30 Nov 2023 13:47 UTC

8 points

17 comments, 1 min read, LW link

[Question] Comprehensible Input is the only way people learn languages—is it the only way people learn?

lillybaeum 30 Nov 2023 13:31 UTC

8 points

2 comments, 3 min read, LW link

Some Intuitions for the Ethicophysics

MadHatter and mishka

30 Nov 2023 6:47 UTC

2 points

4 comments, 8 min read, LW link

The Alignment Agenda THEY Don’t Want You to Know About

MadHatter 30 Nov 2023 4:29 UTC

−18 points

16 comments, 1 min read, LW link

Cis fragility

[deactivated] 30 Nov 2023 4:14 UTC

−51 points

9 comments, 3 min read, LW link

Homework Answer: Glicko Ratings for War

MadHatter 30 Nov 2023 4:08 UTC

−29 points

1 comment, 77 min read, LW link

(gist.github.com)

[Question] Feature Request for LessWrong

MadHatter 30 Nov 2023 3:19 UTC

11 points

8 comments, 1 min read, LW link

My Alignment Research Agenda (“the Ethicophysics”)

MadHatter 30 Nov 2023 2:57 UTC

−13 points

0 comments, 1 min read, LW link

[Question] Stupid Question: Why am I getting consistently downvoted?

MadHatter 30 Nov 2023 0:21 UTC

18 points

124 comments, 1 min read, LW link

Inositol Non-Results

Elizabeth 29 Nov 2023 21:40 UTC

20 points

2 comments, 1 min read, LW link

(acesounderglass.com)

Losing Metaphors: Zip and Paste

jefftk 29 Nov 2023 20:31 UTC

26 points

6 comments, 1 min read, LW link

(www.jefftk.com)

Preserving our heritage: Building a movement and a knowledge ark for current and future generations

rnk8 29 Nov 2023 19:20 UTC

0 points

5 comments, 12 min read, LW link

AGI Alignment is Absurd

Youssef Mohamed 29 Nov 2023 19:11 UTC

−9 points

4 comments, 3 min read, LW link

The origins of the steam engine: An essay with interactive animated diagrams

jasoncrawford 29 Nov 2023 18:30 UTC

30 points

1 comment, 1 min read, LW link

(rootsofprogress.org)

ChatGPT 4 solved all the gotcha problems I posed that tripped ChatGPT 3.5

VipulNaik 29 Nov 2023 18:11 UTC

33 points

16 comments, 14 min read, LW link

“Clean” vs. “messy” goal-directedness (Section 2.2.3 of “Scheming AIs”)

Joe Carlsmith 29 Nov 2023 16:32 UTC

29 points

1 comment, 11 min read, LW link

Lying Alignment Chart

Zack_M_Davis 29 Nov 2023 16:15 UTC

76 points

17 comments, 1 min read, LW link

Rethink Priorities: Seeking Expressions of Interest for Special Projects Next Year

kierangreig 29 Nov 2023 13:59 UTC

4 points

0 comments, 5 min read, LW link

[Question] Thoughts on teletransportation with copies?

titotal 29 Nov 2023 12:56 UTC

15 points

13 comments, 1 min read, LW link

Interpretability with Sparse Autoencoders (Colab exercises)

CallumMcDougall 29 Nov 2023 12:56 UTC

74 points

9 comments, 4 min read, LW link

The 101 Space You Will Always Have With You

Screwtape 29 Nov 2023 4:56 UTC

250 points

20 comments, 6 min read, LW link

Trust your intuition—Kahneman’s book misses the forest for the trees

mnvr 29 Nov 2023 4:37 UTC

−2 points

2 comments, 2 min read, LW link

Process Substitution Without Shell?

jefftk 29 Nov 2023 3:20 UTC

19 points

18 comments, 2 min read, LW link

(www.jefftk.com)

Deception Chess: Game #2

Zane 29 Nov 2023 2:43 UTC

29 points

17 comments, 2 min read, LW link

Black Box Biology

GeneSmith 29 Nov 2023 2:27 UTC

62 points

30 comments, 2 min read, LW link

[Question] What would be the shelf life of nuclear weapon-secrecy if nuclear weapons had not immediately been used in combat?

Gram Stone 29 Nov 2023 0:53 UTC

7 points

2 comments, 1 min read, LW link

Scaling laws for dominant assurance contracts

jessicata 28 Nov 2023 23:11 UTC

36 points

5 comments, 7 min read, LW link

(unstableontology.com)

I’m confused about innate smell neuroanatomy

Steven Byrnes 28 Nov 2023 20:49 UTC

39 points

2 comments, 9 min read, LW link

How to Control an LLM’s Behavior (why my P(DOOM) went down)

RogerDearnaley 28 Nov 2023 19:56 UTC

64 points

30 comments, 11 min read, LW link

[Question] Is there a word for discrimination against A.I.?

Aaron Bohannon 28 Nov 2023 19:03 UTC

1 point

4 comments, 1 min read, LW link

Update #2 to “Dominant Assurance Contract Platform”: EnsureDone

moyamo 28 Nov 2023 18:02 UTC

33 points

2 comments, 1 min read, LW link

Ethicophysics II: Politics is the Mind-Savior

MadHatter 28 Nov 2023 16:27 UTC

−9 points

9 comments, 4 min read, LW link

(bittertruths.substack.com)

Neither EA nor e/acc is what we need to build the future

jasoncrawford 28 Nov 2023 16:04 UTC

0 points

22 comments, 3 min read, LW link

(rootsofprogress.org)

Agentic Growth

Logan Kieller 28 Nov 2023 15:45 UTC

1 point

0 comments, 3 min read, LW link

(logankieller.substack.com)

AISC project: How promising is automating alignment research? (literature review)

Bogdan Ionut Cirstea 28 Nov 2023 14:47 UTC

4 points

1 comment, 1 min read, LW link

(docs.google.com)

A day in the life of a mechanistic interpretability researcher

Bill Benzon 28 Nov 2023 14:45 UTC

3 points

3 comments, 1 min read, LW link

Two sources of beyond-episode goals (Section 2.2.2 of “Scheming AIs”)

Joe Carlsmith 28 Nov 2023 13:49 UTC

11 points

1 comment, 15 min read, LW link

Self-Referential Probabilistic Logic Admits the Payor’s Lemma

Yudhister Kumar 28 Nov 2023 10:27 UTC

80 points

14 comments, 6 min read, LW link