ACI#6: A Non-Dualistic ACI Model

Akira Pyinya 9 Nov 2023 23:01 UTC

10 points

2 comments 6 min read LW link

How I got so excited about HowTruthful

Bruce Lewis 9 Nov 2023 18:49 UTC

17 points

3 comments 5 min read LW link

The case for “Generous Tit for Tat” as the ultimate game theory strategy

positivesum 9 Nov 2023 18:41 UTC

2 points

3 comments 8 min read LW link

(tryingtruly.substack.com)

Text Posts from the Kids Group: 2021

jefftk 9 Nov 2023 17:50 UTC

38 points

1 comment 8 min read LW link

(www.jefftk.com)

AI #37: Moving Too Fast

Zvi 9 Nov 2023 17:50 UTC

53 points

5 comments 76 min read LW link

(thezvi.wordpress.com)

Learning-theoretic agenda reading list

Vanessa Kosoy 9 Nov 2023 17:25 UTC

98 points

0 comments 2 min read LW link

Open-ended/Phenomenal Ethics (TLDR)

Ryo 9 Nov 2023 16:58 UTC

3 points

0 comments 1 min read LW link

Polysemantic Attention Head in a 4-Layer Transformer

Jett Janiak, cmathw and StefanHex

9 Nov 2023 16:16 UTC

51 points

0 comments 6 min read LW link

On OpenAI Dev Day

Zvi 9 Nov 2023 16:10 UTC

60 points

0 comments 15 min read LW link

(thezvi.wordpress.com)

Antropical Probabilities Are Fully Explained by Difference in Possible Outcomes

Ape in the coat 9 Nov 2023 15:34 UTC

19 points

7 comments 5 min read LW link

A free to enter, 240 character, open-source iterated prisoner’s dilemma tournament

Isaac King 9 Nov 2023 8:24 UTC

64 points

19 comments 1 min read LW link

(manifold.markets)

Into AI Safety Episodes 1 & 2

jacobhaimes 9 Nov 2023 4:36 UTC

2 points

0 comments 1 min read LW link

(into-ai-safety.github.io)

Making Bad Decisions On Purpose

Screwtape 9 Nov 2023 3:36 UTC

48 points

8 comments 5 min read LW link

Metaculus’s New Sidebar Helps You Find Forecasts Faster

ChristianWilliams 8 Nov 2023 20:56 UTC

15 points

0 comments 1 min read LW link

(www.metaculus.com)

Open-ended ethics of phenomena (a desiderata with universal morality)

Ryo 8 Nov 2023 20:10 UTC

1 point

0 comments 8 min read LW link

Deconfusing “ontology” in AI alignment

Dylan Bowman 8 Nov 2023 20:03 UTC

28 points

3 comments 7 min read LW link

Open Agency model can solve the AI regulation dilemma

Roman Leventov 8 Nov 2023 20:00 UTC

22 points

1 comment 2 min read LW link

Gothenburg LW / ACX meetup

Stefan 8 Nov 2023 19:52 UTC

1 point

0 comments 1 min read LW link

[Question] Why is lesswrong blocking wget and curl (scrape)?

nick lacombe 8 Nov 2023 19:42 UTC

21 points

12 comments 1 min read LW link

[Question] Is there a lesswrong archive of all public posts?

nick lacombe 8 Nov 2023 19:26 UTC

12 points

7 comments 1 min read LW link

Five projects from AI Safety Hub Labs 2023

charlie_griffin 8 Nov 2023 19:19 UTC

47 points

1 comment 6 min read LW link

(www.aisafetyhub.org)

[Question] Can a stupid person become intelligent?

A. T. 8 Nov 2023 19:01 UTC

12 points

24 comments 2 min read LW link

Prosthetic Intelligence

Krantz 8 Nov 2023 19:01 UTC

4 points

9 comments 2 min read LW link

[Question] Do you have a satisfactory workflow for learning about a line of research using GPT4, Claude, etc?

ryan_b 8 Nov 2023 18:05 UTC

9 points

3 comments 1 min read LW link

What’s going on? LLMs and IS-A sentences

Bill Benzon 8 Nov 2023 16:58 UTC

6 points

15 comments 4 min read LW link

[Question] What will happen with real estate prices during a slow takeoff?

Ricardo Meneghin 8 Nov 2023 11:58 UTC

8 points

1 comment 1 min read LW link

Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models

Felix Hofstätter, Francis Rhys Ward, HarrietW, LAThomson, Ollie J, Patrik Bartak and Sam F. Brown

8 Nov 2023 11:37 UTC

49 points

0 comments 18 min read LW link

How well does your research adress the theory-practice gap?

Jonas Hallgren 8 Nov 2023 11:27 UTC

18 points

0 comments 10 min read LW link

Growth and Form in a Toy Model of Superposition

Liam Carroll and Edmund Lau

8 Nov 2023 11:08 UTC

89 points

7 comments 14 min read LW link

Running your own workshop on handling hostile disagreements

Camille Berger 8 Nov 2023 10:28 UTC

12 points

1 comment 7 min read LW link

Thinking By The Clock

Screwtape 8 Nov 2023 7:40 UTC

185 points

27 comments 8 min read LW link

[Question] Impressions from base-GPT-4?

mishka 8 Nov 2023 5:43 UTC

25 points

25 comments 1 min read LW link

Quantopian contest, but for food intake and weight

Lucent 8 Nov 2023 5:41 UTC

40 points

9 comments 3 min read LW link

How I Think, Part Two: Distrusting Individuals

Richard Henage 8 Nov 2023 4:06 UTC

4 points

6 comments 3 min read LW link

How I Think, Part One: Investing in Fun

Richard Henage 8 Nov 2023 4:00 UTC

5 points

2 comments 5 min read LW link

Concrete positive visions for a future without AGI

Max H 8 Nov 2023 3:12 UTC

41 points

28 comments 8 min read LW link

South Bay ACX/LW/EA Meetup & Vegansgiving Potluck

IS 8 Nov 2023 2:30 UTC

10 points

0 comments 1 min read LW link

Progress links digest, 2023-11-07: Techno-optimism and more

jasoncrawford 8 Nov 2023 2:05 UTC

17 points

7 comments 11 min read LW link

(rootsofprogress.org)

Announcing Athena—Women in AI Alignment Research

Claire Short 7 Nov 2023 21:46 UTC

80 points

2 comments 3 min read LW link

Vote on Interesting Disagreements

Ben Pace 7 Nov 2023 21:35 UTC

159 points

129 comments 1 min read LW link

What is democracy for?

Johnstone 7 Nov 2023 18:17 UTC

−5 points

10 comments 7 min read LW link

Scalable And Transferable Black-Box Jailbreaks For Language Models Via Persona Modulation

Soroush Pour, rusheb, Quentin FEUILLADE--MONTIXI, Arush and scasper

7 Nov 2023 17:59 UTC

36 points

2 comments 2 min read LW link

(arxiv.org)

Implementing Decision Theory

justinpombrio 7 Nov 2023 17:55 UTC

22 points

12 comments 3 min read LW link

Mirror, Mirror on the Wall: How Do Forecasters Fare by Their Own Call?

nikos 7 Nov 2023 17:39 UTC

14 points

5 comments 14 min read LW link

Symbiotic self-alignment of AIs.

Spiritus Dei 7 Nov 2023 17:18 UTC

1 point

0 comments 3 min read LW link

AMA: Earning to Give

jefftk 7 Nov 2023 16:20 UTC

53 points

8 comments 1 min read LW link

(www.jefftk.com)

The Stochastic Parrot Hypothesis is debatable for the last generation of LLMs

Quentin FEUILLADE--MONTIXI and Pierre Peigné

7 Nov 2023 16:12 UTC

52 points

20 comments 6 min read LW link

Preface to the Sequence on LLM Psychology

Quentin FEUILLADE--MONTIXI 7 Nov 2023 16:12 UTC

33 points

0 comments 2 min read LW link

What I’ve been reading, November 2023

jasoncrawford 7 Nov 2023 13:37 UTC

23 points

1 comment 5 min read LW link

(rootsofprogress.org)

AI Alignment [Progress] this Week (11/05/2023)

Logan Zoellner 7 Nov 2023 13:26 UTC

24 points

0 comments 4 min read LW link

(midwitalignment.substack.com)