The goal of physics

Jim Pivarski 2 Sep 2023 23:08 UTC

46 points

4 comments · 5 min read · LW link

Will value of paid sex drop right before the end of the world?

azamatvaliev 2 Sep 2023 19:03 UTC

−13 points

0 comments · 4 min read · LW link

PIBBSS Summer Symposium 2023

Nora_Ammann and DusanDNesic

2 Sep 2023 17:22 UTC

25 points

2 comments · 3 min read · LW link

The smallest possible button (or: moth traps!)

Neil 2 Sep 2023 15:24 UTC

122 points

17 comments · 3 min read · LW link

(neilwarren.substack.com)

Steven Harnad: Symbol grounding and the structure of dictionaries

Bill Benzon 2 Sep 2023 12:28 UTC

5 points

3 comments · 2 min read · LW link

Is Metaethics Unnecessary Given Intent-Aligned AI?

CBiddulph 2 Sep 2023 9:48 UTC

10 points

0 comments · 7 min read · LW link

Rational Agents Cooperate in the Prisoner’s Dilemma

Isaac King 2 Sep 2023 6:15 UTC

17 points

66 comments · 12 min read · LW link

[Linkpost] Large language models converge toward human-like concept organization

Bogdan Ionut Cirstea 2 Sep 2023 6:00 UTC

22 points

1 comment · 1 min read · LW link

Plum Cooking Temperature

jefftk 2 Sep 2023 1:30 UTC

11 points

0 comments · 1 min read · LW link

(www.jefftk.com)

[Question] What did you learn from leaked documents?

wassname 2 Sep 2023 1:28 UTC

10 points

7 comments · 1 min read · LW link

One Minute Every Moment

abramdemski 1 Sep 2023 20:23 UTC

125 points

23 comments · 3 min read · LW link

Tensor Trust: An online game to uncover prompt injection vulnerabilities

Luke Bailey and qxcv

1 Sep 2023 19:31 UTC

30 points

0 comments · 5 min read · LW link

(tensortrust.ai)

Reproducing ARC Evals’ recent report on language model agents

Thomas Broadley 1 Sep 2023 16:52 UTC

103 points

17 comments · 3 min read · LW link

(thomasbroadley.com)

[Question] Why aren’t more people in AIS familiar with PDP?

Prometheus 1 Sep 2023 15:27 UTC

4 points

9 comments · 1 min read · LW link

AGI isn’t just a technology

Seth Herd 1 Sep 2023 14:35 UTC

18 points

12 comments · 2 min read · LW link

Can an LLM identify ring-composition in a literary text? [ChatGPT]

Bill Benzon 1 Sep 2023 14:18 UTC

4 points

2 comments · 11 min read · LW link

What is OpenAI’s plan for making AI Safer?

brook 1 Sep 2023 11:15 UTC

6 points

0 comments · 4 min read · LW link

(aisafetyexplained.substack.com)

Progress links digest, 2023-09-01: How ancient people manipulated water, and more

jasoncrawford 1 Sep 2023 4:33 UTC

13 points

4 comments · 6 min read · LW link

(rootsofprogress.org)

A Golden Age of Building? Excerpts and lessons from Empire State, Pentagon, Skunk Works and SpaceX

jacobjacob 1 Sep 2023 4:03 UTC

188 points

26 comments · 24 min read · LW link · 1 review

[Question] Would AI experts ever agree that AGI systems have attained “consciousness”?

Super AGI 1 Sep 2023 3:57 UTC

−16 points

6 comments · 1 min read · LW link

Meta Questions about Metaphilosophy

Wei Dai 1 Sep 2023 1:17 UTC

155 points

78 comments · 3 min read · LW link

[Linkpost] Michael Nielsen remarks on ‘Oppenheimer’

22tom 31 Aug 2023 15:46 UTC

78 points

7 comments · 2 min read · LW link

(michaelnotebook.com)

My thoughts on AI and personal future plan after learning about AI Safety for 4 months

Ziyue Wang 31 Aug 2023 15:32 UTC

7 points

0 comments · 4 min read · LW link

Which Questions Are Anthropic Questions?

dadadarren 31 Aug 2023 15:15 UTC

16 points

13 comments · 3 min read · LW link

The Tree of Life, and a Note on Job

Bill Benzon 31 Aug 2023 14:03 UTC

13 points

7 comments · 4 min read · LW link

Cleaning a SoundCraft Mixer

jefftk 31 Aug 2023 13:20 UTC

11 points

0 comments · 1 min read · LW link

(www.jefftk.com)

AI #27: Portents of Gemini

Zvi 31 Aug 2023 12:40 UTC

54 points

37 comments · 47 min read · LW link

(thezvi.wordpress.com)

[CANCELLED DUE TO ILLNESS] San Francisco ACX Meetup “First Saturday”

guenael 31 Aug 2023 12:34 UTC

1 point

0 comments · 1 min read · LW link

Long-Term Future Fund Ask Us Anything (September 2023)

Linch, calebp99, abergal, habryka, Thomas Larsen, LawrenceC and Lauro Langosco

31 Aug 2023 0:28 UTC

33 points

6 comments · 1 min read · LW link

(forum.effectivealtruism.org)

Responses to apparent rationalist confusions about game / decision theory

Anthony DiGiovanni 30 Aug 2023 22:02 UTC

142 points

14 comments · 12 min read · LW link

Invulnerable Incomplete Preferences: A Formal Statement

SCP 30 Aug 2023 21:59 UTC

134 points

38 comments · 35 min read · LW link

Report on Frontier Model Training

YafahEdelman 30 Aug 2023 20:02 UTC

122 points

21 comments · 21 min read · LW link

(docs.google.com)

An adversarial example for Direct Logit Attribution: memory management in gelu-4l

Can, Yeu-Tong Lau, James Dao and Jett Janiak

30 Aug 2023 17:36 UTC

17 points

0 comments · 8 min read · LW link

(arxiv.org)

A Letter to the Editor of MIT Technology Review

Jeffs 30 Aug 2023 16:59 UTC

0 points

0 comments · 2 min read · LW link

Biosecurity Culture, Computer Security Culture

jefftk 30 Aug 2023 16:40 UTC

103 points

11 comments · 2 min read · LW link

(www.jefftk.com)

Why I hang out at LessWrong and why you should check-in there every now and then

Bill Benzon 30 Aug 2023 15:20 UTC

16 points

5 comments · 5 min read · LW link

“Wanting” and “liking”

Mateusz Bagiński 30 Aug 2023 14:52 UTC

23 points

3 comments · 29 min read · LW link

Open Call for Research Assistants in Developmental Interpretability

Jesse Hoogland, Daniel Murfet, Alexander Gietelink Oldenziel and Stan van Wingerden

30 Aug 2023 9:02 UTC

55 points

11 comments · 4 min read · LW link

LTFF and EAIF are unusually funding-constrained right now

Linch and calebp99

30 Aug 2023 1:03 UTC

90 points

24 comments · 15 min read · LW link

(forum.effectivealtruism.org)

Paper Walkthrough: Automated Circuit Discovery with Arthur Conmy

Neel Nanda 29 Aug 2023 22:07 UTC

36 points

1 comment · 1 min read · LW link

(www.youtube.com)

An OV-Coherent Toy Model of Attention Head Superposition

Lauren Greenspan and keith_wynroe

29 Aug 2023 19:44 UTC

26 points

2 comments · 6 min read · LW link

The Economics of the Asteroid Deflection Problem (Dominant Assurance Contracts)

moyamo 29 Aug 2023 18:28 UTC

78 points

71 comments · 15 min read · LW link

The Epistemic Authority of Deep Learning Pioneers

Dylan Bowman 29 Aug 2023 18:14 UTC

8 points

2 comments · 3 min read · LW link

Democratic Fine-Tuning

Joe Edelman 29 Aug 2023 18:13 UTC

22 points

2 comments · 1 min read · LW link

(open.substack.com)

Should rationalists (be seen to) win?

Will_Pearson 29 Aug 2023 18:13 UTC

6 points

7 comments · 1 min read · LW link

Frankfurt meetup

sultan 29 Aug 2023 18:10 UTC

2 points

0 comments · 1 min read · LW link

Istanbul meetup

sultan 29 Aug 2023 18:10 UTC

2 points

0 comments · 1 min read · LW link

Broken Benchmark: MMLU

awg 29 Aug 2023 18:09 UTC

24 points

5 comments · 1 min read · LW link

(www.youtube.com)

AISN #20: LLM Proliferation, AI Deception, and Continuing Drivers of AI Capabilities

aogara and Dan H

29 Aug 2023 15:07 UTC

12 points

0 comments · 8 min read · LW link

(newsletter.safe.ai)

Loft Bed Fan Guard

jefftk 29 Aug 2023 13:30 UTC

16 points

3 comments · 1 min read · LW link

(www.jefftk.com)