Archive: Page 2
Introducing Shrubgrazer · jefftk · Dec 16, 2022, 2:50 PM · 22 points · 0 comments · 2 min read · LW link (www.jefftk.com)
Paper: Transformers learn in-context by gradient descent · LawrenceC · Dec 16, 2022, 11:10 AM · 28 points · 11 comments · 2 min read · LW link (arxiv.org)
Will Machines Ever Rule the World? MLAISU W50 · Esben Kran · Dec 16, 2022, 11:03 AM · 12 points · 7 comments · 4 min read · LW link (newsletter.apartresearch.com)
AI overhangs depend on whether algorithms, compute and data are substitutes or complements · NathanBarnard · Dec 16, 2022, 2:23 AM · 4 points · 0 comments · 3 min read · LW link
AI Safety Movement Builders should help the community to optimise three factors: contributors, contributions and coordination · peterslattery · Dec 15, 2022, 10:50 PM · 4 points · 0 comments · 6 min read · LW link
Masking to Avoid Missing Things · jefftk · Dec 15, 2022, 9:00 PM · 17 points · 2 comments · 1 min read · LW link (www.jefftk.com)
Consider working more hours and taking more stimulants · Arjun Panickssery · Dec 15, 2022, 8:38 PM · 33 points · 11 comments · LW link
We’ve stepped over the threshold into the Fourth Arena, but don’t recognize it · Bill Benzon · Dec 15, 2022, 8:22 PM · 2 points · 0 comments · 7 min read · LW link
[Question] How is ARC planning to use ELK? · jacquesthibs · Dec 15, 2022, 8:11 PM · 24 points · 5 comments · 1 min read · LW link
How “Discovering Latent Knowledge in Language Models Without Supervision” Fits Into a Broader Alignment Scheme · Collin · Dec 15, 2022, 6:22 PM · 244 points · 39 comments · 16 min read · LW link · 1 review
High-level hopes for AI alignment · HoldenKarnofsky · Dec 15, 2022, 6:00 PM · 58 points · 3 comments · 19 min read · LW link (www.cold-takes.com)
Two Dogmas of LessWrong · omnizoid · Dec 15, 2022, 5:56 PM · −7 points · 155 comments · 69 min read · LW link
Covid 12/15/22: China’s Wave Begins · Zvi · Dec 15, 2022, 4:20 PM · 32 points · 7 comments · 10 min read · LW link (thezvi.wordpress.com)
The next decades might be wild · Marius Hobbhahn · Dec 15, 2022, 4:10 PM · 175 points · 42 comments · 41 min read · LW link · 1 review
Basic building blocks of dependent type theory · Thomas Kehrenberg · Dec 15, 2022, 2:54 PM · 49 points · 9 comments · 13 min read · LW link
AI Neorealism: a threat model & success criterion for existential safety · davidad · Dec 15, 2022, 1:42 PM · 67 points · 1 comment · 3 min read · LW link
Who should write the definitive post on Ziz? · Nicholas / Heather Kross · Dec 15, 2022, 6:37 AM · 4 points · 45 comments · 3 min read · LW link
[Question] Is Paul Christiano still as optimistic about Approval-Directed Agents as he was in 2018? · Chris_Leong · Dec 14, 2022, 11:28 PM · 8 points · 0 comments · 1 min read · LW link
«Boundaries», Part 3b: Alignment problems in terms of boundaries · Andrew_Critch · Dec 14, 2022, 10:34 PM · 72 points · 7 comments · 13 min read · LW link
Aligning alignment with performance · Marv K · Dec 14, 2022, 10:19 PM · 2 points · 0 comments · 2 min read · LW link
Contrary to List of Lethality’s point 22, alignment’s door number 2 · False Name · Dec 14, 2022, 10:01 PM · −2 points · 5 comments · 22 min read · LW link
Kolmogorov Complexity and Simulation Hypothesis · False Name · Dec 14, 2022, 10:01 PM · −3 points · 0 comments · 7 min read · LW link
[Question] Stanley Meyer’s water fuel cell · mikbp · Dec 14, 2022, 9:19 PM · 2 points · 6 comments · 1 min read · LW link
[Question] Is the AI timeline too short to have children? · Yoreth · Dec 14, 2022, 6:32 PM · 38 points · 20 comments · 1 min read · LW link
Predicting GPU performance · Marius Hobbhahn and Tamay · Dec 14, 2022, 4:27 PM · 60 points · 26 comments · 1 min read · LW link (epochai.org)
[Incomplete] What is Computation Anyway? · DragonGod · Dec 14, 2022, 4:17 PM · 16 points · 1 comment · 13 min read · LW link (arxiv.org)
Chair Hanging Peg · jefftk · Dec 14, 2022, 3:30 PM · 11 points · 0 comments · 1 min read · LW link (www.jefftk.com)
My AGI safety research—2022 review, ’23 plans · Steven Byrnes · Dec 14, 2022, 3:15 PM · 51 points · 10 comments · 7 min read · LW link
Extracting and Evaluating Causal Direction in LLMs’ Activations · Fabien Roger and simeon_c · Dec 14, 2022, 2:33 PM · 29 points · 5 comments · 11 min read · LW link
Key Mostly Outward-Facing Facts From the Story of VaccinateCA · Zvi · Dec 14, 2022, 1:30 PM · 61 points · 2 comments · 23 min read · LW link (thezvi.wordpress.com)
Discovering Latent Knowledge in Language Models Without Supervision · Xodarap · Dec 14, 2022, 12:32 PM · 45 points · 1 comment · 1 min read · LW link (arxiv.org)
[Question] COVID China Personal Advice (No mRNA vax, possible hospital overload, bug-chasing edition) · Lao Mein · Dec 14, 2022, 10:31 AM · 20 points · 11 comments · 1 min read · LW link
Beyond a better world · Davidmanheim · Dec 14, 2022, 10:18 AM · 14 points · 7 comments · 4 min read · LW link (progressforum.org)
Proof as mere strong evidence · adamShimi · Dec 14, 2022, 8:56 AM · 28 points · 16 comments · 2 min read · LW link (epistemologicalvigilance.substack.com)
Trying to disambiguate different questions about whether RLHF is “good” · Buck · Dec 14, 2022, 4:03 AM · 108 points · 47 comments · 7 min read · LW link · 1 review
[Question] How can one literally buy time (from x-risk) with money? · Alex_Altair · Dec 13, 2022, 7:24 PM · 24 points · 3 comments · 1 min read · LW link
[Question] Best introductory overviews of AGI safety? · JakubK · Dec 13, 2022, 7:01 PM · 21 points · 9 comments · 2 min read · LW link (forum.effectivealtruism.org)
Applications open for AGI Safety Fundamentals: Alignment Course · Richard_Ngo · Dec 13, 2022, 6:31 PM · 49 points · 0 comments · 2 min read · LW link
What Does It Mean to Align AI With Human Values? · Algon · Dec 13, 2022, 4:56 PM · 8 points · 3 comments · 1 min read · LW link (www.quantamagazine.org)
It Takes Two Paracetamol? · Eli_ · Dec 13, 2022, 4:29 PM · 33 points · 10 comments · 2 min read · LW link
[Interim research report] Taking features out of superposition with sparse autoencoders · Lee Sharkey, Dan Braun and beren · Dec 13, 2022, 3:41 PM · 150 points · 23 comments · 22 min read · LW link · 2 reviews
[Question] Is the ChatGPT-simulated Linux virtual machine real? · Kenoubi · Dec 13, 2022, 3:41 PM · 18 points · 7 comments · 1 min read · LW link
Existential AI Safety is NOT separate from near-term applications · scasper · Dec 13, 2022, 2:47 PM · 37 points · 17 comments · 3 min read · LW link
What is the correlation between upvoting and benefit to readers of LW? · banev · Dec 13, 2022, 2:26 PM · 7 points · 15 comments · 1 min read · LW link
Limits of Superintelligence · Aleksei Petrenko · Dec 13, 2022, 12:19 PM · 1 point · 5 comments · 1 min read · LW link
Bay 2022 Solstice · Raemon · Dec 13, 2022, 8:58 AM · 17 points · 0 comments · 1 min read · LW link
Last day to nominate things for the Review. Also, 2019 books still exist. · Raemon · Dec 13, 2022, 8:53 AM · 15 points · 0 comments · 1 min read · LW link
AI alignment is distinct from its near-term applications · paulfchristiano · Dec 13, 2022, 7:10 AM · 255 points · 21 comments · 2 min read · LW link (ai-alignment.com)
Take 10: Fine-tuning with RLHF is aesthetically unsatisfying. · Charlie Steiner · Dec 13, 2022, 7:04 AM · 37 points · 3 comments · 2 min read · LW link
[Question] Are lawsuits against AGI companies extending AGI timelines? · SlowingAGI · Dec 13, 2022, 6:00 AM · 1 point · 1 comment · 1 min read · LW link