
Distillation & Pedagogy

Last edit: Aug 21, 2020, 10:31 PM by Raemon

Distillation is the process of taking a complex subject and making it easier to understand. Pedagogy is the method and practice of teaching. A good intellectual pipeline requires not just discovering new ideas, but also making it easier for newcomers to learn them, stand on the shoulders of giants, and discover even more ideas.

Chris Olah, founder of distill.pub, writes in his essay Research Debt:

Programmers talk about technical debt: there are ways to write software that are faster in the short run but problematic in the long run. Managers talk about institutional debt: institutions can grow quickly at the cost of bad practices creeping in. Both are easy to accumulate but hard to get rid of.

Research can also have debt. It comes in several forms:

  • Poor Exposition – Often, there is no good explanation of important ideas and one has to struggle to understand them. This problem is so pervasive that we take it for granted and don’t appreciate how much better things could be.

  • Undigested Ideas – Most ideas start off rough and hard to understand. They become radically easier as we polish them, developing the right analogies, language, and ways of thinking.

  • Bad abstractions and notation – Abstractions and notation are the user interface of research, shaping how we think and communicate. Unfortunately, we often get stuck with the first formalisms to develop even when they’re bad. For example, an object with extra electrons is negative, and pi is wrong.

  • Noise – Being a researcher is like standing in the middle of a construction site. Countless papers scream for your attention and there’s no easy way to filter or summarize them. Because most work is explained poorly, it takes a lot of energy to understand each piece of work. For many papers, one wants a simple one sentence explanation of it, but needs to fight with it to get that sentence. Because the simplest way to get the attention of interested parties is to get everyone’s attention, we get flooded with work. Because we incentivize people being “prolific,” we get flooded with a lot of work… We think noise is the main way experts experience research debt.

The insidious thing about research debt is that it’s normal. Everyone takes it for granted, and doesn’t realize that things could be different. For example, it’s normal to give very mediocre explanations of research, and people perceive that to be the ceiling of explanation quality. On the rare occasions that truly excellent explanations come along, people see them as one-off miracles rather than a sign that we could systematically be doing better.

See also Scholarship and Learning, and Good Explanations.

• How to teach things well (Neel Nanda, Aug 28, 2020, 4:44 PM). 110 points, 17 comments, 15 min read, 1 review. (www.neelnanda.io)
• Research Debt (Elizabeth, Jul 15, 2018, 7:36 PM). 25 points, 2 comments, 1 min read. (distill.pub)
• Ironing Out the Squiggles (Zack_M_Davis, Apr 29, 2024, 4:13 PM). 157 points, 36 comments, 11 min read.
• [Question] What are Examples of Great Distillers? (adamShimi, Nov 12, 2020, 2:09 PM). 35 points, 12 comments, 1 min read.
• Call For Distillers (johnswentworth, Apr 4, 2022, 6:25 PM). 207 points, 43 comments, 3 min read, 1 review.
• Learning how to learn (Neel Nanda, Sep 30, 2020, 4:50 PM). 44 points, 0 comments, 15 min read. (www.neelnanda.io)
• Explainers Shoot High. Aim Low! (Eliezer Yudkowsky, Oct 24, 2007, 1:13 AM). 101 points, 35 comments, 1 min read.
• Abstracts should be either Actually Short™, or broken into paragraphs (Raemon, Mar 24, 2023, 12:51 AM). 93 points, 27 comments, 5 min read.
• The Cave Allegory Revisited: Understanding GPT’s Worldview (Jan_Kulveit, Feb 14, 2023, 4:00 PM). 86 points, 5 comments, 3 min read.
• Infra-Bayesianism Unwrapped (adamShimi, Jan 20, 2021, 1:35 PM). 58 points, 0 comments, 24 min read.
• Davidad’s Bold Plan for Alignment: An In-Depth Explanation (Apr 19, 2023, 4:09 PM). 168 points, 40 comments, 21 min read, 2 reviews.
• The 101 Space You Will Always Have With You (Screwtape, Nov 29, 2023, 4:56 AM). 271 points, 22 comments, 6 min read, 1 review.
• DARPA Digital Tutor: Four Months to Total Technical Expertise? (SebastianG, Jul 6, 2020, 11:34 PM). 220 points, 22 comments, 7 min read.
• Stampy’s AI Safety Info—New Distillations #3 [May 2023] (markov, Jun 6, 2023, 2:18 PM). 16 points, 0 comments, 2 min read. (aisafety.info)
• Stampy’s AI Safety Info—New Distillations #2 [April 2023] (markov, May 9, 2023, 1:31 PM). 25 points, 1 comment, 1 min read. (aisafety.info)
• TAPs for Tutoring (Mark Xu, Dec 24, 2020, 8:46 PM). 27 points, 3 comments, 5 min read.
• Stampy’s AI Safety Info—New Distillations #1 [March 2023] (markov, Apr 7, 2023, 11:06 AM). 42 points, 0 comments, 2 min read. (aisafety.info)
• (Summary) Sequence Highlights—Thinking Better on Purpose (qazzquimby, Aug 2, 2022, 5:45 PM). 33 points, 3 comments, 11 min read.
• Features that make a report especially helpful to me (lukeprog, Apr 14, 2022, 1:12 AM). 40 points, 0 comments, 2 min read.
• Expansive translations: considerations and possibilities (ozziegooen, Sep 18, 2020, 3:39 PM). 43 points, 15 comments, 6 min read.
• Natural Abstractions: Key claims, Theorems, and Critiques (Mar 16, 2023, 4:37 PM). 241 points, 23 comments, 45 min read, 3 reviews.
• A concise sum-up of the basic argument for AI doom (Mergimio H. Doefevmil, Apr 24, 2023, 5:37 PM). 11 points, 6 comments, 2 min read.
• Learning-theoretic agenda reading list (Vanessa Kosoy, Nov 9, 2023, 5:25 PM). 103 points, 1 comment, 2 min read, 1 review.
• Getting rational now or later: navigating procrastination and time-inconsistent preferences for new rationalists (milo_thoughts, Feb 26, 2024, 7:38 PM). 1 point, 0 comments, 8 min read.
• Learning Math in Time for Alignment (Nicholas / Heather Kross, Jan 9, 2024, 1:02 AM). 32 points, 5 comments, 3 min read.
• Uncertainty in all its flavours (Cleo Nardo, Jan 9, 2024, 4:21 PM). 34 points, 6 comments, 35 min read.
• Explaining Impact Markets (Saul Munn, Jan 31, 2024, 9:51 AM). 95 points, 2 comments, 3 min read. (www.brasstacks.blog)
• “Deep Learning” Is Function Approximation (Zack_M_Davis, Mar 21, 2024, 5:50 PM). 98 points, 28 comments, 10 min read. (zackmdavis.net)
• What does Yann LeCun think about AGI? A summary of his talk, “Mathematical Obstacles on the Way to Human-Level AI” (Adam Jones, Apr 5, 2025, 12:21 PM). 11 points, 0 comments, 2 min read.
• Superposition is not “just” neuron polysemanticity (LawrenceC, Apr 26, 2024, 11:22 PM). 66 points, 4 comments, 13 min read.

• AI Safety Strategies Landscape (Charbel-Raphaël, May 9, 2024, 5:33 PM). 34 points, 1 comment, 42 min read.
• How ARENA course material gets made (CallumMcDougall, Jul 2, 2024, 6:04 PM). 41 points, 2 comments, 7 min read.
• An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2 (Neel Nanda, Jul 7, 2024, 5:39 PM). 135 points, 16 comments, 25 min read.
• Dialogue introduction to Singular Learning Theory (Olli Järviniemi, Jul 8, 2024, 4:58 PM). 100 points, 15 comments, 8 min read.
• Poker is a bad game for teaching epistemics. Figgie is a better one. (rossry, Jul 8, 2024, 6:05 AM). 106 points, 47 comments, 11 min read. (blog.rossry.net)
• Podcast: “How the Smart Money teaches trading with Ricki Heicklen” (Patrick McKenzie interviewing) (rossry, Jul 11, 2024, 10:49 PM). 20 points, 2 comments, 1 min read. (www.complexsystemspodcast.com)
• Insights from Euclid’s ‘Elements’ (TurnTrout, May 4, 2020, 3:45 PM). 126 points, 17 comments, 4 min read.
• Graceful Degradation (Screwtape, Nov 5, 2024, 11:57 PM). 83 points, 8 comments, 4 min read.
• The Solomonoff Prior is Malign (Mark Xu, Oct 14, 2020, 1:33 AM). 179 points, 52 comments, 16 min read, 3 reviews.
• Habermas Machine (NicholasKees, Mar 13, 2025, 6:16 PM). 47 points, 7 comments, 6 min read. (mosaic-labs.org)
• Socially Graceful Degradation (Screwtape, Mar 20, 2025, 4:03 AM). 57 points, 9 comments, 9 min read.
• Beren’s “Deconfusing Direct vs Amortised Optimisation” (DragonGod, Apr 7, 2023, 8:57 AM). 52 points, 10 comments, 3 min read.
• But why would the AI kill us? (So8res, Apr 17, 2023, 6:42 PM). 138 points, 96 comments, 2 min read.
• AGI ruin mostly rests on strong claims about alignment and deployment, not about society (Rob Bensinger, Apr 24, 2023, 1:06 PM). 70 points, 8 comments, 6 min read.
• Quick thoughts on the difficulty of widely conveying a non-stereotyped position (Sniffnoy, Mar 27, 2025, 7:30 AM). 12 points, 0 comments, 5 min read.
• Hiatus: EA and LW post summaries (Zoe Williams, May 17, 2023, 5:17 PM). 14 points, 0 comments, 1 min read.
• Expertise and advice (John_Maxwell, May 27, 2012, 1:49 AM). 25 points, 4 comments, 1 min read.

• Cheat sheet of AI X-risk (momom2, Jun 29, 2023, 4:28 AM). 19 points, 1 comment, 7 min read.
• Rationality, Pedagogy, and “Vibes”: Quick Thoughts (Nicholas / Heather Kross, Jul 15, 2023, 2:09 AM). 14 points, 1 comment, 4 min read.
• Announcing AISafety.info’s Write-a-thon (June 16-18) and Second Distillation Fellowship (July 3-October 2) (steven0461, Jun 3, 2023, 2:03 AM). 33 points, 1 comment, 2 min read.
• Join AISafety.info’s Distillation Hackathon (Oct 6-9th) (smallsilo, Oct 1, 2023, 6:43 PM). 21 points, 0 comments, 2 min read. (forum.effectivealtruism.org)
• Avoid Unnecessarily Political Examples (Raemon, Jan 11, 2021, 5:41 AM). 106 points, 42 comments, 3 min read.
• Discovery fiction for the Pythagorean theorem (riceissa, Jan 19, 2021, 2:09 AM). 16 points, 1 comment, 4 min read.
• Inversion of theorems into definitions when generalizing (riceissa, Aug 4, 2019, 5:44 PM). 25 points, 3 comments, 5 min read.
• Think like an educator about code quality (Adam Zerner, Mar 27, 2021, 5:43 AM). 44 points, 8 comments, 9 min read.
• 99% shorter (philh, May 27, 2021, 7:50 PM). 16 points, 0 comments, 6 min read. (reasonableapproximation.net)
• An Apprentice Experiment in Python Programming (Jul 4, 2021, 3:29 AM). 67 points, 4 comments, 9 min read.
• An Apprentice Experiment in Python Programming, Part 2 (Jul 29, 2021, 7:39 AM). 30 points, 18 comments, 10 min read.
• Calibration proverbs (Malmesbury, Jan 11, 2022, 5:11 AM). 76 points, 19 comments, 1 min read.
• [Closed] Job Offering: Help Communicate Infrabayesianism (Mar 23, 2022, 6:35 PM). 129 points, 22 comments, 1 min read.
• Summary: “How to Write Quickly...” by John Wentworth (Pablo Repetto, Apr 11, 2022, 11:26 PM). 4 points, 0 comments, 2 min read. (pabloernesto.github.io)
• [Question] What to include in a guest lecture on existential risks from AI? (Aryeh Englander, Apr 13, 2022, 5:03 PM). 20 points, 9 comments, 1 min read.
• Rationality Dojo (lsusr, Apr 24, 2022, 12:53 AM). 14 points, 5 comments, 1 min read.
• Calling for Student Submissions: AI Safety Distillation Contest (Aris, Apr 24, 2022, 1:53 AM). 48 points, 15 comments, 4 min read.
• Infra-Bayesianism Distillation: Realizability and Decision Theory (Thomas Larsen, May 26, 2022, 9:57 PM). 40 points, 9 comments, 18 min read.
• [Request for Distillation] Coherence of Distributed Decisions With Different Inputs Implies Conditioning (johnswentworth, Apr 25, 2022, 5:01 PM). 22 points, 14 comments, 2 min read.
• How to get people to produce more great exposition? Some strategies and their assumptions (riceissa, May 25, 2022, 10:30 PM). 26 points, 10 comments, 3 min read.
• Exposition as science: some ideas for how to make progress (riceissa, Jul 8, 2022, 1:29 AM). 21 points, 1 comment, 8 min read.
• A distillation of Evan Hubinger’s training stories (for SERI MATS) (Daphne_W, Jul 18, 2022, 3:38 AM). 15 points, 1 comment, 10 min read.
• Pitfalls with Proofs (scasper, Jul 19, 2022, 10:21 PM). 19 points, 21 comments, 8 min read.
• Distillation Contest—Results and Recap (Aris, Jul 29, 2022, 5:40 PM). 34 points, 0 comments, 7 min read.
• [Question] Which intro-to-AI-risk text would you recommend to... (Sherrinford, Aug 1, 2022, 9:36 AM). 12 points, 1 comment, 1 min read.
• Seeking PCK (Pedagogical Content Knowledge) (CFAR!Duncan, Aug 12, 2022, 4:15 AM). 62 points, 11 comments, 5 min read.
• AI alignment as “navigating the space of intelligent behaviour” (Nora_Ammann, Aug 23, 2022, 1:28 PM). 18 points, 0 comments, 6 min read.
• Alignment is hard. Communicating that, might be harder (Eleni Angelou, Sep 1, 2022, 4:57 PM). 7 points, 8 comments, 3 min read.

• How To Know What the AI Knows—An ELK Distillation (Fabien Roger, Sep 4, 2022, 12:46 AM). 7 points, 0 comments, 5 min read.
• Summaries: Alignment Fundamentals Curriculum (Leon Lang, Sep 18, 2022, 1:08 PM). 44 points, 3 comments, 1 min read. (docs.google.com)
• Power-Seeking AI and Existential Risk (Antonio Franca, Oct 11, 2022, 10:50 PM). 6 points, 0 comments, 9 min read.
• Real-Time Research Recording: Can a Transformer Re-Derive Positional Info? (Neel Nanda, Nov 1, 2022, 11:56 PM). 69 points, 16 comments, 1 min read. (youtu.be)
• Distillation Experiment: Chunk-Knitting (DirectedEvolution, Nov 7, 2022, 7:56 PM). 10 points, 3 comments, 6 min read.
• The No Free Lunch theorem for dummies (Steven Byrnes, Dec 5, 2022, 9:46 PM). 37 points, 16 comments, 3 min read.
• Shard Theory in Nine Theses: a Distillation and Critical Appraisal (LawrenceC, Dec 19, 2022, 10:52 PM). 150 points, 30 comments, 18 min read.
• Summary of 80k’s AI problem profile (JakubK, Jan 1, 2023, 7:30 AM). 7 points, 0 comments, 5 min read. (forum.effectivealtruism.org)
• Induction heads—illustrated (CallumMcDougall, Jan 2, 2023, 3:35 PM). 128 points, 12 comments, 3 min read.
• AI Safety Info Distillation Fellowship (Feb 17, 2023, 4:16 PM). 47 points, 3 comments, 3 min read.
• [Question] Is “Strong Coherence” Anti-Natural? (DragonGod, Apr 11, 2023, 6:22 AM). 23 points, 25 comments, 2 min read.
• Advice for time management as a manager (benkuhn, Apr 2, 2025, 4:00 AM). 16 points, 1 comment, 5 min read. (www.benkuhn.net)
• AI Safety - 7 months of discussion in 17 minutes (Zoe Williams, Mar 15, 2023, 11:41 PM). 25 points, 0 comments, 1 min read.
• An Elementary Introduction to Infra-Bayesianism (CharlesRW, Sep 20, 2023, 2:29 PM). 16 points, 0 comments, 1 min read.
• Great Explanations (lukeprog, Oct 31, 2011, 11:58 PM). 34 points, 115 comments, 2 min read.
• A LessWrong “rationality workbook” idea (jwhendy, Jan 9, 2011, 5:52 PM). 26 points, 26 comments, 3 min read.
• Debugging the student (Adam Zerner, Dec 16, 2020, 7:07 AM). 46 points, 7 comments, 4 min read.

• AI Safety 101: Reward Misspecification (markov, Oct 18, 2023, 8:39 PM). 32 points, 4 comments, 31 min read.
• Failure Modes of Teaching AI Safety (Eleni Angelou, Jun 25, 2024, 7:07 PM). 20 points, 0 comments, 1 min read.
• Retrospective on Teaching Rationality Workshops (Neel Nanda, Jan 3, 2021, 5:15 PM). 66 points, 2 comments, 31 min read.
• [Question] What currents of thought on LessWrong do you want to see distilled? (ryan_b, Jan 8, 2021, 9:43 PM). 48 points, 19 comments, 1 min read.
• The Benefits of Distillation in Research (Jonas Hallgren, Mar 4, 2023, 5:45 PM). 15 points, 2 comments, 5 min read.
• A Pedagogical Guide to Corrigibility (A.H., Jan 17, 2024, 11:45 AM). 6 points, 3 comments, 16 min read.
• Abram Demski’s ELK thoughts and proposal—distillation (Rubi J. Hudson, Jul 19, 2022, 6:57 AM). 19 points, 8 comments, 16 min read.
• Distillation of “How Likely Is Deceptive Alignment?” (NickGabs, Nov 18, 2022, 4:31 PM). 24 points, 4 comments, 10 min read.
• AI Safety Cheatsheet / Quick Reference (Zohar Jackson, Jul 20, 2022, 9:39 AM). 3 points, 0 comments, 1 min read. (github.com)
• AI Safety 101: Capabilities—Human Level AI, What? How? and When? (Mar 7, 2024, 5:29 PM). 46 points, 8 comments, 54 min read.
• MIRI’s “Death with Dignity” in 60 seconds. (Cleo Nardo, Dec 6, 2022, 5:18 PM). 58 points, 4 comments, 1 min read.
• An Apprentice Experiment in Python Programming, Part 3 (Aug 16, 2021, 4:42 AM). 14 points, 10 comments, 22 min read.
• Dreams of “Mathopedia” (Nicholas / Heather Kross, Jun 2, 2023, 1:30 AM). 40 points, 16 comments, 2 min read. (www.thinkingmuchbetter.com)
• Nothing Is Ever Taught Correctly (LVSN, Feb 20, 2023, 10:31 PM). 5 points, 3 comments, 1 min read.
• An Illustrated Summary of “Robust Agents Learn Causal World Model” (Dalcy, Dec 14, 2024, 3:02 PM). 66 points, 2 comments, 10 min read.
• Distilling and approaches to the determinant (AprilSR, Apr 6, 2022, 6:34 AM). 6 points, 0 comments, 6 min read.
• Announcing the Distillation for Alignment Practicum (DAP) (Aug 18, 2022, 7:50 PM). 23 points, 3 comments, 3 min read.
• Epistemic Artefacts of (conceptual) AI alignment research (Aug 19, 2022, 5:18 PM). 31 points, 1 comment, 5 min read.
• Observations on Teaching for Four Weeks (ClareChiaraVincent, May 6, 2024, 4:55 PM). 50 points, 14 comments, 3 min read.

• Models Don’t “Get Reward” (Sam Ringer, Dec 30, 2022, 10:37 AM). 316 points, 62 comments, 5 min read, 1 review.
• [Appendix] Natural Abstractions: Key Claims, Theorems, and Critiques (Mar 16, 2023, 4:38 PM). 48 points, 0 comments, 13 min read.
• Deep Q-Networks Explained (Jay Bailey, Sep 13, 2022, 12:01 PM). 58 points, 8 comments, 20 min read.
• The AI Control Problem in a wider intellectual context (philosophybear, Jan 13, 2023, 12:28 AM). 11 points, 3 comments, 12 min read.
• Deriving Conditional Expected Utility from Pareto-Efficient Decisions (Thomas Kwa, May 5, 2022, 3:21 AM). 24 points, 1 comment, 6 min read.
• How RL Agents Behave When Their Actions Are Modified? [Distillation post] (PabloAMC, May 20, 2022, 6:47 PM). 22 points, 0 comments, 8 min read.
• Understanding Infra-Bayesianism: A Beginner-Friendly Video Series (Sep 22, 2022, 1:25 PM). 140 points, 6 comments, 2 min read.
• Universality Unwrapped (adamShimi, Aug 21, 2020, 6:53 PM). 29 points, 2 comments, 18 min read.
• How I got so excited about HowTruthful (Bruce Lewis, Nov 9, 2023, 6:49 PM). 17 points, 3 comments, 5 min read.
• Imitative Generalisation (AKA ‘Learning the Prior’) (Beth Barnes, Jan 10, 2021, 12:30 AM). 107 points, 15 comments, 11 min read, 1 review.
• Does SGD Produce Deceptive Alignment? (Mark Xu, Nov 6, 2020, 11:48 PM). 96 points, 9 comments, 16 min read.
• Understanding Benchmarks and motivating Evaluations (Feb 6, 2025, 1:32 AM). 9 points, 0 comments, 11 min read. (ai-safety-atlas.com)
• Does anyone use advanced media projects? (ryan_b, Jun 20, 2018, 11:33 PM). 33 points, 5 comments, 1 min read.
• Teaching the Unteachable (Eliezer Yudkowsky, Mar 3, 2009, 11:14 PM). 55 points, 18 comments, 6 min read.
• The Fundamental Question—Rationality computer game design (Kaj_Sotala, Feb 13, 2013, 1:45 PM). 61 points, 68 comments, 9 min read.
• Concrete Methods for Heuristic Estimation on Neural Networks (Oliver Daniels, Nov 14, 2024, 5:07 AM). 28 points, 0 comments, 27 min read.
• Zetetic explanation (Benquo, Aug 27, 2018, 12:12 AM). 95 points, 138 comments, 6 min read. (benjaminrosshoffman.com)
• Paternal Formats (abramdemski, Jun 9, 2019, 1:26 AM). 51 points, 35 comments, 2 min read.
• The Stanley Parable: Making philosophy fun (Nathan1123, May 22, 2023, 2:15 AM). 6 points, 3 comments, 3 min read.
• Explaining inner alignment to myself (Jeremy Gillen, May 24, 2022, 11:10 PM). 9 points, 2 comments, 10 min read.
• Teachable Rationality Skills (Eliezer Yudkowsky, May 27, 2011, 9:57 PM). 74 points, 263 comments, 1 min read.
• Five-minute rationality techniques (sketerpot, Aug 10, 2010, 2:24 AM). 72 points, 237 comments, 2 min read.

• Just One Sentence (Eliezer Yudkowsky, Jan 5, 2013, 1:27 AM). 97 points, 143 comments, 1 min read.
• Croesus, Cerberus, and the magpies: a gentle introduction to Eliciting Latent Knowledge (Alexandre Variengien, May 27, 2022, 5:58 PM). 17 points, 0 comments, 16 min read.
• Media bias (PhilGoetz, Jul 5, 2009, 4:54 PM). 39 points, 47 comments, 1 min read.
• The RAIN Framework for Informational Effectiveness (ozziegooen, Feb 13, 2019, 12:54 PM). 37 points, 16 comments, 6 min read.
• The Up-Goer Five Game: Explaining hard ideas with simple words (Rob Bensinger, Sep 5, 2013, 5:54 AM). 44 points, 82 comments, 2 min read.
• Tutor-GPT & Pedagogical Reasoning (courtlandleer, Jun 5, 2023, 5:53 PM). 26 points, 3 comments, 4 min read.
• A comparison of causal scrubbing, causal abstractions, and related methods (Jun 8, 2023, 11:40 PM). 73 points, 3 comments, 22 min read.
• Rationality Games & Apps Brainstorming (lukeprog, Jul 9, 2012, 3:04 AM). 42 points, 59 comments, 2 min read.
• Distillation Of DeepSeek-Prover V1.5 (IvanLin, Oct 15, 2024, 6:53 PM). 4 points, 1 comment, 3 min read.
• [Question] Is Local Order a Clue to Universal Entropy? How a Failed Professor Searches for a ‘Sacred Motivational Order’ (P. João, Apr 12, 2025, 1:39 PM). 2 points, 2 comments, 2 min read.
• Deconfusing Landauer’s Principle (EuanMcLean, May 27, 2022, 5:58 PM). 58 points, 15 comments, 15 min read.
• How not to be a Naïve Computationalist (diegocaleiro, Apr 13, 2011, 7:45 PM). 39 points, 36 comments, 2 min read.
• Proof Explained for “Robust Agents Learn Causal World Model” (Dalcy, Dec 22, 2024, 3:06 PM). 25 points, 0 comments, 15 min read.

• Dense Math Notation (JK_Ravenclaw, Apr 1, 2011, 3:37 AM). 33 points, 23 comments, 1 min read.
• Understanding Selection Theorems (adamk, May 28, 2022, 1:49 AM). 41 points, 3 comments, 7 min read.
• AIS 101: Task decomposition for scalable oversight (Charbel-Raphaël, Jul 25, 2023, 1:34 PM). 27 points, 0 comments, 19 min read. (docs.google.com)
• Video Intro to Guaranteed Safe AI (Jul 11, 2024, 5:53 PM). 27 points, 0 comments, 1 min read. (youtu.be)
• Numeracy neglect—A personal postmortem (vlad.proex, Sep 27, 2020, 3:12 PM). 81 points, 29 comments, 9 min read.
• Paper digestion: “May We Have Your Attention Please? Human-Rights NGOs and the Problem of Global Communication” (Klara Helene Nielsen, Jul 20, 2023, 5:08 PM). 4 points, 1 comment, 2 min read. (journals.sagepub.com)
• Subdivisions for Useful Distillations? (Sharat Jacob Jacob, Jul 24, 2023, 6:55 PM). 9 points, 2 comments, 2 min read.
• DIY RLHF: A simple implementation for hands on experience (Jul 10, 2024, 12:07 PM). 28 points, 0 comments, 6 min read.
• Stampy’s AI Safety Info—New Distillations #4 [July 2023] (markov, Aug 16, 2023, 7:03 PM). 22 points, 10 comments, 1 min read. (aisafety.info)
• [Question] What AI Posts Do You Want Distilled? (brook, Aug 25, 2023, 9:01 AM). 11 points, 2 comments, 1 min read. (forum.effectivealtruism.org)
• Jan Kulveit’s Corrigibility Thoughts Distilled (brook, Aug 20, 2023, 5:52 PM). 22 points, 1 comment, 5 min read.
• Mesa-Optimization: Explain it like I’m 10 Edition (brook, Aug 26, 2023, 11:04 PM). 20 points, 1 comment, 6 min read.
• Moved from Moloch’s Toolbox: Discussion re style of latest Eliezer sequence (habryka, Nov 5, 2017, 2:22 AM). 7 points, 2 comments, 3 min read.
• Distillation of ‘Do language models plan for future tokens’ (TheManxLoiner, Jun 27, 2024, 8:57 PM). 26 points, 2 comments, 6 min read.
• Short Primers on Crucial Topics (lukeprog, May 31, 2012, 12:46 AM). 35 points, 24 comments, 1 min read.
• Distilled—AGI Safety from First Principles (Harrison G, May 29, 2022, 12:57 AM). 11 points, 1 comment, 14 min read.
• Graphical tensor notation for interpretability (Jordan Taylor, Oct 4, 2023, 8:04 AM). 141 points, 11 comments, 19 min read.