
Distillation & Pedagogy

Last edit: Aug 21, 2020, 10:31 PM by Raemon

Distillation is the process of taking a complex subject and making it easier to understand. Pedagogy is the method and practice of teaching. A good intellectual pipeline requires not just discovering new ideas, but also making it easier for newcomers to learn them, stand on the shoulders of giants, and discover even more ideas.

Chris Olah, founder of distill.pub, writes in his essay Research Debt:

Programmers talk about technical debt: there are ways to write software that are faster in the short run but problematic in the long run. Managers talk about institutional debt: institutions can grow quickly at the cost of bad practices creeping in. Both are easy to accumulate but hard to get rid of.

Research can also have debt. It comes in several forms:

  • Poor Exposition – Often, there is no good explanation of important ideas and one has to struggle to understand them. This problem is so pervasive that we take it for granted and don’t appreciate how much better things could be.

  • Undigested Ideas – Most ideas start off rough and hard to understand. They become radically easier as we polish them, developing the right analogies, language, and ways of thinking.

  • Bad abstractions and notation – Abstractions and notation are the user interface of research, shaping how we think and communicate. Unfortunately, we often get stuck with the first formalisms to develop even when they’re bad. For example, an object with extra electrons is negative, and pi is wrong.

  • Noise – Being a researcher is like standing in the middle of a construction site. Countless papers scream for your attention and there’s no easy way to filter or summarize them. Because most work is explained poorly, it takes a lot of energy to understand each piece of work. For many papers, one wants a simple one sentence explanation of it, but needs to fight with it to get that sentence. Because the simplest way to get the attention of interested parties is to get everyone’s attention, we get flooded with work. Because we incentivize people being “prolific,” we get flooded with a lot of work… We think noise is the main way experts experience research debt.

The insidious thing about research debt is that it’s normal. Everyone takes it for granted, and doesn’t realize that things could be different. For example, it’s normal to give very mediocre explanations of research, and people perceive that to be the ceiling of explanation quality. On the rare occasions that truly excellent explanations come along, people see them as one-off miracles rather than a sign that we could systematically be doing better.

See also Scholarship and Learning, and Good Explanations.

• How to teach things well (Neel Nanda, Aug 28, 2020, 4:44 PM). 110 points, 17 comments, 15 min read, 1 review. (www.neelnanda.io)
• Research Debt (Elizabeth, Jul 15, 2018, 7:36 PM). 25 points, 2 comments, 1 min read. (distill.pub)
• Ironing Out the Squiggles (Zack_M_Davis, Apr 29, 2024, 4:13 PM). 157 points, 36 comments, 11 min read.
• [Question] What are Examples of Great Distillers? (adamShimi, Nov 12, 2020, 2:09 PM). 35 points, 12 comments, 1 min read.
• Call For Distillers (johnswentworth, Apr 4, 2022, 6:25 PM). 207 points, 43 comments, 3 min read, 1 review.
• Learning how to learn (Neel Nanda, Sep 30, 2020, 4:50 PM). 44 points, 0 comments, 15 min read. (www.neelnanda.io)
• Explainers Shoot High. Aim Low! (Eliezer Yudkowsky, Oct 24, 2007, 1:13 AM). 101 points, 35 comments, 1 min read.
• Abstracts should be either Actually Short™, or broken into paragraphs (Raemon, Mar 24, 2023, 12:51 AM). 93 points, 27 comments, 5 min read.
• The Cave Allegory Revisited: Understanding GPT’s Worldview (Jan_Kulveit, Feb 14, 2023, 4:00 PM). 86 points, 5 comments, 3 min read.
• Infra-Bayesianism Unwrapped (adamShimi, Jan 20, 2021, 1:35 PM). 58 points, 0 comments, 24 min read.
• Davidad’s Bold Plan for Alignment: An In-Depth Explanation (Apr 19, 2023, 4:09 PM). 168 points, 40 comments, 21 min read, 2 reviews.
• The 101 Space You Will Always Have With You (Screwtape, Nov 29, 2023, 4:56 AM). 271 points, 22 comments, 6 min read, 1 review.
• DARPA Digital Tutor: Four Months to Total Technical Expertise? (SebastianG, Jul 6, 2020, 11:34 PM). 220 points, 22 comments, 7 min read.
• Stampy’s AI Safety Info—New Distillations #3 [May 2023] (markov, Jun 6, 2023, 2:18 PM). 16 points, 0 comments, 2 min read. (aisafety.info)
• Stampy’s AI Safety Info—New Distillations #2 [April 2023] (markov, May 9, 2023, 1:31 PM). 25 points, 1 comment, 1 min read. (aisafety.info)
• TAPs for Tutoring (Mark Xu, Dec 24, 2020, 8:46 PM). 27 points, 3 comments, 5 min read.
• Stampy’s AI Safety Info—New Distillations #1 [March 2023] (markov, Apr 7, 2023, 11:06 AM). 42 points, 0 comments, 2 min read. (aisafety.info)
• (Summary) Sequence Highlights—Thinking Better on Purpose (qazzquimby, Aug 2, 2022, 5:45 PM). 33 points, 3 comments, 11 min read.
• Features that make a report especially helpful to me (lukeprog, Apr 14, 2022, 1:12 AM). 40 points, 0 comments, 2 min read.
• Expansive translations: considerations and possibilities (ozziegooen, Sep 18, 2020, 3:39 PM). 43 points, 15 comments, 6 min read.
• Natural Abstractions: Key claims, Theorems, and Critiques (Mar 16, 2023, 4:37 PM). 241 points, 23 comments, 45 min read, 3 reviews.
• A concise sum-up of the basic argument for AI doom (Mergimio H. Doefevmil, Apr 24, 2023, 5:37 PM). 11 points, 6 comments, 2 min read.
• Learning-theoretic agenda reading list (Vanessa Kosoy, Nov 9, 2023, 5:25 PM). 103 points, 1 comment, 2 min read, 1 review.
• Getting rational now or later: navigating procrastination and time-inconsistent preferences for new rationalists (milo_thoughts, Feb 26, 2024, 7:38 PM). 1 point, 0 comments, 8 min read.
• Learning Math in Time for Alignment (Nicholas / Heather Kross, Jan 9, 2024, 1:02 AM). 32 points, 5 comments, 3 min read.
• Uncertainty in all its flavours (Cleo Nardo, Jan 9, 2024, 4:21 PM). 34 points, 6 comments, 35 min read.
• Explaining Impact Markets (Saul Munn, Jan 31, 2024, 9:51 AM). 95 points, 2 comments, 3 min read. (www.brasstacks.blog)
• “Deep Learning” Is Function Approximation (Zack_M_Davis, Mar 21, 2024, 5:50 PM). 98 points, 28 comments, 10 min read. (zackmdavis.net)
• What does Yann LeCun think about AGI? A summary of his talk, “Mathematical Obstacles on the Way to Human-Level AI” (Adam Jones, Apr 5, 2025, 12:21 PM). 11 points, 0 comments, 2 min read.
• Superposition is not “just” neuron polysemanticity (LawrenceC, Apr 26, 2024, 11:22 PM). 66 points, 4 comments, 13 min read.

• AI Safety Strategies Landscape (Charbel-Raphaël, May 9, 2024, 5:33 PM). 34 points, 1 comment, 42 min read.
• How ARENA course material gets made (CallumMcDougall, Jul 2, 2024, 6:04 PM). 41 points, 2 comments, 7 min read.
• An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2 (Neel Nanda, Jul 7, 2024, 5:39 PM). 135 points, 16 comments, 25 min read.
• Dialogue introduction to Singular Learning Theory (Olli Järviniemi, Jul 8, 2024, 4:58 PM). 100 points, 15 comments, 8 min read.
• Poker is a bad game for teaching epistemics. Figgie is a better one. (rossry, Jul 8, 2024, 6:05 AM). 106 points, 47 comments, 11 min read. (blog.rossry.net)
• Podcast: “How the Smart Money teaches trading with Ricki Heicklen” (Patrick McKenzie interviewing) (rossry, Jul 11, 2024, 10:49 PM). 20 points, 2 comments, 1 min read. (www.complexsystemspodcast.com)
• Insights from Euclid’s ‘Elements’ (TurnTrout, May 4, 2020, 3:45 PM). 126 points, 17 comments, 4 min read.
• Graceful Degradation (Screwtape, Nov 5, 2024, 11:57 PM). 83 points, 8 comments, 4 min read.
• The Solomonoff Prior is Malign (Mark Xu, Oct 14, 2020, 1:33 AM). 179 points, 52 comments, 16 min read, 3 reviews.
• Habermas Machine (NicholasKees, Mar 13, 2025, 6:16 PM). 47 points, 7 comments, 6 min read. (mosaic-labs.org)
• Socially Graceful Degradation (Screwtape, Mar 20, 2025, 4:03 AM). 57 points, 9 comments, 9 min read.
• Beren’s “Deconfusing Direct vs Amortised Optimisation” (DragonGod, Apr 7, 2023, 8:57 AM). 52 points, 10 comments, 3 min read.
• But why would the AI kill us? (So8res, Apr 17, 2023, 6:42 PM). 138 points, 96 comments, 2 min read.
• AGI ruin mostly rests on strong claims about alignment and deployment, not about society (Rob Bensinger, Apr 24, 2023, 1:06 PM). 70 points, 8 comments, 6 min read.
• Quick thoughts on the difficulty of widely conveying a non-stereotyped position (Sniffnoy, Mar 27, 2025, 7:30 AM). 12 points, 0 comments, 5 min read.
• Hiatus: EA and LW post summaries (Zoe Williams, May 17, 2023, 5:17 PM). 14 points, 0 comments, 1 min read.
• Expertise and advice (John_Maxwell, May 27, 2012, 1:49 AM). 25 points, 4 comments, 1 min read.

• Cheat sheet of AI X-risk (momom2, Jun 29, 2023, 4:28 AM). 19 points, 1 comment, 7 min read.
• Rationality, Pedagogy, and “Vibes”: Quick Thoughts (Nicholas / Heather Kross, Jul 15, 2023, 2:09 AM). 14 points, 1 comment, 4 min read.
• Announcing AISafety.info’s Write-a-thon (June 16-18) and Second Distillation Fellowship (July 3-October 2) (steven0461, Jun 3, 2023, 2:03 AM). 33 points, 1 comment, 2 min read.
• Join AISafety.info’s Distillation Hackathon (Oct 6-9th) (smallsilo, Oct 1, 2023, 6:43 PM). 21 points, 0 comments, 2 min read. (forum.effectivealtruism.org)
• Avoid Unnecessarily Political Examples (Raemon, Jan 11, 2021, 5:41 AM). 106 points, 42 comments, 3 min read.
• Discovery fiction for the Pythagorean theorem (riceissa, Jan 19, 2021, 2:09 AM). 16 points, 1 comment, 4 min read.
• Inversion of theorems into definitions when generalizing (riceissa, Aug 4, 2019, 5:44 PM). 25 points, 3 comments, 5 min read.
• Think like an educator about code quality (Adam Zerner, Mar 27, 2021, 5:43 AM). 44 points, 8 comments, 9 min read.
• 99% shorter (philh, May 27, 2021, 7:50 PM). 16 points, 0 comments, 6 min read. (reasonableapproximation.net)
• An Apprentice Experiment in Python Programming (Jul 4, 2021, 3:29 AM). 67 points, 4 comments, 9 min read.
• An Apprentice Experiment in Python Programming, Part 2 (Jul 29, 2021, 7:39 AM). 30 points, 18 comments, 10 min read.
• Calibration proverbs (Malmesbury, Jan 11, 2022, 5:11 AM). 76 points, 19 comments, 1 min read.
• [Closed] Job Offering: Help Communicate Infrabayesianism (Mar 23, 2022, 6:35 PM). 129 points, 22 comments, 1 min read.
• Summary: “How to Write Quickly...” by John Wentworth (Pablo Repetto, Apr 11, 2022, 11:26 PM). 4 points, 0 comments, 2 min read. (pabloernesto.github.io)
• [Question] What to include in a guest lecture on existential risks from AI? (Aryeh Englander, Apr 13, 2022, 5:03 PM). 20 points, 9 comments, 1 min read.
• Rationality Dojo (lsusr, Apr 24, 2022, 12:53 AM). 14 points, 5 comments, 1 min read.
• Calling for Student Submissions: AI Safety Distillation Contest (Aris, Apr 24, 2022, 1:53 AM). 48 points, 15 comments, 4 min read.
• Infra-Bayesianism Distillation: Realizability and Decision Theory (Thomas Larsen, May 26, 2022, 9:57 PM). 40 points, 9 comments, 18 min read.
• [Request for Distillation] Coherence of Distributed Decisions With Different Inputs Implies Conditioning (johnswentworth, Apr 25, 2022, 5:01 PM). 22 points, 14 comments, 2 min read.
• How to get people to produce more great exposition? Some strategies and their assumptions (riceissa, May 25, 2022, 10:30 PM). 26 points, 10 comments, 3 min read.
• Exposition as science: some ideas for how to make progress (riceissa, Jul 8, 2022, 1:29 AM). 21 points, 1 comment, 8 min read.
• A distillation of Evan Hubinger’s training stories (for SERI MATS) (Daphne_W, Jul 18, 2022, 3:38 AM). 15 points, 1 comment, 10 min read.
• Pitfalls with Proofs (scasper, Jul 19, 2022, 10:21 PM). 19 points, 21 comments, 8 min read.
• Distillation Contest—Results and Recap (Aris, Jul 29, 2022, 5:40 PM). 34 points, 0 comments, 7 min read.
• [Question] Which intro-to-AI-risk text would you recommend to... (Sherrinford, Aug 1, 2022, 9:36 AM). 12 points, 1 comment, 1 min read.
• Seeking PCK (Pedagogical Content Knowledge) (CFAR!Duncan, Aug 12, 2022, 4:15 AM). 62 points, 11 comments, 5 min read.
• AI alignment as “navigating the space of intelligent behaviour” (Nora_Ammann, Aug 23, 2022, 1:28 PM). 18 points, 0 comments, 6 min read.
• Alignment is hard. Communicating that, might be harder (Eleni Angelou, Sep 1, 2022, 4:57 PM). 7 points, 8 comments, 3 min read.

• How To Know What the AI Knows—An ELK Distillation (Fabien Roger, Sep 4, 2022, 12:46 AM). 7 points, 0 comments, 5 min read.
• Summaries: Alignment Fundamentals Curriculum (Leon Lang, Sep 18, 2022, 1:08 PM). 44 points, 3 comments, 1 min read. (docs.google.com)
• Power-Seeking AI and Existential Risk (Antonio Franca, Oct 11, 2022, 10:50 PM). 6 points, 0 comments, 9 min read.
• Real-Time Research Recording: Can a Transformer Re-Derive Positional Info? (Neel Nanda, Nov 1, 2022, 11:56 PM). 69 points, 16 comments, 1 min read. (youtu.be)
• Distillation Experiment: Chunk-Knitting (DirectedEvolution, Nov 7, 2022, 7:56 PM). 10 points, 3 comments, 6 min read.
• The No Free Lunch theorem for dummies (Steven Byrnes, Dec 5, 2022, 9:46 PM). 37 points, 16 comments, 3 min read.
• Shard Theory in Nine Theses: a Distillation and Critical Appraisal (LawrenceC, Dec 19, 2022, 10:52 PM). 150 points, 30 comments, 18 min read.
• Summary of 80k’s AI problem profile (JakubK, Jan 1, 2023, 7:30 AM). 7 points, 0 comments, 5 min read. (forum.effectivealtruism.org)
• Induction heads—illustrated (CallumMcDougall, Jan 2, 2023, 3:35 PM). 128 points, 12 comments, 3 min read.
• AI Safety Info Distillation Fellowship (Feb 17, 2023, 4:16 PM). 47 points, 3 comments, 3 min read.
• [Question] Is “Strong Coherence” Anti-Natural? (DragonGod, Apr 11, 2023, 6:22 AM). 23 points, 25 comments, 2 min read.
• Advice for time management as a manager (benkuhn, Apr 2, 2025, 4:00 AM). 16 points, 1 comment, 5 min read. (www.benkuhn.net)
• AI Safety - 7 months of discussion in 17 minutes (Zoe Williams, Mar 15, 2023, 11:41 PM). 25 points, 0 comments, 1 min read.
• An Elementary Introduction to Infra-Bayesianism (CharlesRW, Sep 20, 2023, 2:29 PM). 16 points, 0 comments, 1 min read.
• Great Explanations (lukeprog, Oct 31, 2011, 11:58 PM). 34 points, 115 comments, 2 min read.
• A LessWrong “rationality workbook” idea (jwhendy, Jan 9, 2011, 5:52 PM). 26 points, 26 comments, 3 min read.
• Debugging the student (Adam Zerner, Dec 16, 2020, 7:07 AM). 46 points, 7 comments, 4 min read.

• AI Safety 101: Reward Misspecification (markov, Oct 18, 2023, 8:39 PM). 32 points, 4 comments, 31 min read.
• Failure Modes of Teaching AI Safety (Eleni Angelou, Jun 25, 2024, 7:07 PM). 20 points, 0 comments, 1 min read.
• Retrospective on Teaching Rationality Workshops (Neel Nanda, Jan 3, 2021, 5:15 PM). 66 points, 2 comments, 31 min read.
• [Question] What currents of thought on LessWrong do you want to see distilled? (ryan_b, Jan 8, 2021, 9:43 PM). 48 points, 19 comments, 1 min read.
• The Benefits of Distillation in Research (Jonas Hallgren, Mar 4, 2023, 5:45 PM). 15 points, 2 comments, 5 min read.
• A Pedagogical Guide to Corrigibility (A.H., Jan 17, 2024, 11:45 AM). 6 points, 3 comments, 16 min read.
• Abram Demski’s ELK thoughts and proposal—distillation (Rubi J. Hudson, Jul 19, 2022, 6:57 AM). 19 points, 8 comments, 16 min read.
• Distillation of “How Likely Is Deceptive Alignment?” (NickGabs, Nov 18, 2022, 4:31 PM). 24 points, 4 comments, 10 min read.
• AI Safety Cheatsheet / Quick Reference (Zohar Jackson, Jul 20, 2022, 9:39 AM). 3 points, 0 comments, 1 min read. (github.com)
• AI Safety 101: Capabilities—Human Level AI, What? How? and When? (Mar 7, 2024, 5:29 PM). 46 points, 8 comments, 54 min read.
• MIRI’s “Death with Dignity” in 60 seconds. (Cleo Nardo, Dec 6, 2022, 5:18 PM). 58 points, 4 comments, 1 min read.
• An Apprentice Experiment in Python Programming, Part 3 (Aug 16, 2021, 4:42 AM). 14 points, 10 comments, 22 min read.
• Dreams of “Mathopedia” (Nicholas / Heather Kross, Jun 2, 2023, 1:30 AM). 40 points, 16 comments, 2 min read. (www.thinkingmuchbetter.com)
• Nothing Is Ever Taught Correctly (LVSN, Feb 20, 2023, 10:31 PM). 5 points, 3 comments, 1 min read.
• An Illustrated Summary of “Robust Agents Learn Causal World Model” (Dalcy, Dec 14, 2024, 3:02 PM). 66 points, 2 comments, 10 min read.
• Distilling and approaches to the determinant (AprilSR, Apr 6, 2022, 6:34 AM). 6 points, 0 comments, 6 min read.
• Announcing the Distillation for Alignment Practicum (DAP) (Aug 18, 2022, 7:50 PM). 23 points, 3 comments, 3 min read.
• Epistemic Artefacts of (conceptual) AI alignment research (Aug 19, 2022, 5:18 PM). 31 points, 1 comment, 5 min read.
• Observations on Teaching for Four Weeks (ClareChiaraVincent, May 6, 2024, 4:55 PM). 50 points, 14 comments, 3 min read.

• Models Don’t “Get Reward” (Sam Ringer, Dec 30, 2022, 10:37 AM). 316 points, 62 comments, 5 min read, 1 review.
• [Appendix] Natural Abstractions: Key Claims, Theorems, and Critiques (Mar 16, 2023, 4:38 PM). 48 points, 0 comments, 13 min read.
• Deep Q-Networks Explained (Jay Bailey, Sep 13, 2022, 12:01 PM). 58 points, 8 comments, 20 min read.
• The AI Control Problem in a wider intellectual context (philosophybear, Jan 13, 2023, 12:28 AM). 11 points, 3 comments, 12 min read.
• Deriving Conditional Expected Utility from Pareto-Efficient Decisions (Thomas Kwa, May 5, 2022, 3:21 AM). 24 points, 1 comment, 6 min read.
• How RL Agents Behave When Their Actions Are Modified? [Distillation post] (PabloAMC, May 20, 2022, 6:47 PM). 22 points, 0 comments, 8 min read.
• Understanding Infra-Bayesianism: A Beginner-Friendly Video Series (Sep 22, 2022, 1:25 PM). 140 points, 6 comments, 2 min read.
• Universality Unwrapped (adamShimi, Aug 21, 2020, 6:53 PM). 29 points, 2 comments, 18 min read.
• How I got so excited about HowTruthful (Bruce Lewis, Nov 9, 2023, 6:49 PM). 17 points, 3 comments, 5 min read.
• Imitative Generalisation (AKA ‘Learning the Prior’) (Beth Barnes, Jan 10, 2021, 12:30 AM). 107 points, 15 comments, 11 min read, 1 review.
• Does SGD Produce Deceptive Alignment? (Mark Xu, Nov 6, 2020, 11:48 PM). 96 points, 9 comments, 16 min read.
• Understanding Benchmarks and motivating Evaluations (Feb 6, 2025, 1:32 AM). 9 points, 0 comments, 11 min read. (ai-safety-atlas.com)
• Does anyone use advanced media projects? (ryan_b, Jun 20, 2018, 11:33 PM). 33 points, 5 comments, 1 min read.
• Teaching the Unteachable (Eliezer Yudkowsky, Mar 3, 2009, 11:14 PM). 55 points, 18 comments, 6 min read.
• The Fundamental Question—Rationality computer game design (Kaj_Sotala, Feb 13, 2013, 1:45 PM). 61 points, 68 comments, 9 min read.
• Concrete Methods for Heuristic Estimation on Neural Networks (Oliver Daniels, Nov 14, 2024, 5:07 AM). 28 points, 0 comments, 27 min read.
• Zetetic explanation (Benquo, Aug 27, 2018, 12:12 AM). 95 points, 138 comments, 6 min read. (benjaminrosshoffman.com)
• Paternal Formats (abramdemski, Jun 9, 2019, 1:26 AM). 51 points, 35 comments, 2 min read.
• The Stanley Parable: Making philosophy fun (Nathan1123, May 22, 2023, 2:15 AM). 6 points, 3 comments, 3 min read.
• Explaining inner alignment to myself (Jeremy Gillen, May 24, 2022, 11:10 PM). 9 points, 2 comments, 10 min read.
• Teachable Rationality Skills (Eliezer Yudkowsky, May 27, 2011, 9:57 PM). 74 points, 263 comments, 1 min read.
• Five-minute rationality techniques (sketerpot, Aug 10, 2010, 2:24 AM). 72 points, 237 comments, 2 min read.

• Just One Sentence (Eliezer Yudkowsky, Jan 5, 2013, 1:27 AM). 97 points, 143 comments, 1 min read.
• Croesus, Cerberus, and the magpies: a gentle introduction to Eliciting Latent Knowledge (Alexandre Variengien, May 27, 2022, 5:58 PM). 17 points, 0 comments, 16 min read.
• Media bias (PhilGoetz, Jul 5, 2009, 4:54 PM). 39 points, 47 comments, 1 min read.
• The RAIN Framework for Informational Effectiveness (ozziegooen, Feb 13, 2019, 12:54 PM). 37 points, 16 comments, 6 min read.
• The Up-Goer Five Game: Explaining hard ideas with simple words (Rob Bensinger, Sep 5, 2013, 5:54 AM). 44 points, 82 comments, 2 min read.
• Tutor-GPT & Pedagogical Reasoning (courtlandleer, Jun 5, 2023, 5:53 PM). 26 points, 3 comments, 4 min read.
• A comparison of causal scrubbing, causal abstractions, and related methods (Jun 8, 2023, 11:40 PM). 73 points, 3 comments, 22 min read.
• Rationality Games & Apps Brainstorming (lukeprog, Jul 9, 2012, 3:04 AM). 42 points, 59 comments, 2 min read.
• Distillation Of DeepSeek-Prover V1.5 (IvanLin, Oct 15, 2024, 6:53 PM). 4 points, 1 comment, 3 min read.
• [Question] Is Local Order a Clue to Universal Entropy? How a Failed Professor Searches for a ‘Sacred Motivational Order’ (P. João, Apr 12, 2025, 1:39 PM). 2 points, 2 comments, 2 min read.
• Deconfusing Landauer’s Principle (EuanMcLean, May 27, 2022, 5:58 PM). 58 points, 15 comments, 15 min read.
• How not to be a Naïve Computationalist (diegocaleiro, Apr 13, 2011, 7:45 PM). 39 points, 36 comments, 2 min read.
• Proof Explained for “Robust Agents Learn Causal World Model” (Dalcy, Dec 22, 2024, 3:06 PM). 25 points, 0 comments, 15 min read.

• Dense Math Notation (JK_Ravenclaw, Apr 1, 2011, 3:37 AM). 33 points, 23 comments, 1 min read.
• Understanding Selection Theorems (adamk, May 28, 2022, 1:49 AM). 41 points, 3 comments, 7 min read.
• AIS 101: Task decomposition for scalable oversight (Charbel-Raphaël, Jul 25, 2023, 1:34 PM). 27 points, 0 comments, 19 min read. (docs.google.com)
• Video Intro to Guaranteed Safe AI (Jul 11, 2024, 5:53 PM). 27 points, 0 comments, 1 min read. (youtu.be)
• Numeracy neglect—A personal postmortem (vlad.proex, Sep 27, 2020, 3:12 PM). 81 points, 29 comments, 9 min read.
• Paper digestion: “May We Have Your Attention Please? Human-Rights NGOs and the Problem of Global Communication” (Klara Helene Nielsen, Jul 20, 2023, 5:08 PM). 4 points, 1 comment, 2 min read. (journals.sagepub.com)
• Subdivisions for Useful Distillations? (Sharat Jacob Jacob, Jul 24, 2023, 6:55 PM). 9 points, 2 comments, 2 min read.
• DIY RLHF: A simple implementation for hands on experience (Jul 10, 2024, 12:07 PM). 28 points, 0 comments, 6 min read.
• Stampy’s AI Safety Info—New Distillations #4 [July 2023] (markov, Aug 16, 2023, 7:03 PM). 22 points, 10 comments, 1 min read. (aisafety.info)
• [Question] What AI Posts Do You Want Distilled? (brook, Aug 25, 2023, 9:01 AM). 11 points, 2 comments, 1 min read. (forum.effectivealtruism.org)
• Jan Kulveit’s Corrigibility Thoughts Distilled (brook, Aug 20, 2023, 5:52 PM). 22 points, 1 comment, 5 min read.
• Mesa-Optimization: Explain it like I’m 10 Edition (brook, Aug 26, 2023, 11:04 PM). 20 points, 1 comment, 6 min read.
• Moved from Moloch’s Toolbox: Discussion re style of latest Eliezer sequence (habryka, Nov 5, 2017, 2:22 AM). 7 points, 2 comments, 3 min read.
• Distillation of ‘Do language models plan for future tokens’ (TheManxLoiner, Jun 27, 2024, 8:57 PM). 26 points, 2 comments, 6 min read.
• Short Primers on Crucial Topics (lukeprog, May 31, 2012, 12:46 AM). 35 points, 24 comments, 1 min read.
• Distilled—AGI Safety from First Principles (Harrison G, May 29, 2022, 12:57 AM). 11 points, 1 comment, 14 min read.
• Graphical tensor notation for interpretability (Jordan Taylor, Oct 4, 2023, 8:04 AM). 141 points, 11 comments, 19 min read.