Shard Theory in Nine Theses: a Distillation and Critical Appraisal

LawrenceC · Dec 19, 2022, 10:52 PM
150 points
30 comments · 18 min read · LW link

[Question] Will research in AI risk jinx it? Consequences of training AI on AI risk arguments

Yann Dubois · Dec 19, 2022, 10:42 PM
5 points
6 comments · 1 min read · LW link

AGI Timelines in Governance: Different Strategies for Different Timeframes

Dec 19, 2022, 9:31 PM
65 points
28 comments · 10 min read · LW link

Towards Hodge-podge Alignment

Cleo Nardo · Dec 19, 2022, 8:12 PM
95 points
30 comments · 9 min read · LW link

Computational signatures of psychopathy

Cameron Berg · Dec 19, 2022, 5:01 PM
30 points
3 comments · 20 min read · LW link

Results from a survey on tool use and workflows in alignment research

Dec 19, 2022, 3:19 PM
79 points
2 comments · 19 min read · LW link

Does ChatGPT’s performance warrant working on a tutor for children? [It’s time to take it to the lab.]

Bill Benzon · Dec 19, 2022, 3:12 PM
13 points
5 comments · 4 min read · LW link
(new-savanna.blogspot.com)

Conditions for Superrationality-motivated Cooperation in a one-shot Prisoner’s Dilemma

Jim Buhler · Dec 19, 2022, 3:00 PM
24 points
4 comments · 5 min read · LW link

Next Level Seinfeld

Zvi · Dec 19, 2022, 1:30 PM
50 points
8 comments · 1 min read · LW link
(thezvi.wordpress.com)

CEA Disambiguation

jefftk · Dec 19, 2022, 1:20 PM
25 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Why mechanistic interpretability does not and cannot contribute to long-term AGI safety (from messages with a friend)

Remmelt · Dec 19, 2022, 12:02 PM
−3 points
9 comments · 31 min read · LW link

Hacker-AI and Cyberwar 2.0+

Erland Wittkotter · Dec 19, 2022, 11:46 AM
2 points
0 comments · 15 min read · LW link

Non-Technical Preparation for Hacker-AI and Cyberwar 2.0+

Erland Wittkotter · Dec 19, 2022, 11:42 AM
2 points
0 comments · 25 min read · LW link

An Effective Grab Bag

stavros · Dec 19, 2022, 10:29 AM
28 points
2 comments · 7 min read · LW link

Slick hyperfinite Ramsey theory proof

Alok Singh · Dec 19, 2022, 8:40 AM
8 points
3 comments · 1 min read · LW link
(alok.github.io)

The True Spirit of Solstice?

Raemon · Dec 19, 2022, 8:00 AM
69 points
31 comments · 9 min read · LW link

The Risk of Orbital Debris and One (Cheap) Way to Mitigate It

clans · Dec 19, 2022, 3:16 AM
13 points
1 comment · 4 min read · LW link
(locationtbd.home.blog)

Why I think that teaching philosophy is high impact

Eleni Angelou · Dec 19, 2022, 3:11 AM
5 points
0 comments · 2 min read · LW link

A template for doing annual reviews

peterslattery · Dec 19, 2022, 3:09 AM
2 points
0 comments · 1 min read · LW link

Event [Berkeley]: Alignment Collaborator Speed-Meeting

Dec 19, 2022, 2:24 AM
18 points
2 comments · 1 min read · LW link

An easier(?) end to the electoral college

ejacob · Dec 19, 2022, 2:09 AM
2 points
2 comments · 2 min read · LW link

How Death Feels

sisyphus · Dec 18, 2022, 11:47 PM
−7 points
9 comments · 1 min read · LW link

Why Are Women Hot?

Jacob Falkovich · Dec 18, 2022, 11:20 PM
17 points
19 comments · 11 min read · LW link

[Question] Can we, in principle, know the measure of counterfactual quantum branches?

sisyphus · Dec 18, 2022, 10:07 PM
1 point
15 comments · 1 min read · LW link

Boston Solstice 2022 Retrospective

jefftk · Dec 18, 2022, 7:00 PM
19 points
3 comments · 5 min read · LW link
(www.jefftk.com)

Take 11: “Aligning language models” should be weirder.

Charlie Steiner · Dec 18, 2022, 2:14 PM
34 points
0 comments · 2 min read · LW link

Bad at Arithmetic, Promising at Math

cohenmacaulay · Dec 18, 2022, 5:40 AM
100 points
19 comments · 20 min read · LW link · 1 review

Overconfidence bubbles

kaputmi · Dec 18, 2022, 2:07 AM
3 points
0 comments · 2 min read · LW link

Positive values seem more robust and lasting than prohibitions

TurnTrout · Dec 17, 2022, 9:43 PM
52 points
13 comments · 2 min read · LW link

What we owe the microbiome

weverka · Dec 17, 2022, 7:40 PM
2 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Why write more: improve your epistemics, self-care, & 28 other reasons

KatWoods · Dec 17, 2022, 7:25 PM
24 points
1 comment · 6 min read · LW link

Looking for an alignment tutor

JanB · Dec 17, 2022, 7:08 PM
15 points
2 comments · 1 min read · LW link

[Question] How to Convince my Son that Drugs are Bad

concerned_dad · Dec 17, 2022, 6:47 PM
140 points
84 comments · 2 min read · LW link

Ordinary human life

David Hugh-Jones · Dec 17, 2022, 4:46 PM
24 points
3 comments · 14 min read · LW link
(wyclif.substack.com)

Predictive Processing, Heterosexuality and Delusions of Grandeur

lsusr · Dec 17, 2022, 7:37 AM
37 points
13 comments · 5 min read · LW link

[Link] Escape the Echo Chamber (2018)

CronoDAS · Dec 17, 2022, 6:14 AM
13 points
0 comments · 2 min read · LW link
(aeon.co)

“Starry Night” Solstice Cookies

maia · Dec 17, 2022, 5:31 AM
26 points
7 comments · 1 min read · LW link

There have been 3 planes (billionaire donors) and 2 have crashed

trevor · Dec 17, 2022, 3:58 AM
16 points
10 comments · 2 min read · LW link

[Question] What about non-degree seeking?

Lao Mein · Dec 17, 2022, 2:22 AM
5 points
5 comments · 1 min read · LW link

Using Information Theory to tackle AI Alignment: A Practical Approach

Daniel Salami · Dec 17, 2022, 1:37 AM
10 points
4 comments · 7 min read · LW link

Paper: Constitutional AI: Harmlessness from AI Feedback (Anthropic)

LawrenceC · Dec 16, 2022, 10:12 PM
68 points
11 comments · 1 min read · LW link
(www.anthropic.com)

Vaguely interested in Effective Altruism? Please Take the Official 2022 EA Survey

Peter Wildeford · Dec 16, 2022, 9:07 PM
22 points
4 comments · 1 min read · LW link
(rethinkpriorities.qualtrics.com)

Abstract concepts and metalingual definition: Does ChatGPT understand justice and charity?

Bill Benzon · Dec 16, 2022, 9:01 PM
2 points
0 comments · 13 min read · LW link

Beyond the moment of invention

jasoncrawford · Dec 16, 2022, 8:18 PM
35 points
0 comments · 2 min read · LW link
(rootsofprogress.org)

[Question] What’s the best time-efficient alternative to the Sequences?

trevor · Dec 16, 2022, 8:17 PM
7 points
7 comments · 1 min read · LW link

Can we efficiently explain model behaviors?

paulfchristiano · Dec 16, 2022, 7:40 PM
64 points
3 comments · 9 min read · LW link
(ai-alignment.com)

Proper scoring rules don’t guarantee predicting fixed points

Dec 16, 2022, 6:22 PM
79 points
8 comments · 21 min read · LW link

A learned agent is not the same as a learning agent

Ben Amitay · Dec 16, 2022, 5:27 PM
4 points
5 comments · 4 min read · LW link

[Question] College Selection Advice for Technical Alignment

TempCollegeAsk · Dec 16, 2022, 5:11 PM
11 points
8 comments · 1 min read · LW link

How important are accurate AI timelines for the optimal spending schedule on AI risk interventions?

Tristan Cook · Dec 16, 2022, 4:05 PM
27 points
2 comments · LW link