Broad Picture of Human Values

Thane Ruthenis · Aug 20, 2022, 7:42 PM
42 points
6 comments · 10 min read · LW link

What’s up with the bad Meta projects?

Yitz · Aug 18, 2022, 5:34 AM
42 points
29 comments · 1 min read · LW link

Interpretability Tools Are an Attack Channel

Thane Ruthenis · Aug 17, 2022, 6:47 PM
42 points
14 comments · 1 min read · LW link

Area under the curve, Eat Dirt, Broccoli Errors, Copernicus & Chaos

CFAR!Duncan · Aug 8, 2022, 8:17 AM
41 points
0 comments · 7 min read · LW link

Appendix: Hamming Questions

CFAR!Duncan · Aug 13, 2022, 8:07 AM
41 points
0 comments · 2 min read · LW link

[Question] Why are some problems Super Hard?

Gabriel Alfour · Aug 24, 2022, 5:58 PM
41 points
34 comments · 3 min read · LW link

Sequencing Intro

jefftk · Aug 29, 2022, 5:50 PM
39 points
3 comments · 5 min read · LW link
(www.jefftk.com)

Clapping Lower

jefftk · Aug 4, 2022, 2:10 AM
38 points
7 comments · 1 min read · LW link
(www.jefftk.com)

“Just hiring people” is sometimes still actually possible

lc · Aug 5, 2022, 9:44 PM
38 points
11 comments · 5 min read · LW link

Extreme Security

lc · Aug 15, 2022, 12:11 PM
38 points
6 comments · 5 min read · LW link

Team Shard Status Report

David Udell · Aug 9, 2022, 5:33 AM
38 points
8 comments · 3 min read · LW link

Convergence Towards World-Models: A Gears-Level Model

Thane Ruthenis · Aug 4, 2022, 11:31 PM
38 points
1 comment · 13 min read · LW link

Conditioning, Prompts, and Fine-Tuning

Adam Jermyn · Aug 17, 2022, 8:52 PM
38 points
9 comments · 4 min read · LW link

Dwarves & D.Sci: Data Fortress

aphyer · Aug 6, 2022, 6:24 PM
38 points
26 comments · 3 min read · LW link

How much alignment data will we need in the long run?

Jacob_Hilton · Aug 10, 2022, 9:39 PM
37 points
15 comments · 4 min read · LW link

On akrasia: starting at the bottom

seecrow · Aug 1, 2022, 4:08 AM
37 points
2 comments · 3 min read · LW link

What Makes A Good Measurement Device?

johnswentworth · Aug 24, 2022, 10:45 PM
37 points
7 comments · 2 min read · LW link

Inner Alignment via Superpowers

Aug 30, 2022, 8:01 PM
37 points
13 comments · 4 min read · LW link

Basin broadness depends on the size and number of orthogonal features

Aug 27, 2022, 5:29 PM
36 points
21 comments · 6 min read · LW link

What if we approach AI safety like a technical engineering safety problem

zeshen · Aug 20, 2022, 10:29 AM
36 points
4 comments · 7 min read · LW link

Covid 8/4/22: Rebound

Zvi · Aug 4, 2022, 11:20 AM
36 points
0 comments · 11 min read · LW link
(thezvi.wordpress.com)

Double Crux In A Box

Screwtape · Aug 26, 2022, 3:24 AM
35 points
6 comments · 6 min read · LW link

Cultivating Valiance

Shoshannah Tekofsky · Aug 13, 2022, 6:47 PM
35 points
4 comments · 4 min read · LW link

[Review] The Problem of Political Authority by Michael Huemer

Arjun Panickssery · Aug 25, 2022, 5:18 AM
35 points
22 comments · 12 min read · LW link
(arjunpanickssery.substack.com)

My advice on finding your own path

A Ray · Aug 6, 2022, 4:57 AM
35 points
3 comments · 3 min read · LW link

Appendix: Jargon Dictionary

CFAR!Duncan · Aug 13, 2022, 8:09 AM
34 points
5 comments · 21 min read · LW link

(Summary) Sequence Highlights—Thinking Better on Purpose

qazzquimby · Aug 2, 2022, 5:45 PM
33 points
3 comments · 11 min read · LW link

Broad Basins and Data Compression

Aug 8, 2022, 8:33 PM
33 points
6 comments · 7 min read · LW link

Shapes of Mind and Pluralism in Alignment

adamShimi · Aug 13, 2022, 10:01 AM
33 points
2 comments · 2 min read · LW link

Covid 8/25/22: What We Owe

Zvi · Aug 25, 2022, 2:40 PM
33 points
3 comments · 19 min read · LW link
(thezvi.wordpress.com)

Encultured AI Pre-planning, Part 2: Providing a Service

Aug 11, 2022, 8:11 PM
33 points
4 comments · 3 min read · LW link

What does moral progress consist of?

jasoncrawford · Aug 19, 2022, 12:22 AM
32 points
23 comments · 2 min read · LW link
(forum.effectivealtruism.org)

How I think about alignment

Linda Linsefors · Aug 13, 2022, 10:01 AM
31 points
11 comments · 5 min read · LW link

Epistemic Artefacts of (conceptual) AI alignment research

Aug 19, 2022, 5:18 PM
31 points
1 comment · 5 min read · LW link

Why I Am Skeptical of AI Regulation as an X-Risk Mitigation Strategy

A Ray · Aug 6, 2022, 5:46 AM
31 points
14 comments · 2 min read · LW link

Dissent Collusion

Screwtape · Aug 10, 2022, 2:43 AM
30 points
7 comments · 3 min read · LW link

Gears-Level Understanding, Deliberate Performance, The Strategic Level

CFAR!Duncan · Aug 5, 2022, 5:11 PM
30 points
3 comments · 5 min read · LW link

Breaking down the training/deployment dichotomy

Erik Jenner · Aug 28, 2022, 9:45 PM
30 points
3 comments · 3 min read · LW link

Concrete Advice for Forming Inside Views on AI Safety

Neel Nanda · Aug 17, 2022, 10:02 PM
30 points
6 comments · 10 min read · LW link

More Clothes Over Time?

jefftk · Aug 28, 2022, 8:30 PM
30 points
1 comment · 1 min read · LW link
(www.jefftk.com)

[Question] Can we get full audio for Eliezer’s conversation with Sam Harris?

JakubK · Aug 7, 2022, 8:35 PM
30 points
8 comments · 1 min read · LW link

AI Risk in Terms of Unstable Nuclear Software

Thane Ruthenis · Aug 26, 2022, 6:49 PM
30 points
1 comment · 6 min read · LW link

Pendulums, Policy-Level Decisionmaking, Saving State

CFAR!Duncan · Aug 11, 2022, 4:47 PM
30 points
3 comments · 8 min read · LW link

Against population ethics

jasoncrawford · Aug 16, 2022, 5:19 AM
29 points
39 comments · 3 min read · LW link

[Question] Ways to increase working memory, and/or cope with low working memory?

Nicholas / Heather Kross · Aug 21, 2022, 10:31 PM
29 points
18 comments · 1 min read · LW link

Robert Long On Why Artificial Sentience Might Matter

Michaël Trazzi · Aug 28, 2022, 5:30 PM
29 points
5 comments · 5 min read · LW link
(theinsideview.ai)

Troll Timers

Screwtape · Aug 12, 2022, 12:55 AM
29 points
13 comments · 4 min read · LW link

How Do We Align an AGI Without Getting Socially Engineered? (Hint: Box It)

Aug 10, 2022, 6:14 PM
28 points
30 comments · 11 min read · LW link

Seeking Student Submissions: Edit Your Source Code Contest

Aris · Aug 26, 2022, 2:08 AM
28 points
5 comments · 2 min read · LW link

Pivotal acts using an unaligned AGI?

Simon Fischer · Aug 21, 2022, 5:13 PM
28 points
3 comments · 7 min read · LW link