Simulators

janus, Sep 2, 2022, 12:45 PM
631 points
168 comments, 41 min read, LW link, 8 reviews
(generative.ink)

The Redaction Machine

Ben, Sep 20, 2022, 10:03 PM
504 points
48 comments, 27 min read, LW link, 1 review

Losing the root for the tree

Adam Zerner, Sep 20, 2022, 4:53 AM
480 points
31 comments, 9 min read, LW link, 1 review

You Are Not Measuring What You Think You Are Measuring

johnswentworth, Sep 20, 2022, 8:04 PM
407 points
44 comments, 8 min read, LW link, 2 reviews

Why I think strong general AI is coming soon

porby, Sep 28, 2022, 5:40 AM
337 points
141 comments, 34 min read, LW link, 1 review

The shard theory of human values

Sep 4, 2022, 4:28 AM
255 points
67 comments, 24 min read, LW link, 2 reviews

Announcing Balsa Research

Zvi, Sep 25, 2022, 10:50 PM
235 points
64 comments, 2 min read, LW link, 1 review
(thezvi.wordpress.com)

How I buy things when Lightcone wants them fast

Bird Concept, Sep 26, 2022, 5:02 AM
223 points
21 comments, 8 min read, LW link

How my team at Lightcone sometimes gets stuff done

Bird Concept, Sep 19, 2022, 5:47 AM
192 points
43 comments, 7 min read, LW link, 1 review

7 traps that (we think) new alignment researchers often fall into

Sep 27, 2022, 11:13 PM
176 points
10 comments, 4 min read, LW link

Do bamboos set themselves on fire?

Malmesbury, Sep 19, 2022, 3:34 PM
170 points
14 comments, 6 min read, LW link, 1 review

Most People Start With The Same Few Bad Ideas

johnswentworth, Sep 9, 2022, 12:29 AM
165 points
30 comments, 3 min read, LW link

Threat-Resistant Bargaining Megapost: Introducing the ROSE Value

Diffractor, Sep 28, 2022, 1:20 AM
162 points
19 comments, 53 min read, LW link, 2 reviews

The Onion Test for Personal and Institutional Honesty

Sep 27, 2022, 3:26 PM
162 points
31 comments, 3 min read, LW link, 3 reviews

Public-facing Censorship Is Safety Theater, Causing Reputational Damage

Yitz, Sep 23, 2022, 5:08 AM
149 points
42 comments, 6 min read, LW link

AI coordination needs clear wins

evhub, Sep 1, 2022, 11:41 PM
147 points
16 comments, 2 min read, LW link, 1 review

Interpreting Neural Networks through the Polytope Lens

Sep 23, 2022, 5:58 PM
144 points
29 comments, 33 min read, LW link

Takeaways from our robust injury classifier project [Redwood Research]

dmz, Sep 17, 2022, 3:55 AM
143 points
12 comments, 6 min read, LW link, 1 review

Understanding Infra-Bayesianism: A Beginner-Friendly Video Series

Sep 22, 2022, 1:25 PM
140 points
6 comments, 2 min read, LW link

Monitoring for deceptive alignment

evhub, Sep 8, 2022, 11:07 PM
135 points
8 comments, 9 min read, LW link

Orexin and the quest for more waking hours

ChristianKl, Sep 24, 2022, 7:54 PM
131 points
39 comments, 5 min read, LW link

Gene drives: why the wait?

Metacelsus, Sep 19, 2022, 11:37 PM
125 points
50 comments, 3 min read, LW link
(denovo.substack.com)

An Update on Academia vs. Industry (one year into my faculty job)

David Scott Krueger (formerly: capybaralet), Sep 3, 2022, 8:43 PM
122 points
18 comments, 4 min read, LW link

Quintin’s alignment papers roundup—week 1

Quintin Pope, Sep 10, 2022, 6:39 AM
120 points
6 comments, 9 min read, LW link

LW Petrov Day 2022 (Monday, 9/26)

Ruby, Sep 22, 2022, 2:56 AM
120 points
111 comments, 5 min read, LW link

Announcing $5,000 bounty for (responsibly) ending malaria

lc, Sep 24, 2022, 4:28 AM
116 points
40 comments, 4 min read, LW link

Rejected Early Drafts of Newcomb’s Problem

zahmahkibo, Sep 6, 2022, 7:04 PM
113 points
5 comments, 3 min read, LW link

Understanding Conjecture: Notes from Connor Leahy interview

Orpheus16, Sep 15, 2022, 6:37 PM
107 points
23 comments, 15 min read, LW link

Petrov Day Retrospective: 2022

Ruby, Sep 28, 2022, 10:16 PM
107 points
41 comments, 4 min read, LW link

Funding is All You Need: Getting into Grad School by Hacking the NSF GRFP Fellowship

hapanin, Sep 22, 2022, 9:39 PM
106 points
9 comments, 12 min read, LW link

My emotional reaction to the current funding situation

Sam F. Brown, Sep 9, 2022, 10:02 PM
105 points
36 comments, 5 min read, LW link
(sambrown.eu)

Ukraine Post #12

Zvi, Sep 22, 2022, 2:40 PM
104 points
3 comments, 16 min read, LW link
(thezvi.wordpress.com)

Evaluations project @ ARC is hiring a researcher and a webdev/engineer

Beth Barnes, Sep 9, 2022, 10:46 PM
99 points
7 comments, 10 min read, LW link

[Linkpost] A survey on over 300 works about interpretability in deep networks

scasper, Sep 12, 2022, 7:07 PM
97 points
7 comments, 2 min read, LW link
(arxiv.org)

Inverse Scaling Prize: Round 1 Winners

Sep 26, 2022, 7:57 PM
93 points
16 comments, 4 min read, LW link
(irmckenzie.co.uk)

The ethics of reclining airplane seats

braces, Sep 4, 2022, 5:59 PM
93 points
70 comments, 1 min read, LW link

Linkpost: Github Copilot productivity experiment

Daniel Kokotajlo, Sep 8, 2022, 4:41 AM
88 points
4 comments, 1 min read, LW link
(github.blog)

Why we’re not founding a human-data-for-alignment org

Sep 27, 2022, 8:14 PM
88 points
6 comments, 29 min read, LW link
(forum.effectivealtruism.org)

Let’s Terraform West Texas

blackstampede, Sep 4, 2022, 4:24 PM
87 points
33 comments, 5 min read, LW link

Nearcast-based “deployment problem” analysis

HoldenKarnofsky, Sep 21, 2022, 6:52 PM
85 points
2 comments, 26 min read, LW link

Towards deconfusing wireheading and reward maximization

leogao, Sep 21, 2022, 12:36 AM
81 points
7 comments, 4 min read, LW link

Dath Ilan’s Views on Stopgap Corrigibility

David Udell, Sep 22, 2022, 4:16 PM
78 points
19 comments, 13 min read, LW link
(www.glowfic.com)

AI Safety and Neighboring Communities: A Quick-Start Guide, as of Summer 2022

Sam Bowman, Sep 1, 2022, 7:15 PM
76 points
2 comments, 7 min read, LW link

Bugs or Features?

qbolec, Sep 3, 2022, 7:04 AM
73 points
9 comments, 2 min read, LW link

Builder/Breaker for Deconfusion

abramdemski, Sep 29, 2022, 5:36 PM
72 points
9 comments, 9 min read, LW link

Alignment Org Cheat Sheet

Sep 20, 2022, 5:36 PM
70 points
8 comments, 4 min read, LW link

Toy Models of Superposition

evhub, Sep 21, 2022, 11:48 PM
69 points
4 comments, 5 min read, LW link, 1 review
(transformer-circuits.pub)

Solar Blackout Resistance

jefftk, Sep 8, 2022, 1:30 PM
69 points
32 comments, 3 min read, LW link
(www.jefftk.com)

Ambiguity in Prediction Market Resolution is Harmful

aphyer, Sep 26, 2022, 4:22 PM
69 points
17 comments, 5 min read, LW link

Path dependence in ML inductive biases

Sep 10, 2022, 1:38 AM
68 points
13 comments, 10 min read, LW link