Takeaways from a survey on AI alignment resources

DanielFilan · Nov 5, 2022, 11:40 PM
73 points
10 comments · 6 min read · LW link · 1 review
(danielfilan.com)

Unpricable Information and Certificate Hell

eva_ · Nov 5, 2022, 10:56 PM
13 points
2 comments · 6 min read · LW link

Recommend HAIST resources for assessing the value of RLHF-related alignment research

Nov 5, 2022, 8:58 PM
26 points
9 comments · 3 min read · LW link

Instead of technical research, more people should focus on buying time

Nov 5, 2022, 8:43 PM
100 points
45 comments · 14 min read · LW link

Provably Honest—A First Step

Srijanak De · Nov 5, 2022, 7:18 PM
10 points
2 comments · 8 min read · LW link

Should AI focus on problem-solving or strategic planning? Why not both?

Oliver Siegel · Nov 5, 2022, 7:17 PM
−12 points
3 comments · 1 min read · LW link

How to store human values on a computer

Oliver Siegel · Nov 5, 2022, 7:17 PM
−12 points
17 comments · 1 min read · LW link

The Slippery Slope from DALLE-2 to Deepfake Anarchy

scasper · Nov 5, 2022, 2:53 PM
17 points
9 comments · 11 min read · LW link

When can a mimic surprise you? Why generative models handle seemingly ill-posed problems

David Johnston · Nov 5, 2022, 1:19 PM
8 points
4 comments · 16 min read · LW link

My summary of “Pragmatic AI Safety”

Eleni Angelou · Nov 5, 2022, 12:54 PM
3 points
0 comments · 5 min read · LW link

Review of the Challenge

SD Marlow · Nov 5, 2022, 6:38 AM
−14 points
5 comments · 2 min read · LW link

Spectrum of Independence

jefftk · Nov 5, 2022, 2:40 AM
43 points
7 comments · 1 min read · LW link
(www.jefftk.com)

[paper link] Interpreting systems as solving POMDPs: a step towards a formal understanding of agency

the gears to ascension · Nov 5, 2022, 1:06 AM
13 points
2 comments · 1 min read · LW link
(www.semanticscholar.org)

Metaculus is seeking Software Engineers

dschwarz · Nov 5, 2022, 12:42 AM
18 points
0 comments · 1 min read · LW link
(apply.workable.com)

Should we “go against nature”?

jasoncrawford · Nov 4, 2022, 10:14 PM
10 points
3 comments · 2 min read · LW link
(rootsofprogress.org)

How much should we care about non-human animals?

bokov · Nov 4, 2022, 9:36 PM
16 points
8 comments · 2 min read · LW link
(www.lesswrong.com)

For ELK truth is mostly a distraction

c.trout · Nov 4, 2022, 9:14 PM
44 points
0 comments · 21 min read · LW link

Toy Models and Tegum Products

Adam Jermyn · Nov 4, 2022, 6:51 PM
28 points
7 comments · 5 min read · LW link

Ethan Caballero on Broken Neural Scaling Laws, Deception, and Recursive Self Improvement

Nov 4, 2022, 6:09 PM
16 points
11 comments · 10 min read · LW link
(theinsideview.ai)

Follow up to medical miracle

Elizabeth · Nov 4, 2022, 6:00 PM
75 points
5 comments · 6 min read · LW link
(acesounderglass.com)

Cross-Void Optimization

pneumynym · Nov 4, 2022, 5:47 PM
1 point
1 comment · 8 min read · LW link

Monthly Shorts 10/22

Celer · Nov 4, 2022, 4:30 PM
12 points
0 comments · 6 min read · LW link
(keller.substack.com)

Weekly Roundup #4

Zvi · Nov 4, 2022, 3:00 PM
42 points
1 comment · 6 min read · LW link
(thezvi.wordpress.com)

A new place to discuss cognitive science, ethics and human alignment

Daniel_Friedrich · Nov 4, 2022, 2:34 PM
3 points
4 comments · 1 min read · LW link

A newcomer’s guide to the technical AI safety field

zeshen · Nov 4, 2022, 2:29 PM
42 points
3 comments · 10 min read · LW link

[Question] Are alignment researchers devoting enough time to improving their research capacity?

Carson Jones · Nov 4, 2022, 12:58 AM
13 points
3 comments · 3 min read · LW link

[Question] Don’t you think RLHF solves outer alignment?

Charbel-Raphaël · Nov 4, 2022, 12:36 AM
9 points
23 comments · 1 min read · LW link

Mechanistic Interpretability as Reverse Engineering (follow-up to “cars and elephants”)

David Scott Krueger (formerly: capybaralet) · Nov 3, 2022, 11:19 PM
28 points
3 comments · 1 min read · LW link

[Question] Could a Supreme Court suit work to solve NEPA problems?

ChristianKl · Nov 3, 2022, 9:10 PM
15 points
0 comments · 1 min read · LW link

[Video] How having Fast Fourier Transforms sooner could have helped with Nuclear Disarmament—Veritaserum

mako yass · Nov 3, 2022, 9:04 PM
17 points
1 comment · 1 min read · LW link

Further considerations on the Evidentialist’s Wager

Martín Soto · Nov 3, 2022, 8:06 PM
3 points
9 comments · 8 min read · LW link

AI as a Civilizational Risk Part 6/6: What can be done

PashaKamyshev · Nov 3, 2022, 7:48 PM
2 points
4 comments · 4 min read · LW link

A Mystery About High Dimensional Concept Encoding

Fabien Roger · Nov 3, 2022, 5:05 PM
46 points
13 comments · 7 min read · LW link

Why do we post our AI safety plans on the Internet?

Peter S. Park · Nov 3, 2022, 4:02 PM
4 points
4 comments · 11 min read · LW link

Multiple Deploy-Key Repos

jefftk · Nov 3, 2022, 3:10 PM
15 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Covid 11/3/22: Asking Forgiveness

Zvi · Nov 3, 2022, 1:50 PM
23 points
3 comments · 6 min read · LW link
(thezvi.wordpress.com)

Adversarial Policies Beat Professional-Level Go AIs

sanxiyn · Nov 3, 2022, 1:27 PM
31 points
35 comments · 1 min read · LW link
(goattack.alignmentfund.org)

K-types vs T-types — what priors do you have?

Cleo Nardo · Nov 3, 2022, 11:29 AM
74 points
25 comments · 7 min read · LW link

Information Markets 2: Optimally Shaped Reward Bets

eva_ · Nov 3, 2022, 11:08 AM
9 points
0 comments · 3 min read · LW link

The Rational Utilitarian Love Movement (A Historical Retrospective)

CBiddulph · Nov 3, 2022, 7:11 AM
3 points
0 comments · 1 min read · LW link

The Mirror Chamber: A short story exploring the anthropic measure function and why it can matter

mako yass · Nov 3, 2022, 6:47 AM
30 points
13 comments · 10 min read · LW link

Open Letter Against Reckless Nuclear Escalation and Use

Max Tegmark · Nov 3, 2022, 5:34 AM
27 points
25 comments · 1 min read · LW link

Lazy Python Argument Parsing

jefftk · Nov 3, 2022, 2:20 AM
20 points
3 comments · 1 min read · LW link
(www.jefftk.com)

AI as a Civilizational Risk Part 5/6: Relationship between C-risk and X-risk

PashaKamyshev · Nov 3, 2022, 2:19 AM
2 points
0 comments · 7 min read · LW link

[Question] Is there a good way to award a fixed prize in a prediction contest?

jchan · Nov 2, 2022, 9:37 PM
18 points
5 comments · 1 min read · LW link

“Are Experiments Possible?” Seeds of Science call for reviewers

rogersbacon · Nov 2, 2022, 8:05 PM
8 points
0 comments · 1 min read · LW link

Humans do acausal coordination all the time

Adam Jermyn · Nov 2, 2022, 2:40 PM
57 points
35 comments · 3 min read · LW link

Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm)

Davidmanheim · Nov 2, 2022, 12:57 PM
71 points
27 comments · 4 min read · LW link
(twitter.com)

Housing and Transit Thoughts #1

Zvi · Nov 2, 2022, 12:10 PM
35 points
5 comments · 16 min read · LW link
(thezvi.wordpress.com)

Mind is uncountable

Filip Sondej · Nov 2, 2022, 11:51 AM
18 points
22 comments · 1 min read · LW link