How could we know that an AGI system will have good consequences?

So8res · Nov 7, 2022, 10:42 PM
111 points
25 comments · 5 min read · LW link

A Walkthrough of Interpretability in the Wild (w/ authors Kevin Wang, Arthur Conmy & Alexandre Variengien)

Neel Nanda · Nov 7, 2022, 10:39 PM
30 points
15 comments · 3 min read · LW link
(youtu.be)

Intercept article about lab accidents

ChristianKl · Nov 7, 2022, 9:10 PM
23 points
9 comments · 1 min read · LW link
(theintercept.com)

The biological function of love for non-kin is to gain the trust of people we cannot deceive

chaosmage · Nov 7, 2022, 8:26 PM
43 points
3 comments · 8 min read · LW link

Distillation Experiment: Chunk-Knitting

DirectedEvolution · Nov 7, 2022, 7:56 PM
10 points
3 comments · 6 min read · LW link

Thinking About Mastodon

jefftk · Nov 7, 2022, 7:40 PM
33 points
17 comments · 1 min read · LW link
(www.jefftk.com)

[Question] Ideas for tiny research projects related to rationality?

Frej · Nov 7, 2022, 6:45 PM
3 points
1 comment · 1 min read · LW link

Loss of control of AI is not a likely source of AI x-risk

squek · Nov 7, 2022, 6:44 PM
−6 points
0 comments · 5 min read · LW link

AI Safety Unconference NeurIPS 2022

Orpheus · Nov 7, 2022, 3:39 PM
25 points
0 comments · 1 min read · LW link
(aisafetyevents.org)

Hacker-AI – Does it already exist?

Erland Wittkotter · Nov 7, 2022, 2:01 PM
3 points
13 comments · 11 min read · LW link

What’s the Deal with Elon Musk and Twitter?

Zvi · Nov 7, 2022, 1:50 PM
60 points
13 comments · 31 min read · LW link
(thezvi.wordpress.com)

How to Make Easy Decisions

lynettebye · Nov 7, 2022, 1:17 PM
17 points
3 comments · 2 min read · LW link

Opportunities that surprised us during our Clearer Thinking Regrants program

spencerg · Nov 7, 2022, 1:09 PM
20 points
0 comments · 1 min read · LW link

4 Key Assumptions in AI Safety

Prometheus · Nov 7, 2022, 10:50 AM
20 points
5 comments · 7 min read · LW link

Google Search as a Washed Up Service Dog: “I HALP!”

Shmi · Nov 7, 2022, 7:02 AM
20 points
8 comments · 1 min read · LW link

[Book Review] “Station Eleven” by Emily St. John Mandel

lsusr · Nov 7, 2022, 5:56 AM
17 points
1 comment · 1 min read · LW link

Counterfactability

Scott Garrabrant · Nov 7, 2022, 5:39 AM
40 points
5 comments · 11 min read · LW link

2022 LessWrong Census?

SurfingOrca · Nov 7, 2022, 5:16 AM
67 points
13 comments · 1 min read · LW link

A philosopher’s critique of RLHF

TW123 · Nov 7, 2022, 2:42 AM
55 points
8 comments · 2 min read · LW link

[Question] Is there any discussion on avoiding being Dutch-booked or otherwise taken advantage of one’s bounded rationality by refusing to engage?

Shmi · Nov 7, 2022, 2:36 AM
38 points
29 comments · 1 min read · LW link

Exams-Only Universities

Mati_Roy · Nov 6, 2022, 10:05 PM
80 points
40 comments · 2 min read · LW link

Democracy Is in Danger, but Not for the Reasons You Think

ExCeph · Nov 6, 2022, 9:15 PM
−7 points
4 comments · 12 min read · LW link
(ginnungagapfoundation.wordpress.com)

Playground Game: Monster

jefftk · Nov 6, 2022, 4:00 PM
14 points
4 comments · 1 min read · LW link
(www.jefftk.com)

[Question] Has Pascal’s Mugging problem been completely solved yet?

EniScien · Nov 6, 2022, 12:52 PM
3 points
11 comments · 1 min read · LW link

[Question] Should I Pursue a PhD?

DragonGod · Nov 6, 2022, 10:58 AM
8 points
8 comments · 2 min read · LW link

You won’t solve alignment without agent foundations

Mikhail Samin · Nov 6, 2022, 8:07 AM
27 points
3 comments · 8 min read · LW link

Word-Distance vs Idea-Distance: The Case for Lanoitaring

Sable · Nov 6, 2022, 5:25 AM
7 points
7 comments · 7 min read · LW link
(affablyevil.substack.com)

Apple Cider Syrup

jefftk · Nov 6, 2022, 2:10 AM
11 points
6 comments · 1 min read · LW link
(www.jefftk.com)

What is epigenetics?

Metacelsus · Nov 6, 2022, 1:24 AM
78 points
4 comments · 6 min read · LW link
(denovo.substack.com)

Response

Jarred Filmer · Nov 6, 2022, 1:03 AM
29 points
2 comments · 12 min read · LW link

[Question] Has anyone increased their AGI timelines?

Darren McKee · Nov 6, 2022, 12:03 AM
38 points
12 comments · 1 min read · LW link

Takeaways from a survey on AI alignment resources

DanielFilan · Nov 5, 2022, 11:40 PM
73 points
10 comments · 6 min read · LW link · 1 review
(danielfilan.com)

Unpricable Information and Certificate Hell

eva_ · Nov 5, 2022, 10:56 PM
13 points
2 comments · 6 min read · LW link

Recommend HAIST resources for assessing the value of RLHF-related alignment research

Nov 5, 2022, 8:58 PM
26 points
9 comments · 3 min read · LW link

Instead of technical research, more people should focus on buying time

Nov 5, 2022, 8:43 PM
100 points
45 comments · 14 min read · LW link

Provably Honest—A First Step

Srijanak De · Nov 5, 2022, 7:18 PM
10 points
2 comments · 8 min read · LW link

Should AI focus on problem-solving or strategic planning? Why not both?

Oliver Siegel · Nov 5, 2022, 7:17 PM
−12 points
3 comments · 1 min read · LW link

How to store human values on a computer

Oliver Siegel · Nov 5, 2022, 7:17 PM
−12 points
17 comments · 1 min read · LW link

The Slippery Slope from DALLE-2 to Deepfake Anarchy

scasper · Nov 5, 2022, 2:53 PM
17 points
9 comments · 11 min read · LW link

When can a mimic surprise you? Why generative models handle seemingly ill-posed problems

David Johnston · Nov 5, 2022, 1:19 PM
8 points
4 comments · 16 min read · LW link

My summary of “Pragmatic AI Safety”

Eleni Angelou · Nov 5, 2022, 12:54 PM
3 points
0 comments · 5 min read · LW link

Review of the Challenge

SD Marlow · Nov 5, 2022, 6:38 AM
−14 points
5 comments · 2 min read · LW link

Spectrum of Independence

jefftk · Nov 5, 2022, 2:40 AM
43 points
7 comments · 1 min read · LW link
(www.jefftk.com)

[paper link] Interpreting systems as solving POMDPs: a step towards a formal understanding of agency

the gears to ascension · Nov 5, 2022, 1:06 AM
13 points
2 comments · 1 min read · LW link
(www.semanticscholar.org)

Metaculus is seeking Software Engineers

dschwarz · Nov 5, 2022, 12:42 AM
18 points
0 comments · 1 min read · LW link
(apply.workable.com)

Should we “go against nature”?

jasoncrawford · Nov 4, 2022, 10:14 PM
10 points
3 comments · 2 min read · LW link
(rootsofprogress.org)

How much should we care about non-human animals?

bokov · Nov 4, 2022, 9:36 PM
16 points
8 comments · 2 min read · LW link
(www.lesswrong.com)

For ELK truth is mostly a distraction

c.trout · Nov 4, 2022, 9:14 PM
44 points
0 comments · 21 min read · LW link

Toy Models and Tegum Products

Adam Jermyn · Nov 4, 2022, 6:51 PM
28 points
7 comments · 5 min read · LW link

Ethan Caballero on Broken Neural Scaling Laws, Deception, and Recursive Self Improvement

Nov 4, 2022, 6:09 PM
16 points
11 comments · 10 min read · LW link
(theinsideview.ai)