Adversarial training, importance sampling, and anti-adversarial training for AI whistleblowing

Buck
Jun 2, 2022, 11:48 PM
42 points
0 comments
3 min read
LW link

The prototypical catastrophic AI action is getting root access to its datacenter

Buck
Jun 2, 2022, 11:46 PM
180 points
13 comments
2 min read
LW link
1 review

Confused why a “capabilities research is good for alignment progress” position isn’t discussed more

Kaj_Sotala
Jun 2, 2022, 9:41 PM
130 points
27 comments
4 min read
LW link

Announcing a contest: EA Criticism and Red Teaming

fin
Jun 2, 2022, 8:27 PM
17 points
1 comment
14 min read
LW link
(forum.effectivealtruism.org)

Fact post: project-based learning

dominicq
Jun 2, 2022, 8:18 PM
12 points
4 comments
3 min read
LW link

The case for using the term ‘steelmanning’ instead of ‘principle of charity’

ChristianKl
Jun 2, 2022, 7:24 PM
26 points
7 comments
3 min read
LW link

Covid 6/2/22: Declining to Respond

Zvi
Jun 2, 2022, 1:50 PM
55 points
10 comments
7 min read
LW link
(thezvi.wordpress.com)

The horror of what must, yet cannot, be true

Kaj_Sotala
Jun 2, 2022, 10:20 AM
52 points
18 comments
2 min read
LW link
(kajsotala.fi)

Paradigms of AI alignment: components and enablers

Vika
Jun 2, 2022, 6:19 AM
53 points
4 comments
8 min read
LW link

The Bio Anchors Forecast

Ansh Radhakrishnan
Jun 2, 2022, 1:32 AM
13 points
0 comments
3 min read
LW link

**Venue Changed** ACX Montreal Meetup Jun 18 2022

E
Jun 2, 2022, 12:43 AM
10 points
0 comments
1 min read
LW link

Public beliefs vs. Private beliefs

Eli Tyre
Jun 1, 2022, 9:33 PM
146 points
30 comments
5 min read
LW link

[Question] Probability that the President would win election against a random adult citizen?

Daniel Kokotajlo
Jun 1, 2022, 8:38 PM
15 points
26 comments
1 min read
LW link

Revisiting “Why Global Poverty”

jefftk
Jun 1, 2022, 8:20 PM
20 points
2 comments
3 min read
LW link
(www.jefftk.com)

[Question] What will happen when an all-reaching AGI starts attempting to fix human character flaws?

Michael Bright
Jun 1, 2022, 6:45 PM
1 point
6 comments
1 min read
LW link

[Question] Any prior work on mutiagent dynamics for continuous distributions over agents?

Quintin Pope
Jun 1, 2022, 6:12 PM
15 points
2 comments
1 min read
LW link

[Question] Formation via nucleation of boltzmann brains

Zeruel017
Jun 1, 2022, 6:05 PM
0 points
9 comments
1 min read
LW link

Halifax Rationality / EA Coworking Day

Jun 1, 2022, 5:47 PM
9 points
0 comments
1 min read
LW link

Machines vs Memes Part 3: Imitation and Memes

ceru23
Jun 1, 2022, 1:36 PM
7 points
0 comments
7 min read
LW link

Rationalism in an Age of Egregores

David Udell
Jun 1, 2022, 7:29 AM
14 points
11 comments
2 min read
LW link

Wielding civilization

dominicq
Jun 1, 2022, 7:11 AM
29 points
2 comments
2 min read
LW link

Machines vs. Memes 2: Memetically-Motivated Model Extensions

naterush
May 31, 2022, 10:03 PM
6 points
0 comments
4 min read
LW link

Machines vs Memes Part 1: AI Alignment and Memetics

Harriet Farlow
May 31, 2022, 10:03 PM
19 points
1 comment
6 min read
LW link

The Hard Intelligence Hypothesis and Its Bearing on Succession Induced Foom

DragonGod
May 31, 2022, 7:04 PM
10 points
7 comments
4 min read
LW link

Paper: Teaching GPT3 to express uncertainty in words

Owain_Evans
May 31, 2022, 1:27 PM
97 points
7 comments
4 min read
LW link

Effective Altruism Virtual Programs Jul-Aug 2022

Yve Nichols-Evans
May 31, 2022, 12:56 PM
1 point
0 comments
1 min read
LW link

[Question] What is the state of Chinese AI research?

Ratios
May 31, 2022, 10:05 AM
34 points
16 comments
1 min read
LW link

The Brain That Builds Itself

Jan
May 31, 2022, 9:42 AM
57 points
6 comments
8 min read
LW link
(universalprior.substack.com)

[Question] Is there any formal argument that climate change needs to more extreme weather events?

ChristianKl
May 31, 2022, 9:01 AM
8 points
8 comments
1 min read
LW link

Progress links and tweets, 2022-05-30

jasoncrawford
May 30, 2022, 11:20 PM
18 points
0 comments
1 min read
LW link
(rootsofprogress.org)

The Reverse Basilisk

Dunning K.
May 30, 2022, 11:10 PM
17 points
23 comments
2 min read
LW link

Deliberate Grieving

Raemon
May 30, 2022, 8:49 PM
188 points
16 comments
9 min read
LW link
2 reviews

Perform Tractable Research While Avoiding Capabilities Externalities [Pragmatic AI Safety #4]

May 30, 2022, 8:25 PM
51 points
3 comments
25 min read
LW link

[Question] A terrifying variant of Boltzmann’s brains problem

Zeruel017
May 30, 2022, 8:08 PM
5 points
12 comments
4 min read
LW link

Ceiling Air Purifier

jefftk
May 30, 2022, 7:20 PM
87 points
11 comments
2 min read
LW link
(www.jefftk.com)

Notion template for personal predictions

Arjun Yadav
May 30, 2022, 5:47 PM
1 point
0 comments
1 min read
LW link

Six Dimensions of Operational Adequacy in AGI Projects

Eliezer Yudkowsky
May 30, 2022, 5:00 PM
310 points
66 comments
13 min read
LW link
1 review

My SERI MATS Application

Daniel Paleka
May 30, 2022, 2:04 AM
16 points
0 comments
8 min read
LW link

Reshaping the AI Industry

Thane Ruthenis
May 29, 2022, 10:54 PM
147 points
35 comments
21 min read
LW link

The Unbearable Lightness of Web Vulnerabilities

aiiixiii
May 29, 2022, 9:13 PM
29 points
2 comments
1 min read
LW link
(www.theoreticalstructures.io)

Finding the Right Problem

tobot
May 29, 2022, 5:52 PM
8 points
0 comments
2 min read
LW link

The impact you might have working on AI safety

Fabien Roger
May 29, 2022, 4:31 PM
5 points
1 comment
4 min read
LW link

The Problem With The Current State of AGI Definitions

Yitz
May 29, 2022, 1:58 PM
40 points
22 comments
8 min read
LW link

[Question] Request for nice questions to think about while trying to sleep

oh54321
May 29, 2022, 1:47 PM
9 points
2 comments
1 min read
LW link

Will working here advance AGI? Help us not destroy the world!

Yonatan Cale
May 29, 2022, 11:42 AM
30 points
47 comments
1 min read
LW link

Passable Puppet

burmesetheater
May 29, 2022, 11:07 AM
6 points
1 comment
3 min read
LW link

Multiple AIs in boxes, evaluating each other’s alignment

Moebius314
May 29, 2022, 8:36 AM
8 points
0 comments
14 min read
LW link

[Question] How would you build Dath Ilan on earth?

Yair Halberstadt
May 29, 2022, 7:26 AM
35 points
29 comments
1 min read
LW link

Distributed Decisions

johnswentworth
May 29, 2022, 2:43 AM
66 points
6 comments
6 min read
LW link

Distilled—AGI Safety from First Principles

Harrison G
May 29, 2022, 12:57 AM
11 points
1 comment
14 min read
LW link