Defining Optimization in a Deeper Way Part 3

J Bostock · Jul 20, 2022, 10:06 PM
8 points
0 comments · 2 min read · LW link

Cognitive Risks of Adolescent Binge Drinking

Jul 20, 2022, 9:10 PM
70 points
12 comments · 10 min read · LW link
(acesounderglass.com)

Why AGI Timeline Research/Discourse Might Be Overrated

Noosphere89 · Jul 20, 2022, 8:26 PM
5 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Enlightenment Values in a Vulnerable World

Maxwell Tabarrok · Jul 20, 2022, 7:52 PM
15 points
6 comments · 31 min read · LW link
(maximumprogress.substack.com)

Countering arguments against working on AI safety

Rauno Arike · Jul 20, 2022, 6:23 PM
7 points
2 comments · 7 min read · LW link

A Short Intro to Humans

Ben Amitay · Jul 20, 2022, 3:28 PM
1 point
1 comment · 7 min read · LW link

How to Diversify Conceptual Alignment: the Model Behind Refine

adamShimi · Jul 20, 2022, 10:44 AM
87 points
11 comments · 8 min read · LW link

[Question] What are the simplest questions in applied rationality where you don’t know the answer to?

ChristianKl · Jul 20, 2022, 9:53 AM
26 points
11 comments · 1 min read · LW link

AI Safety Cheatsheet / Quick Reference

Zohar Jackson · Jul 20, 2022, 9:39 AM
3 points
0 comments · 1 min read · LW link
(github.com)

Getting Unstuck on Counterfactuals

Chris_Leong · Jul 20, 2022, 5:31 AM
7 points
1 comment · 2 min read · LW link

Pitfalls with Proofs

scasper · Jul 19, 2022, 10:21 PM
19 points
21 comments · 8 min read · LW link

A daily routine I do for my AI safety research work

scasper · Jul 19, 2022, 9:58 PM
22 points
7 comments · 1 min read · LW link

Progress links and tweets, 2022-07-19

jasoncrawford · Jul 19, 2022, 8:50 PM
11 points
1 comment · 1 min read · LW link
(rootsofprogress.org)

Applications are open for CFAR workshops in Prague this fall!

John Steidley · Jul 19, 2022, 6:29 PM
64 points
3 comments · 2 min read · LW link

Sexual Abuse attitudes might be infohazardous

Pseudonymous Otter · Jul 19, 2022, 6:06 PM
256 points
72 comments · 1 min read · LW link

Spending Update 2022

jefftk · Jul 19, 2022, 2:10 PM
28 points
0 comments · 3 min read · LW link
(www.jefftk.com)

Abram Demski’s ELK thoughts and proposal—distillation

Rubi J. Hudson · Jul 19, 2022, 6:57 AM
19 points
8 comments · 16 min read · LW link

Bounded complexity of solving ELK and its implications

Rubi J. Hudson · Jul 19, 2022, 6:56 AM
11 points
4 comments · 18 min read · LW link

Help ARC evaluate capabilities of current language models (still need people)

Beth Barnes · Jul 19, 2022, 4:55 AM
95 points
6 comments · 2 min read · LW link

A Critique of AI Alignment Pessimism

ExCeph · Jul 19, 2022, 2:28 AM
9 points
1 comment · 9 min read · LW link

Ars D&D.Sci: Mysteries of Mana Evaluation & Ruleset

aphyer · Jul 19, 2022, 2:06 AM
33 points
4 comments · 5 min read · LW link

Marburg Virus Pandemic Prediction Checklist

DirectedEvolution · Jul 18, 2022, 11:15 PM
30 points
0 comments · 5 min read · LW link

At what point will we know if Eliezer’s predictions are right or wrong?

anonymous123456 · Jul 18, 2022, 10:06 PM
5 points
6 comments · 1 min read · LW link

Modelling Deception

Garrett Baker · Jul 18, 2022, 9:21 PM
15 points
0 comments · 7 min read · LW link

Are Intelligence and Generality Orthogonal?

cubefox · Jul 18, 2022, 8:07 PM
18 points
16 comments · 1 min read · LW link

Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya Cotra · Jul 18, 2022, 7:06 PM
368 points
95 comments · 75 min read · LW link · 1 review

Turning Some Inconsistent Preferences into Consistent Ones

niplav · Jul 18, 2022, 6:40 PM
23 points
5 comments · 12 min read · LW link

Addendum: A non-magical explanation of Jeffrey Epstein

lc · Jul 18, 2022, 5:40 PM
81 points
21 comments · 11 min read · LW link

Launching a new progress institute, seeking a CEO

jasoncrawford · Jul 18, 2022, 4:58 PM
25 points
2 comments · 3 min read · LW link
(rootsofprogress.org)

Machine Learning Model Sizes and the Parameter Gap [abridged]

Pablo Villalobos · Jul 18, 2022, 4:51 PM
20 points
0 comments · 1 min read · LW link
(epochai.org)

Quantilizers and Generative Models

Adam Jermyn · Jul 18, 2022, 4:32 PM
24 points
5 comments · 4 min read · LW link

AI Hiroshima (Does A Vivid Example Of Destruction Forestall Apocalypse?)

Sable · Jul 18, 2022, 12:06 PM
4 points
4 comments · 2 min read · LW link

How the ---- did Feynman Get Here !?

George3d6 · Jul 18, 2022, 9:43 AM
8 points
8 comments · 3 min read · LW link
(www.epistem.ink)

Conditioning Generative Models for Alignment

Jozdien · Jul 18, 2022, 7:11 AM
60 points
8 comments · 20 min read · LW link

Training goals for large language models

Johannes Treutlein · Jul 18, 2022, 7:09 AM
28 points
5 comments · 19 min read · LW link

A distillation of Evan Hubinger’s training stories (for SERI MATS)

Daphne_W · Jul 18, 2022, 3:38 AM
15 points
1 comment · 10 min read · LW link

Forecasting ML Benchmarks in 2023

jsteinhardt · Jul 18, 2022, 2:50 AM
36 points
20 comments · 12 min read · LW link
(bounded-regret.ghost.io)

What should you change in response to an “emergency”? And AI risk

AnnaSalamon · Jul 18, 2022, 1:11 AM
338 points
60 comments · 6 min read · LW link · 1 review

Deception?! I ain’t got time for that!

Paul Colognese · Jul 18, 2022, 12:06 AM
55 points
5 comments · 13 min read · LW link

How Interpretability can be Impactful

Connall Garrod · Jul 18, 2022, 12:06 AM
18 points
0 comments · 37 min read · LW link

Why you might expect homogeneous take-off: evidence from ML research

Andrei Alexandru · Jul 17, 2022, 8:31 PM
24 points
0 comments · 10 min read · LW link

Examples of AI Increasing AI Progress

TW123 · Jul 17, 2022, 8:06 PM
107 points
14 comments · 1 min read · LW link

Four questions I ask AI safety researchers

Orpheus16 · Jul 17, 2022, 5:25 PM
17 points
0 comments · 1 min read · LW link

Why I Think Abrupt AI Takeoff

lincolnquirk · Jul 17, 2022, 5:04 PM
14 points
6 comments · 1 min read · LW link

Culture wars in riddle format

Malmesbury · Jul 17, 2022, 2:51 PM
7 points
28 comments · 3 min read · LW link

Bangalore LW/ACX Meetup in person

Vyakart · Jul 17, 2022, 6:53 AM
1 point
0 comments · 1 min read · LW link

Resolve Cycles

CFAR!Duncan · Jul 16, 2022, 11:17 PM
140 points
8 comments · 10 min read · LW link

Alignment as Game Design

Shoshannah Tekofsky · Jul 16, 2022, 10:36 PM
11 points
7 comments · 2 min read · LW link

Risk Management from a Climbers Perspective

Annapurna · Jul 16, 2022, 9:14 PM
5 points
0 comments · 6 min read · LW link
(jorgevelez.substack.com)

Cognitive Instability, Physicalism, and Free Will

dadadarren · Jul 16, 2022, 1:13 PM
5 points
27 comments · 2 min read · LW link
(www.sleepingbeautyproblem.com)