Conditioning Generative Models with Restrictions

Adam Jermyn · Jul 21, 2022, 8:33 PM
18 points
4 comments · 8 min read · LW link

Our Existing Solutions to AGI Alignment (semi-safe)

Michael Soareverix · Jul 21, 2022, 7:00 PM
12 points
1 comment · 3 min read · LW link

Changing the world through slack & hobbies

Steven Byrnes · Jul 21, 2022, 6:11 PM
261 points
13 comments · 10 min read · LW link

Which personalities do we find intolerable?

weathersystems · Jul 21, 2022, 3:56 PM
10 points
3 comments · 6 min read · LW link

YouTubeTV and Spoilers

Zvi · Jul 21, 2022, 1:50 PM
16 points
6 comments · 8 min read · LW link
(thezvi.wordpress.com)

Covid 7/21/22: Featuring ASPR

Zvi · Jul 21, 2022, 1:50 PM
27 points
0 comments · 14 min read · LW link
(thezvi.wordpress.com)

[Question] How much to optimize for the short-timelines scenario?

SoerenMind · Jul 21, 2022, 10:47 AM
20 points
3 comments · 1 min read · LW link

Is Gas Green?

ChristianKl · Jul 21, 2022, 10:30 AM
19 points
19 comments · 1 min read · LW link

Why are politicians polarized?

ErnestScribbler · Jul 21, 2022, 8:17 AM
15 points
24 comments · 7 min read · LW link

[AN #173] Recent language model results from DeepMind

Rohin Shah · Jul 21, 2022, 2:30 AM
37 points
9 comments · 8 min read · LW link
(mailchi.mp)

Don’t take the organizational chart literally

lc · Jul 21, 2022, 12:56 AM
54 points
21 comments · 4 min read · LW link

Personal forecasting retrospective: 2020-2022

elifland · Jul 21, 2022, 12:07 AM
35 points
3 comments · 8 min read · LW link
(www.foxy-scout.com)

Defining Optimization in a Deeper Way Part 3

J Bostock · Jul 20, 2022, 10:06 PM
8 points
0 comments · 2 min read · LW link

Cognitive Risks of Adolescent Binge Drinking

Jul 20, 2022, 9:10 PM
70 points
12 comments · 10 min read · LW link
(acesounderglass.com)

Why AGI Timeline Research/Discourse Might Be Overrated

Noosphere89 · Jul 20, 2022, 8:26 PM
5 points
0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Enlightenment Values in a Vulnerable World

Maxwell Tabarrok · Jul 20, 2022, 7:52 PM
15 points
6 comments · 31 min read · LW link
(maximumprogress.substack.com)

Countering arguments against working on AI safety

Rauno Arike · Jul 20, 2022, 6:23 PM
7 points
2 comments · 7 min read · LW link

A Short Intro to Humans

Ben Amitay · Jul 20, 2022, 3:28 PM
1 point
1 comment · 7 min read · LW link

How to Diversify Conceptual Alignment: the Model Behind Refine

adamShimi · Jul 20, 2022, 10:44 AM
87 points
11 comments · 8 min read · LW link

[Question] What are the simplest questions in applied rationality where you don’t know the answer to?

ChristianKl · Jul 20, 2022, 9:53 AM
26 points
11 comments · 1 min read · LW link

AI Safety Cheatsheet / Quick Reference

Zohar Jackson · Jul 20, 2022, 9:39 AM
3 points
0 comments · 1 min read · LW link
(github.com)

Getting Unstuck on Counterfactuals

Chris_Leong · Jul 20, 2022, 5:31 AM
7 points
1 comment · 2 min read · LW link

Pitfalls with Proofs

scasper · Jul 19, 2022, 10:21 PM
19 points
21 comments · 8 min read · LW link

A daily routine I do for my AI safety research work

scasper · Jul 19, 2022, 9:58 PM
21 points
7 comments · 1 min read · LW link

Progress links and tweets, 2022-07-19

jasoncrawford · Jul 19, 2022, 8:50 PM
11 points
1 comment · 1 min read · LW link
(rootsofprogress.org)

Applications are open for CFAR workshops in Prague this fall!

John Steidley · Jul 19, 2022, 6:29 PM
64 points
3 comments · 2 min read · LW link

Sexual Abuse attitudes might be infohazardous

Pseudonymous Otter · Jul 19, 2022, 6:06 PM
256 points
72 comments · 1 min read · LW link

Spending Update 2022

jefftk · Jul 19, 2022, 2:10 PM
28 points
0 comments · 3 min read · LW link
(www.jefftk.com)

Abram Demski’s ELK thoughts and proposal - distillation

Rubi J. Hudson · Jul 19, 2022, 6:57 AM
19 points
8 comments · 16 min read · LW link

Bounded complexity of solving ELK and its implications

Rubi J. Hudson · Jul 19, 2022, 6:56 AM
11 points
4 comments · 18 min read · LW link

Help ARC evaluate capabilities of current language models (still need people)

Beth Barnes · Jul 19, 2022, 4:55 AM
95 points
6 comments · 2 min read · LW link

A Critique of AI Alignment Pessimism

ExCeph · Jul 19, 2022, 2:28 AM
9 points
1 comment · 9 min read · LW link

Ars D&D.Sci: Mysteries of Mana Evaluation & Ruleset

aphyer · Jul 19, 2022, 2:06 AM
33 points
4 comments · 5 min read · LW link

Marburg Virus Pandemic Prediction Checklist

DirectedEvolution · Jul 18, 2022, 11:15 PM
30 points
0 comments · 5 min read · LW link

At what point will we know if Eliezer’s predictions are right or wrong?

anonymous123456 · Jul 18, 2022, 10:06 PM
5 points
6 comments · 1 min read · LW link

Modelling Deception

Garrett Baker · Jul 18, 2022, 9:21 PM
15 points
0 comments · 7 min read · LW link

Are Intelligence and Generality Orthogonal?

cubefox · Jul 18, 2022, 8:07 PM
18 points
16 comments · 1 min read · LW link

Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya Cotra · Jul 18, 2022, 7:06 PM
368 points
95 comments · 75 min read · LW link · 1 review

Turning Some Inconsistent Preferences into Consistent Ones

niplav · Jul 18, 2022, 6:40 PM
23 points
5 comments · 12 min read · LW link

Addendum: A non-magical explanation of Jeffrey Epstein

lc · Jul 18, 2022, 5:40 PM
81 points
21 comments · 11 min read · LW link

Launching a new progress institute, seeking a CEO

jasoncrawford · Jul 18, 2022, 4:58 PM
25 points
2 comments · 3 min read · LW link
(rootsofprogress.org)

Machine Learning Model Sizes and the Parameter Gap [abridged]

Pablo Villalobos · Jul 18, 2022, 4:51 PM
20 points
0 comments · 1 min read · LW link
(epochai.org)

Quantilizers and Generative Models

Adam Jermyn · Jul 18, 2022, 4:32 PM
24 points
5 comments · 4 min read · LW link

AI Hiroshima (Does A Vivid Example Of Destruction Forestall Apocalypse?)

Sable · Jul 18, 2022, 12:06 PM
4 points
4 comments · 2 min read · LW link

How the ---- did Feynman Get Here!?

George3d6 · Jul 18, 2022, 9:43 AM
8 points
8 comments · 3 min read · LW link
(www.epistem.ink)

Conditioning Generative Models for Alignment

Jozdien · Jul 18, 2022, 7:11 AM
59 points
8 comments · 20 min read · LW link

Training goals for large language models

Johannes Treutlein · Jul 18, 2022, 7:09 AM
28 points
5 comments · 19 min read · LW link

A distillation of Evan Hubinger’s training stories (for SERI MATS)

Daphne_W · Jul 18, 2022, 3:38 AM
15 points
1 comment · 10 min read · LW link

Forecasting ML Benchmarks in 2023

jsteinhardt · Jul 18, 2022, 2:50 AM
36 points
20 comments · 12 min read · LW link
(bounded-regret.ghost.io)

What should you change in response to an “emergency”? And AI risk

AnnaSalamon · Jul 18, 2022, 1:11 AM
336 points
60 comments · 6 min read · LW link · 1 review