Logan Riggs

Karma: 3,047

[Simulators seminar sequence] #1 Background & shared assumptions

Jan 2, 2023, 11:48 PM
50 points
4 comments · 3 min read

Results from a survey on tool use and workflows in alignment research

Dec 19, 2022, 3:19 PM
79 points
2 comments · 19 min read

A descriptive, not prescriptive, overview of current AI Alignment Research

Jun 6, 2022, 9:59 PM
139 points
21 comments · 7 min read

Frame for Take-Off Speeds to inform compute governance & scaling alignment

Logan Riggs · May 13, 2022, 10:23 PM
15 points
2 comments · 2 min read

Alignment as Constraints

Logan Riggs · May 13, 2022, 10:07 PM
10 points
0 comments · 2 min read

Make a Movie Showing Alignment Failures

Logan Riggs · Apr 13, 2022, 9:54 PM
75 points
11 comments · 2 min read

Convincing People of Alignment with Street Epistemology

Logan Riggs · Apr 12, 2022, 11:43 PM
54 points
4 comments · 3 min read

Roam Research Mobile is Out!

Logan Riggs · Apr 8, 2022, 7:05 PM
12 points
0 comments · 1 min read

Convincing All Capability Researchers

Logan Riggs · Apr 8, 2022, 5:40 PM
120 points
70 comments · 3 min read

Language Model Tools for Alignment Research

Logan Riggs · Apr 8, 2022, 5:32 PM
28 points
0 comments · 2 min read

5-Minute Advice for EA Global

Logan Riggs · Apr 5, 2022, 10:33 PM
16 points
2 comments · 2 min read

A survey of tool use and workflows in alignment research

Mar 23, 2022, 11:44 PM
45 points
4 comments · 1 min read

Some (potentially) fundable AI Safety Ideas

Logan Riggs · Mar 16, 2022, 12:48 PM
22 points
5 comments · 5 min read

Solving Interpretability Week

Logan Riggs · Dec 13, 2021, 5:09 PM
11 points
5 comments · 1 min read

Solve Corrigibility Week

Logan Riggs · Nov 28, 2021, 5:00 PM
39 points
21 comments · 1 min read

[Question] What Heuristics Do You Use to Think About Alignment Topics?

Logan Riggs · Sep 29, 2021, 2:31 AM
5 points
3 comments · 1 min read

Wanting to Succeed on Every Metric Presented

Logan Riggs · Apr 12, 2021, 8:43 PM
72 points
25 comments · 3 min read

Using GPT-N to Solve Interpretability of Neural Networks: A Research Agenda

Sep 3, 2020, 6:27 PM
68 points
11 comments · 2 min read

[Question] What’s a Decomposable Alignment Topic?

Logan Riggs · Aug 21, 2020, 10:57 PM
26 points
16 comments · 1 min read

Mapping Out Alignment

Aug 15, 2020, 1:02 AM
43 points
0 comments · 5 min read