Inward and outward steelmanning

Q Home Jul 14, 2022, 11:32 PM
13 points
6 comments 18 min read LW link

Potato diet: A post mortem and an answer to SMTM’s article

Épiphanie Gédéon Jul 14, 2022, 11:18 PM
48 points
34 comments 16 min read LW link

Proposed Orthogonality Theses #2-5

rjbg Jul 14, 2022, 10:59 PM
8 points
0 comments 2 min read LW link

Better Quiddler

jefftk Jul 14, 2022, 5:40 PM
17 points
0 comments 1 min read LW link
(www.jefftk.com)

Circumventing interpretability: How to defeat mind-readers

Lee Sharkey Jul 14, 2022, 4:59 PM
114 points
15 comments 33 min read LW link

Covid 7/14/22: BA.2.75 Plus Tax

Zvi Jul 14, 2022, 2:40 PM
39 points
9 comments 8 min read LW link
(thezvi.wordpress.com)

Criticism of EA Criticism Contest

Zvi Jul 14, 2022, 2:30 PM
108 points
17 comments 31 min read LW link 1 review
(thezvi.wordpress.com)

Humans provide an untapped wealth of evidence about alignment

Jul 14, 2022, 2:31 AM
211 points
94 comments 9 min read LW link 1 review

[Question] Wacky, risky, anti-inductive intelligence-enhancement methods?

Nicholas / Heather Kross Jul 14, 2022, 1:40 AM
20 points
27 comments 1 min read LW link

[Question] How to impress students with recent advances in ML?

Charbel-Raphaël Jul 14, 2022, 12:03 AM
12 points
2 comments 1 min read LW link

Notes on Love

David Gross Jul 13, 2022, 11:35 PM
18 points
3 comments 29 min read LW link

Deep learning curriculum for large language model alignment

Jacob_Hilton Jul 13, 2022, 9:58 PM
57 points
3 comments 1 min read LW link
(github.com)

Artificial Sandwiching: When can we test scalable alignment protocols without humans?

Sam Bowman Jul 13, 2022, 9:14 PM
42 points
6 comments 5 min read LW link

[Question] Any tips for eliciting one’s own latent knowledge?

MSRayne Jul 13, 2022, 9:12 PM
16 points
20 comments 2 min read LW link

Goal Alignment Is Robust To the Sharp Left Turn

Thane Ruthenis Jul 13, 2022, 8:23 PM
43 points
16 comments 4 min read LW link

Making decisions using multiple worldviews

Richard_Ngo Jul 13, 2022, 7:15 PM
50 points
10 comments 11 min read LW link

[Question] App idea to help with reading STEM textbooks (feedback request)

DirectedEvolution Jul 13, 2022, 6:28 PM
16 points
8 comments 2 min read LW link

MIRI Conversations: Technology Forecasting & Gradualism (Distillation)

CallumMcDougall Jul 13, 2022, 3:55 PM
31 points
1 comment 20 min read LW link

Passing Up Pay

jefftk Jul 13, 2022, 2:10 PM
29 points
8 comments 5 min read LW link
(www.jefftk.com)

[Question] How could the universe be infinitely large?

amarai Jul 13, 2022, 1:45 PM
0 points
8 comments 1 min read LW link

John von Neumann on how to safely progress with technology

Dalton Mabery Jul 13, 2022, 11:07 AM
14 points
0 comments 1 min read LW link

Everyone is an Imposter

Tharin Jul 13, 2022, 8:46 AM
19 points
1 comment 9 min read LW link
(echoesandchimes.com)

[Question] Which AI Safety research agendas are the most promising?

Chris_Leong Jul 13, 2022, 7:54 AM
27 points
5 comments 1 min read LW link

Straw-Steelmanning

Chris van Merwijk Jul 13, 2022, 5:48 AM
29 points
2 comments 1 min read LW link

Alien Message Contest: Solution

DaemonicSigil Jul 13, 2022, 4:07 AM
29 points
2 comments 4 min read LW link

[Question] What is wrong with this approach to corrigibility?

Rafael Cosman Jul 12, 2022, 10:55 PM
7 points
8 comments 1 min read LW link

Acceptability Verification: A Research Agenda

Jul 12, 2022, 8:11 PM
50 points
0 comments 1 min read LW link
(docs.google.com)

Progress links and tweets, 2022-07-12

jasoncrawford Jul 12, 2022, 3:30 PM
12 points
0 comments 1 min read LW link
(rootsofprogress.org)

Response to Blake Richards: AGI, generality, alignment, & loss functions

Steven Byrnes Jul 12, 2022, 1:56 PM
62 points
9 comments 15 min read LW link

Three Minimum Pivotal Acts Possible by Narrow AI

Michael Soareverix Jul 12, 2022, 9:51 AM
0 points
4 comments 2 min read LW link

Mosaic and Palimpsests: Two Shapes of Research

adamShimi Jul 12, 2022, 9:05 AM
39 points
3 comments 9 min read LW link

[Question] How do you concisely communicate & navigate the politics / culture at your job working at a large corporation or institution?

Willa Jul 12, 2022, 3:22 AM
10 points
6 comments 1 min read LW link

On how various plans miss the hard bits of the alignment challenge

So8res Jul 12, 2022, 2:49 AM
305 points
89 comments 29 min read LW link 3 reviews

Rainmaking

WalterL Jul 12, 2022, 12:42 AM
26 points
5 comments 1 min read LW link
(www.youtube.com)

Book Review: Neal Stephenson’s “Termination Shock”

Tyler Simmons Jul 12, 2022, 12:07 AM
13 points
0 comments 30 min read LW link
(www.words-and-dirt.com)

Announcing Future Forum—Apply Now

Jul 11, 2022, 10:57 PM
8 points
0 comments 4 min read LW link
(forum.effectivealtruism.org)

Defining Optimization in a Deeper Way Part 2

J Bostock Jul 11, 2022, 8:29 PM
7 points
0 comments 4 min read LW link

Marriage, the Giving What We Can Pledge, and the damage caused by vague public commitments

Jeffrey Ladish Jul 11, 2022, 7:38 PM
98 points
27 comments 6 min read LW link 1 review

Systemization

CFAR!Duncan Jul 11, 2022, 6:39 PM
42 points
5 comments 12 min read LW link

[Question] How do AI timelines affect how you live your life?

Quadratic Reciprocity Jul 11, 2022, 1:54 PM
80 points
50 comments 1 min read LW link

Cambridge LW Meetup: Free Speech

Darmani Jul 11, 2022, 4:36 AM
7 points
0 comments 1 min read LW link

Checksum Sensor Alignment

lsusr Jul 11, 2022, 3:31 AM
12 points
2 comments 1 min read LW link

The Alignment Problem

lsusr Jul 11, 2022, 3:03 AM
46 points
18 comments 3 min read LW link

Immanuel Kant and the Decision Theory App Store

Daniel Kokotajlo Jul 10, 2022, 4:04 PM
92 points
12 comments 5 min read LW link

Metaculus is seeking experienced leaders, researchers & operators for high-impact roles

ChristianWilliams Jul 10, 2022, 2:27 PM
9 points
0 comments 1 min read LW link
(apply.workable.com)

Avoid the abbreviation “FLOPs” – use “FLOP” or “FLOP/s” instead

Daniel_Eth Jul 10, 2022, 10:44 AM
70 points
13 comments 1 min read LW link

My Opportunity Costs

abstractapplic Jul 10, 2022, 10:14 AM
22 points
3 comments 3 min read LW link

Why Portland

Adam Zerner Jul 10, 2022, 7:20 AM
25 points
18 comments 9 min read LW link

Hessian and Basin volume

Vivek Hebbar Jul 10, 2022, 6:59 AM
35 points
10 comments 4 min read LW link

Taste & Shaping

CFAR!Duncan Jul 10, 2022, 5:50 AM
67 points
1 comment 16 min read LW link