Doing oversight from the very start of training seems hard

peterbarnett
Sep 20, 2022, 5:21 PM
14 points
3 comments
3 min read

$13,000 of prizes for changing our mind about who to fund (Clearer Thinking Regrants Forecasting Tournament)

spencerg
Sep 20, 2022, 4:06 PM
14 points
3 comments
1 min read
(manifold.markets)

Progress links and tweets, 2022-09-20

jasoncrawford
Sep 20, 2022, 2:07 PM
7 points
1 comment
1 min read
(rootsofprogress.org)

[Question] If we have Human-level chatbots, won’t we end up being ruled by possible people?

Erlja Jkdf.
Sep 20, 2022, 1:59 PM
5 points
13 comments
1 min read

Twitter Polls: Evidence is Evidence

Zvi
Sep 20, 2022, 12:30 PM
34 points
8 comments
7 min read
(thezvi.wordpress.com)

Some of the most important entrepreneurship skills are tacit knowledge

Ruhul
Sep 20, 2022, 12:06 PM
20 points
0 comments
7 min read

Character alignment

p.b.
Sep 20, 2022, 8:27 AM
22 points
0 comments
2 min read

Losing the root for the tree

Adam Zerner
Sep 20, 2022, 4:53 AM
482 points
31 comments
9 min read
1 review

Failed Adventures in Delay

jefftk
Sep 20, 2022, 2:20 AM
8 points
0 comments
2 min read
(www.jefftk.com)

Gene drives: why the wait?

Metacelsus
Sep 19, 2022, 11:37 PM
125 points
50 comments
3 min read
(denovo.substack.com)

Prize idea: Transmit MIRI and Eliezer’s worldviews

elifland
Sep 19, 2022, 9:21 PM
47 points
18 comments
2 min read

Rationality Dojo Berlin Handout

UnplannedCauliflower
Sep 19, 2022, 8:11 PM
19 points
0 comments
7 min read

A noob goes to the SERI MATS presentations

Lowell Dennings
Sep 19, 2022, 5:35 PM
27 points
0 comments
5 min read

Do bamboos set themselves on fire?

Malmesbury
Sep 19, 2022, 3:34 PM
170 points
14 comments
6 min read
1 review

Cambridge LW Meetup: Authentic Relating Games

Tony Wang
Sep 19, 2022, 2:51 PM
1 point
0 comments
1 min read

PIBBSS (AI alignment) is hiring for a Project Manager

Nora_Ammann
Sep 19, 2022, 1:54 PM
9 points
0 comments
1 min read

Quintin’s alignment papers roundup - week 2

Quintin Pope
Sep 19, 2022, 1:41 PM
67 points
2 comments
10 min read

Some notes on solving hard problems

Joe Rocca
Sep 19, 2022, 12:58 PM
50 points
8 comments
29 min read

Safety timelines: How long will it take to solve alignment?

Sep 19, 2022, 12:53 PM
37 points
7 comments
6 min read
(forum.effectivealtruism.org)

Belgrade, Serbia - LW Meetup

игорь тимофеев
Sep 19, 2022, 12:47 PM
3 points
0 comments
1 min read

The ELK Framing I’ve Used

sudo
Sep 19, 2022, 10:28 AM
5 points
1 comment
1 min read

Quick Book Review: Crucial Conversations

Gordon Seidoh Worley
Sep 19, 2022, 6:25 AM
28 points
2 comments
2 min read

How my team at Lightcone sometimes gets stuff done

Bird Concept
Sep 19, 2022, 5:47 AM
193 points
43 comments
7 min read
1 review

EA & LW Forums Weekly Summary (12 - 18 Sep ’22)

Zoe Williams
Sep 19, 2022, 5:08 AM
11 points
0 comments
13 min read

Book Swap

Screwtape
Sep 19, 2022, 2:33 AM
11 points
0 comments
2 min read

Pretending not to Notice

jefftk
Sep 19, 2022, 2:30 AM
46 points
12 comments
2 min read
(www.jefftk.com)

[To Be Revised] Perhaps the Meaning of Life, An Adventure in Pluralistic Morality

NoBadCake
Sep 18, 2022, 10:37 PM
−5 points
3 comments
4 min read

Leveraging Legal Informatics to Align AI

John Nay
Sep 18, 2022, 8:39 PM
11 points
0 comments
3 min read
(forum.effectivealtruism.org)

The Inter-Agent Facet of AI Alignment

Michael Oesterle
Sep 18, 2022, 8:39 PM
12 points
1 comment
5 min read

Biden should be applauded for appointing Renee Wegrzyn for ARPA-H

ChristianKl
Sep 18, 2022, 7:57 PM
34 points
0 comments
2 min read

Summaries: Alignment Fundamentals Curriculum

Leon Lang
Sep 18, 2022, 1:08 PM
44 points
3 comments
1 min read
(docs.google.com)

Inner alignment: what are we pointing at?

lemonhope
Sep 18, 2022, 11:09 AM
14 points
2 comments
1 min read

Podcasts on surveys, slower AI, AI arguments, etc

KatjaGrace
Sep 18, 2022, 7:30 AM
13 points
0 comments
1 min read
(worldspiritsockpuppet.com)

There is no royal road to alignment

Eleni Angelou
Sep 18, 2022, 3:33 AM
4 points
2 comments
3 min read

[Question] Updates on FLI’s Value Alignment Map?

T431
Sep 17, 2022, 10:27 PM
17 points
4 comments
1 min read

Most sensible abstraction & feature set for a systems language?

Jasen Qin
Sep 17, 2022, 7:49 PM
0 points
5 comments
10 min read

Sparse trinary weighted RNNs as a path to better language model interpretability

Am8ryllis
Sep 17, 2022, 7:48 PM
19 points
13 comments
3 min read

Apply for mentorship in AI Safety field-building

Orpheus16
Sep 17, 2022, 7:06 PM
9 points
0 comments
1 min read
(forum.effectivealtruism.org)

Refine’s Third Blog Post Day/Week

adamShimi
Sep 17, 2022, 5:03 PM
18 points
0 comments
1 min read

[Closed] Prize and fast track to alignment research at ALTER

Vanessa Kosoy
Sep 17, 2022, 4:58 PM
63 points
8 comments
3 min read

Remote Login For Turnkey Devices?

jefftk
Sep 17, 2022, 3:40 PM
9 points
2 comments
2 min read
(www.jefftk.com)

Many therapy schools work with inner multiplicity (not just IFS)

Sep 17, 2022, 10:27 AM
52 points
16 comments
18 min read

Should AI learn human values, human norms or something else?

Q Home
Sep 17, 2022, 6:19 AM
5 points
1 comment
4 min read

Takeaways from our robust injury classifier project [Redwood Research]

dmz
Sep 17, 2022, 3:55 AM
143 points
12 comments
6 min read
1 review

[Question] Why doesn’t China (or didn’t anyone) encourage/mandate elastomeric respirators to control COVID?

Wei Dai
Sep 17, 2022, 3:07 AM
34 points
15 comments
1 min read

Emergency Residential Solar Jury-Rigging

jefftk
Sep 17, 2022, 2:30 AM
34 points
0 comments
3 min read
(www.jefftk.com)

A Bite Sized Introduction to ELK

Luk27182
Sep 17, 2022, 12:28 AM
5 points
0 comments
6 min read

D&D.Sci September 2022: The Allocation Helm

abstractapplic
Sep 16, 2022, 11:10 PM
34 points
34 comments
1 min read

Towards a philosophy of safety

jasoncrawford
Sep 16, 2022, 9:10 PM
12 points
2 comments
8 min read
(rootsofprogress.org)

Refine Blogpost Day #3: The shortforms I did write

Alexander Gietelink Oldenziel
Sep 16, 2022, 9:03 PM
23 points
0 comments
1 min read