Don’t align agents to evaluations of plans

TurnTrout
Nov 26, 2022, 9:16 PM
48 points
49 comments, 18 min read, LW link

What videos should Rational Animations make?

Writer
Nov 26, 2022, 8:28 PM
30 points
24 comments, 1 min read, LW link

The First Filter

Nov 26, 2022, 7:37 PM
67 points
5 comments, 1 min read, LW link

Respecting your Local Preferences

Scott Garrabrant
Nov 26, 2022, 7:04 PM
73 points
1 comment, 4 min read, LW link

[Question] Opinions on the sleep synaptic homeostasis hypothesis?

Angela Pretorius
Nov 26, 2022, 7:01 PM
3 points
0 comments, 1 min read, LW link

Why square errors?

Aprillion
Nov 26, 2022, 1:40 PM
41 points
11 comments, 2 min read, LW link

[Question] Assuming that at least one religion is true, what would you expect it to be?

risedive
Nov 26, 2022, 8:34 AM
−9 points
9 comments, 1 min read, LW link

Three Alignment Schemas & Their Problems

Shoshannah Tekofsky
Nov 26, 2022, 4:25 AM
19 points
1 comment, 6 min read, LW link

The many types of blog posts

Adam Zerner
Nov 26, 2022, 3:57 AM
10 points
2 comments, 4 min read, LW link

New Frontiers in Mojibake

Adam Scherlis
Nov 26, 2022, 2:37 AM
60 points
7 comments, 6 min read, LW link, 1 review
(adam.scherlis.com)

Semi-conductor/AI Stock Discussion.

sapphire
Nov 25, 2022, 11:35 PM
28 points
25 comments, 1 min read, LW link

NEFFA Should Allow Small Children

jefftk
Nov 25, 2022, 11:00 PM
10 points
2 comments, 2 min read, LW link
(www.jefftk.com)

Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas

Akash
Nov 25, 2022, 8:47 PM
37 points
2 comments, 9 min read, LW link

The man and the tool

pedroalvarado
Nov 25, 2022, 7:51 PM
−1 points
0 comments, 4 min read, LW link

[Question] What AI newsletters or substacks about AI do you recommend?

wunan
Nov 25, 2022, 7:29 PM
6 points
1 comment, 1 min read, LW link

Mechanistic anomaly detection and ELK

paulfchristiano
Nov 25, 2022, 6:50 PM
135 points
22 comments, 21 min read, LW link
(ai-alignment.com)

The Least Controversial Application of Geometric Rationality

Scott Garrabrant
Nov 25, 2022, 4:50 PM
60 points
22 comments, 4 min read, LW link

Planes are still decades away from displacing most bird jobs

guzey
Nov 25, 2022, 4:49 PM
166 points
13 comments, 3 min read, LW link

Take part in our giant study of cognitive abilities and get a customized report of your strengths and weaknesses!

spencerg
Nov 25, 2022, 4:28 PM
8 points
1 comment, 1 min read, LW link
(www.guidedtrack.com)

Guardian AI (Misaligned systems are all around us.)

Jessica Rumbelow
Nov 25, 2022, 3:55 PM
15 points
6 comments, 2 min read, LW link

Intuitions by ML researchers may get progressively worse concerning likely candidates for transformative AI

Viktor Rehnberg
Nov 25, 2022, 3:49 PM
7 points
0 comments, 2 min read, LW link

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques

Nov 25, 2022, 2:36 PM
39 points
9 comments, 6 min read, LW link
(vkrakovna.wordpress.com)

[Question] Who holds all the USDT?

ChristianKl
Nov 25, 2022, 11:58 AM
17 points
6 comments, 1 min read, LW link

Fair Collective Efficient Altruism

Jobst Heitzig
Nov 25, 2022, 9:38 AM
2 points
1 comment, 5 min read, LW link

[Question] If humanity one day discovers that it is a form of disease that threatens to destroy the universe, should it allow itself to be shut down?

Shmi
Nov 25, 2022, 8:27 AM
4 points
12 comments, 1 min read, LW link

Could a single alien message destroy us?

Nov 25, 2022, 7:32 AM
61 points
23 comments, 6 min read, LW link
(youtu.be)

How do I start a programming career in the West?

Lao Mein
Nov 25, 2022, 6:37 AM
38 points
7 comments, 2 min read, LW link

The AI Safety community has four main work groups, Strategy, Governance, Technical and Movement Building

peterslattery
Nov 25, 2022, 3:45 AM
1 point
0 comments, 6 min read, LW link

Less Successful Cider Adventures

jefftk
Nov 25, 2022, 1:50 AM
11 points
1 comment, 1 min read, LW link
(www.jefftk.com)

Gliders in Language Models

Alexandre Variengien
Nov 25, 2022, 12:38 AM
30 points
11 comments, 10 min read, LW link

On Kelly and altruism

philh
Nov 24, 2022, 11:40 PM
17 points
6 comments, 12 min read, LW link
(reasonableapproximation.net)

Open technical problem: A Quinean proof of Löb’s theorem, for an easier cartoon guide

Andrew_Critch
Nov 24, 2022, 9:16 PM
58 points
35 comments, 3 min read, LW link, 1 review

[Question] Historical examples of people gaining unusual cognitive abilities?

Nicholas / Heather Kross
Nov 24, 2022, 7:01 PM
8 points
2 comments, 1 min read, LW link

Corrigibility Via Thought-Process Deference

Thane Ruthenis
Nov 24, 2022, 5:06 PM
17 points
5 comments, 9 min read, LW link

Geometric Exploration, Arithmetic Exploitation

Scott Garrabrant
Nov 24, 2022, 3:36 PM
126 points
5 comments, 7 min read, LW link

What I Learned Running Refine

adamShimi
Nov 24, 2022, 2:49 PM
108 points
5 comments, 4 min read, LW link

Covid 11/24/22: Thanks for Good Health

Zvi
Nov 24, 2022, 1:00 PM
26 points
4 comments, 8 min read, LW link
(thezvi.wordpress.com)

[Question] Dumb and ill-posed question: Is conceptual research like this MIRI paper on the shutdown problem/Corrigibility “real”

joraine
Nov 24, 2022, 5:08 AM
25 points
11 comments, 1 min read, LW link

Clarifying wireheading terminology

leogao
Nov 24, 2022, 4:53 AM
66 points
6 comments, 1 min read, LW link

LW Beta Feature: Side-Comments

jimrandomh
Nov 24, 2022, 1:55 AM
103 points
47 comments, 1 min read, LW link

Against “Classic Style”

Cleo Nardo
Nov 23, 2022, 10:10 PM
67 points
30 comments, 4 min read, LW link

South Bay ACX/LW Meetup

IS
Nov 23, 2022, 10:05 PM
2 points
0 comments, 1 min read, LW link

Meme Dialects

jefftk
Nov 23, 2022, 9:30 PM
26 points
1 comment, 2 min read, LW link
(www.jefftk.com)

[Question] When do you visualize (or not) while doing math?

Alex_Altair
Nov 23, 2022, 8:15 PM
20 points
9 comments, 1 min read, LW link

When AI solves a game, focus on the game’s mechanics, not its theme.

Cleo Nardo
Nov 23, 2022, 7:16 PM
89 points
7 comments, 2 min read, LW link

The Geometric Expectation

Scott Garrabrant
Nov 23, 2022, 6:05 PM
151 points
21 comments, 4 min read, LW link

“Far Coordination”

DragonGod
Nov 23, 2022, 5:14 PM
6 points
17 comments, 9 min read, LW link

Conjecture Second Hiring Round

23 Nov 2022 17:11 UTC
92 points
0 comments, 1 min read, LW link

Conjecture: a retrospective after 8 months of work

23 Nov 2022 17:10 UTC
180 points
9 comments, 8 min read, LW link

Against a General Factor of Doom

Jeffrey Heninger
23 Nov 2022 16:50 UTC
61 points
19 comments, 4 min read, LW link, 1 review
(aiimpacts.org)