Agents vs. Predictors: Concrete differentiating factors

evhub · 24 Feb 2023 23:50 UTC
37 points
3 comments · 4 min read · LW link

Christiano (ARC) and GA (Conjecture) Discuss Alignment Cruxes

24 Feb 2023 23:03 UTC
61 points
7 comments · 47 min read · LW link

Retrospective on the 2022 Conjecture AI Discussions

Andrea_Miotti · 24 Feb 2023 22:41 UTC
90 points
5 comments · 2 min read · LW link

How popular is ChatGPT? Part 1: more popular than Taylor Swift

Harlan · 24 Feb 2023 22:30 UTC
56 points
0 comments · 2 min read · LW link
(aiimpacts.org)

Are you stably aligned?

Seth Herd · 24 Feb 2023 22:08 UTC
13 points
0 comments · 2 min read · LW link

Puzzle Cycles

Screwtape · 24 Feb 2023 21:35 UTC
8 points
2 comments · 4 min read · LW link

Sam Altman: “Planning for AGI and beyond”

LawrenceC · 24 Feb 2023 20:28 UTC
104 points
54 comments · 6 min read · LW link
(openai.com)

A Proposed Test to Determine the Extent to Which Large Language Models Understand the Real World

Bruce G · 24 Feb 2023 20:20 UTC
4 points
7 comments · 8 min read · LW link

Meta “open sources” LMs competitive with Chinchilla, PaLM, and code-davinci-002 (Paper)

LawrenceC · 24 Feb 2023 19:57 UTC
38 points
19 comments · 1 min read · LW link
(research.facebook.com)

Relationship Orientations

DaystarEld · 24 Feb 2023 19:43 UTC
37 points
1 comment · 3 min read · LW link
(daystareld.com)

The alien simulation meme doesn’t make sense

FTPickle · 24 Feb 2023 19:27 UTC
4 points
1 comment · 1 min read · LW link

Exit Duty Generator by Matti Häyry

Oldphan · 24 Feb 2023 18:35 UTC
−2 points
0 comments · 1 min read · LW link
(www.cambridge.org)

2023 Stanford Existential Risks Conference

elizabethcooper · 24 Feb 2023 18:35 UTC
7 points
0 comments · 1 min read · LW link

How major governments can help with the most important century

HoldenKarnofsky · 24 Feb 2023 18:20 UTC
29 points
0 comments · 4 min read · LW link
(www.cold-takes.com)

Consent Isn’t Always Enough

jefftk · 24 Feb 2023 15:40 UTC
57 points
16 comments · 3 min read · LW link
(www.jefftk.com)

[Question] Training for corrigability: obvious problems?

Ben Amitay · 24 Feb 2023 14:02 UTC
4 points
6 comments · 1 min read · LW link

Death and Desperation

Ustice · 24 Feb 2023 12:43 UTC
1 point
3 comments · 1 min read · LW link

[Question] Are there rationality techniques similar to staring at the wall for 4 hours?

trevor · 24 Feb 2023 11:48 UTC
31 points
8 comments · 1 min read · LW link

The fast takeoff motte/bailey

lc · 24 Feb 2023 7:11 UTC
0 points
7 comments · 1 min read · LW link

AGI systems & humans will both need to solve the alignment problem

Jeffrey Ladish · 24 Feb 2023 3:29 UTC
59 points
14 comments · 4 min read · LW link

A poor but certain attempt to philosophically undermine the orthogonality of intelligence and aims

Jay95 · 24 Feb 2023 3:03 UTC
−2 points
1 comment · 1 min read · LW link

I wanna Gandalf here

Igor Timofeev · 24 Feb 2023 1:22 UTC
5 points
4 comments · 1 min read · LW link

[Link] A community alert about Ziz

DanielFilan · 24 Feb 2023 0:06 UTC
169 points
131 comments · 2 min read · LW link · 3 reviews
(medium.com)

Teleosemantics!

abramdemski · 23 Feb 2023 23:26 UTC
82 points
27 comments · 6 min read · LW link · 1 review

AI that shouldn’t work, yet kind of does

Donald Hobson · 23 Feb 2023 23:18 UTC
27 points
8 comments · 3 min read · LW link

The AGI Optimist’s Dilemma

kaputmi · 23 Feb 2023 20:20 UTC
−6 points
1 comment · 1 min read · LW link

Searching for a model’s concepts by their shape – a theoretical framework

23 Feb 2023 20:14 UTC
51 points
0 comments · 19 min read · LW link

Why I’m Skeptical of De-Extinction

Niko_McCarty · 23 Feb 2023 19:42 UTC
16 points
1 comment · 11 min read · LW link
(cell.substack.com)

[Question] What causes randomness?

lotsofquestions · 23 Feb 2023 18:50 UTC
1 point
12 comments · 1 min read · LW link

Somerville Roads Getting More Dangerous?

jefftk · 23 Feb 2023 18:20 UTC
15 points
1 comment · 1 min read · LW link
(www.jefftk.com)

EIS XII: Summary

scasper · 23 Feb 2023 17:45 UTC
18 points
0 comments · 6 min read · LW link

How to survive in an AGI cataclysm

RomanS · 23 Feb 2023 14:34 UTC
−4 points
3 comments · 4 min read · LW link

Covid 2/23/23: Your Best Possible Situation

Zvi · 23 Feb 2023 13:10 UTC
92 points
9 comments · 5 min read · LW link
(thezvi.wordpress.com)

Full Transcript: Eliezer Yudkowsky on the Bankless podcast

23 Feb 2023 12:34 UTC
138 points
89 comments · 75 min read · LW link

Automated Sandwiching & Quantifying Human-LLM Cooperation: ScaleOversight hackathon results

23 Feb 2023 10:48 UTC
8 points
0 comments · 6 min read · LW link

[Question] How to estimate a pre-aligned value for a common discussion ground?

EL_File4138 · 23 Feb 2023 10:38 UTC
−4 points
12 comments · 1 min read · LW link

Interpersonal alignment intuitions

TekhneMakre · 23 Feb 2023 9:37 UTC
29 points
18 comments · 2 min read · LW link

Big Mac Subsidy?

jefftk · 23 Feb 2023 4:00 UTC
157 points
25 comments · 2 min read · LW link
(www.jefftk.com)

[Question] What moral systems (e.g utilitarianism) are common among LessWrong users?

hollowing · 23 Feb 2023 3:33 UTC
1 point
9 comments · 1 min read · LW link

AGI is likely to be cautious

PonPonPon · 23 Feb 2023 1:16 UTC
9 points
14 comments · 3 min read · LW link

Short Notes on Research Process

Shoshannah Tekofsky · 22 Feb 2023 23:41 UTC
21 points
0 comments · 2 min read · LW link

Video/animation: Neel Nanda explains what mechanistic interpretability is

DanielFilan · 22 Feb 2023 22:42 UTC
24 points
7 comments · 1 min read · LW link
(youtu.be)

A Telepathic Exam about AI and Consequentialism

alkexr · 22 Feb 2023 21:00 UTC
4 points
4 comments · 4 min read · LW link

[Question] Injecting noise to GPT to get multiple answers

bipolo · 22 Feb 2023 20:02 UTC
1 point
1 comment · 1 min read · LW link

EIS XI: Moving Forward

scasper · 22 Feb 2023 19:05 UTC
19 points
2 comments · 9 min read · LW link

Building and Entertaining Couples

Jacob Falkovich · 22 Feb 2023 19:02 UTC
85 points
11 comments · 4 min read · LW link

Can submarines swim?

jasoncrawford · 22 Feb 2023 18:48 UTC
18 points
14 comments · 13 min read · LW link
(rootsofprogress.org)

Is there a ML agent that abandons it’s utility function out-of-distribution without losing capabilities?

Christopher King · 22 Feb 2023 16:49 UTC
1 point
7 comments · 1 min read · LW link

The male AI alignment solution

TekhneMakre · 22 Feb 2023 16:34 UTC
−25 points
24 comments · 1 min read · LW link

Progress links and tweets, 2023-02-22

jasoncrawford · 22 Feb 2023 16:23 UTC
13 points
0 comments · 1 min read · LW link
(rootsofprogress.org)