Ilya: The AI scientist shaping the world

David Varga
Nov 20, 2023, 1:09 PM
11 points
0 comments
4 min read

[Linkpost] OpenAI’s Interim CEO’s views on AI x-risk

Bogdan Ionut Cirstea
Nov 20, 2023, 1:00 PM
9 points
0 comments
1 min read

A Girardian interpretation of the Altman affair, it’s on my to-do list

Bill Benzon
Nov 20, 2023, 12:21 PM
3 points
0 comments
1 min read

[Question] How did you integrate voice-to-text AI into your workflow?

ChristianKl
Nov 20, 2023, 12:01 PM
28 points
12 comments
1 min read

Short film adaptation of the essay “The Simple Truth” [eng sub]

bayesyatina
Nov 20, 2023, 11:42 AM
15 points
4 comments
1 min read

For Civilization and Against Niceness

Gabriel Alfour
Nov 20, 2023, 10:56 AM
46 points
14 comments
8 min read
(cognition.cafe)

“Optimists always win!” is the biggest survivorship bias

Yunfan Ye
Nov 20, 2023, 8:53 AM
8 points
0 comments
2 min read

Sam Altman, Greg Brockman and others from OpenAI join Microsoft

Ozyrus
Nov 20, 2023, 8:23 AM
58 points
15 comments
1 min read
(twitter.com)

Emmett Shear to be interim CEO of OpenAI

Max H
Nov 20, 2023, 5:40 AM
21 points
5 comments
1 min read
(www.theverge.com)

[Question] Where can I learn about algorithmic transformation of AI prompts?

denyeverywhere
Nov 20, 2023, 4:35 AM
0 points
1 comment
1 min read

Extreme website and app blocking

tbenthompson
Nov 20, 2023, 3:53 AM
7 points
0 comments
4 min read
(tbenthompson.com)

Am I going insane or is the quality of education at top universities shockingly low?

ChrisRumanov
Nov 20, 2023, 3:53 AM
26 points
30 comments
2 min read

Residential Demolition Tooling

jefftk
Nov 20, 2023, 3:20 AM
16 points
1 comment
3 min read
(www.jefftk.com)

Aaron Silverbook on anti-cavity bacteria

DanielFilan
Nov 20, 2023, 3:06 AM
31 points
3 comments
1 min read
(youtu.be)

Cheap Model → Big Model design

Maxwell Peterson
Nov 19, 2023, 10:50 PM
15 points
2 comments
7 min read

Human-like systematic generalization through a meta-learning neural network

Burny
Nov 19, 2023, 9:41 PM
7 points
0 comments
2 min read
(twitter.com)

“Benevolent [ie, Ruler] AI is a bad idea” and a suggested alternative

the gears to ascension
Nov 19, 2023, 8:22 PM
22 points
11 comments
1 min read
(www.palladiummag.com)

Alignment is Hard: An Uncomputable Alignment Problem

Alexander Bistagne
Nov 19, 2023, 7:38 PM
−5 points
4 comments
1 min read
(github.com)

New paper shows truthfulness & instruction-following don’t generalize by default

joshc
Nov 19, 2023, 7:27 PM
60 points
0 comments
4 min read

In favour of a sovereign state of Gaza

Yair Halberstadt
Nov 19, 2023, 4:08 PM
8 points
3 comments
4 min read

My Criticism of Singular Learning Theory

Joar Skalse
Nov 19, 2023, 3:19 PM
83 points
56 comments
12 min read

“Why can’t you just turn it off?”

Roko
Nov 19, 2023, 2:46 PM
48 points
25 comments
1 min read

Spaciousness In Partner Dance: A Naturalism Demo

LoganStrohl
Nov 19, 2023, 7:00 AM
78 points
6 comments
19 min read
1 review

Altman firing retaliation incoming?

trevor
Nov 19, 2023, 12:10 AM
50 points
23 comments
5 min read

When Will AIs Develop Long-Term Planning?

PeterMcCluskey
Nov 19, 2023, 12:08 AM
18 points
5 comments
4 min read
(bayesianinvestor.com)

Killswitch

Junio
Nov 18, 2023, 10:53 PM
2 points
0 comments
3 min read

Superalignment

Douglas_Reay
Nov 18, 2023, 10:37 PM
−4 points
4 comments
1 min read
(openai.com)

Predictable Defect-Cooperate?

quetzal_rainbow
Nov 18, 2023, 3:38 PM
7 points
1 comment
2 min read

I think I’m just confused. Once a model exists, how do you “red-team” it to see whether it’s safe. Isn’t it already dangerous?

FTPickle
Nov 18, 2023, 2:16 PM
21 points
13 comments
1 min read

AI Safety Camp 2024

Linda Linsefors
Nov 18, 2023, 10:37 AM
15 points
1 comment
4 min read
(aisafety.camp)

Post-EAG Music Party

jefftk
Nov 18, 2023, 3:00 AM
14 points
2 comments
2 min read
(www.jefftk.com)

Letter to a Sonoma County Jail Cell

MadHatter
Nov 18, 2023, 2:24 AM
9 points
1 comment
1 min read
(open.substack.com)

1. A Sense of Fairness: Deconfusing Ethics

RogerDearnaley
Nov 17, 2023, 8:55 PM
16 points
8 comments
15 min read

Sam Altman fired from OpenAI

LawrenceC
Nov 17, 2023, 8:42 PM
192 points
75 comments
1 min read
(openai.com)

On the lethality of biased human reward ratings

Nov 17, 2023, 6:59 PM
48 points
10 comments
37 min read

Coup probes: Catching catastrophes with probes trained off-policy

Fabien Roger
Nov 17, 2023, 5:58 PM
85 points
9 comments
11 min read
1 review

On Lies and Liars

Gabriel Alfour
Nov 17, 2023, 5:13 PM
33 points
4 comments
14 min read
(cognition.cafe)

Classifying representations of sparse autoencoders (SAEs)

Annah
Nov 17, 2023, 1:54 PM
15 points
6 comments
2 min read

R&D is a Huge Externality, So Why Do Markets Do So Much of it?

Maxwell Tabarrok
Nov 17, 2023, 1:14 PM
15 points
14 comments
3 min read
(maximumprogress.substack.com)

On excluding dangerous information from training

ShayBenMoshe
Nov 17, 2023, 11:14 AM
23 points
5 comments
3 min read

The dangers of reproducing while old

garymm
Nov 17, 2023, 5:55 AM
23 points
6 comments
1 min read
(www.garymm.org)

I put odds on ends with Nathan Young

KatjaGrace
Nov 17, 2023, 5:40 AM
8 points
0 comments
1 min read
(worldspiritsockpuppet.com)

Debate helps supervise human experts [Paper]

habryka
Nov 17, 2023, 5:25 AM
29 points
6 comments
1 min read
(github.com)

A to Z of things

KatjaGrace
Nov 17, 2023, 5:20 AM
71 points
8 comments
1 min read
1 review
(worldspiritsockpuppet.com)

On Tapping Out

Screwtape
Nov 17, 2023, 3:23 AM
50 points
14 comments
8 min read
1 review

Eliciting Latent Knowledge in Comprehensive AI Services Models

acabodi
Nov 17, 2023, 2:36 AM
6 points
0 comments
5 min read

Some Rules for an Algebra of Bayes Nets

Nov 16, 2023, 11:53 PM
77 points
38 comments
14 min read
1 review

How much to update on recent AI governance moves?

Nov 16, 2023, 11:46 PM
112 points
5 comments
29 min read

New LessWrong feature: Dialogue Matching

jacobjacob
Nov 16, 2023, 9:27 PM
106 points
22 comments
3 min read

Towards Evaluating AI Systems for Moral Status Using Self-Reports

Nov 16, 2023, 8:18 PM
45 points
3 comments
1 min read
(arxiv.org)