ChatGPT tells stories about XP-708-DQ, Eliezer, dragons, dark sorceresses, and unaligned robots becoming aligned

Bill Benzon · 8 Jan 2023 23:21 UTC
6 points
2 comments · 18 min read · LW link

Simulacra are Things

janus · 8 Jan 2023 23:03 UTC
63 points
7 comments · 2 min read · LW link

[Question] GPT learning from smarter texts?

Viliam · 8 Jan 2023 22:23 UTC
26 points
7 comments · 1 min read · LW link

Latent variable prediction markets mockup + designer request

tailcalled · 8 Jan 2023 22:18 UTC
25 points
4 comments · 1 min read · LW link

Citability of Lesswrong and the Alignment Forum

Leon Lang · 8 Jan 2023 22:12 UTC
48 points
2 comments · 1 min read · LW link

I tried to learn as much Deep Learning math as I could in 24 hours

Phosphorous · 8 Jan 2023 21:07 UTC
31 points
2 comments · 7 min read · LW link

[Question] What specific thing would you do with AI Alignment Research Assistant GPT?

quetzal_rainbow · 8 Jan 2023 19:24 UTC
45 points
9 comments · 1 min read · LW link

[Question] Research ideas (AI Interpretability & Neurosciences) for a 2-months project

flux · 8 Jan 2023 15:36 UTC
3 points
1 comment · 1 min read · LW link

200 COP in MI: Image Model Interpretability

Neel Nanda · 8 Jan 2023 14:53 UTC
18 points
3 comments · 6 min read · LW link

Halifax Monthly Meetup: Moloch in the HRM

Ideopunk · 8 Jan 2023 14:49 UTC
10 points
0 comments · 1 min read · LW link

Dangers of deference

TsviBT · 8 Jan 2023 14:36 UTC
58 points
5 comments · 2 min read · LW link

Could evolution produce something truly aligned with its own optimization standards? What would an answer to this mean for AI alignment?

No77e · 8 Jan 2023 11:04 UTC
3 points
4 comments · 1 min read · LW link

AI psychology should ground the theories of AI consciousness and inform human-AI ethical interaction design

Roman Leventov · 8 Jan 2023 6:37 UTC
19 points
8 comments · 2 min read · LW link

Stop Talking to Each Other and Start Buying Things: Three Decades of Survival in the Desert of Social Media

the gears to ascension · 8 Jan 2023 4:45 UTC
1 point
14 comments · 1 min read · LW link
(catvalente.substack.com)

Can Ads be GDPR Compliant?

jefftk · 8 Jan 2023 2:50 UTC
39 points
10 comments · 7 min read · LW link
(www.jefftk.com)

Feature suggestion: add a ‘clarity score’ to posts

LVSN · 8 Jan 2023 1:00 UTC
17 points
5 comments · 1 min read · LW link

[Question] How do I better stick to a morning schedule?

Randomized, Controlled · 8 Jan 2023 0:52 UTC
8 points
8 comments · 1 min read · LW link

Protectionism will Slow the Deployment of AI

bgold · 7 Jan 2023 20:57 UTC
30 points
6 comments · 2 min read · LW link

David Krueger on AI Alignment in Academia, Coordination and Testing Intuitions

Michaël Trazzi · 7 Jan 2023 19:59 UTC
13 points
0 comments · 4 min read · LW link
(theinsideview.ai)

Looking for Spanish AI Alignment Researchers

Antb · 7 Jan 2023 18:52 UTC
7 points
3 comments · 1 min read · LW link

Nothing New: Productive Reframing

adamShimi · 7 Jan 2023 18:43 UTC
44 points
7 comments · 3 min read · LW link
(epistemologicalvigilance.substack.com)

[Question] Asking for a name for a symptom of rationalization

metachirality · 7 Jan 2023 18:34 UTC
6 points
5 comments · 1 min read · LW link

The Fountain of Health: a First Principles Guide to Rejuvenation

PhilJackson · 7 Jan 2023 18:34 UTC
115 points
38 comments · 41 min read · LW link

What’s wrong with the paperclips scenario?

No77e · 7 Jan 2023 17:58 UTC
31 points
11 comments · 1 min read · LW link

Building a Rosetta stone for reductionism and telism (WIP)

mrcbarbier · 7 Jan 2023 16:22 UTC
5 points
0 comments · 8 min read · LW link

What should a telic science look like?

mrcbarbier · 7 Jan 2023 16:13 UTC
10 points
0 comments · 11 min read · LW link

Open & Welcome Thread—January 2023

DragonGod · 7 Jan 2023 11:16 UTC
15 points
37 comments · 1 min read · LW link

Anchoring focalism and the Identifiable victim effect: Bias in Evaluating AGI X-Risks

Remmelt · 7 Jan 2023 9:59 UTC
1 point
2 comments · 1 min read · LW link

Can ChatGPT count?

p.b. · 7 Jan 2023 7:57 UTC
13 points
11 comments · 2 min read · LW link

Benevolent AI and mental health

peter schwarz · 7 Jan 2023 1:30 UTC
−31 points
2 comments · 1 min read · LW link

An Ignorant View on Ineffectiveness of AI Safety

Iknownothing · 7 Jan 2023 1:29 UTC
14 points
7 comments · 3 min read · LW link

Optimizing Human Collective Intelligence to Align AI

Shoshannah Tekofsky · 7 Jan 2023 1:21 UTC
12 points
5 comments · 6 min read · LW link

[Question] [Discussion] How Broad is the Human Cognitive Spectrum?

DragonGod · 7 Jan 2023 0:56 UTC
29 points
51 comments · 2 min read · LW link

Implications of simulators

TW123 · 7 Jan 2023 0:37 UTC
17 points
0 comments · 12 min read · LW link

[Linkpost] Jan Leike on three kinds of alignment taxes

Akash · 6 Jan 2023 23:57 UTC
27 points
2 comments · 3 min read · LW link
(aligned.substack.com)

The Limit of Language Models

DragonGod · 6 Jan 2023 23:53 UTC
44 points
26 comments · 4 min read · LW link

Why didn’t we get the four-hour workday?

jasoncrawford · 6 Jan 2023 21:29 UTC
139 points
34 comments · 6 min read · LW link
(rootsofprogress.org)

AI security might be helpful for AI alignment

Igor Ivanov · 6 Jan 2023 20:16 UTC
36 points
1 comment · 2 min read · LW link

Categorizing failures as “outer” or “inner” misalignment is often confused

Rohin Shah · 6 Jan 2023 15:48 UTC
93 points
21 comments · 8 min read · LW link

Definitions of “objective” should be Probable and Predictive

Rohin Shah · 6 Jan 2023 15:40 UTC
43 points
27 comments · 12 min read · LW link

200 COP in MI: Techniques, Tooling and Automation

Neel Nanda · 6 Jan 2023 15:08 UTC
13 points
0 comments · 15 min read · LW link

Ball Square Station and Ridership Maximization

jefftk · 6 Jan 2023 13:20 UTC
13 points
0 comments · 1 min read · LW link
(www.jefftk.com)

Childhood Roundup #1

Zvi · 6 Jan 2023 13:00 UTC
84 points
27 comments · 8 min read · LW link
(thezvi.wordpress.com)

AI improving AI [MLAISU W01!]

Esben Kran · 6 Jan 2023 11:13 UTC
5 points
0 comments · 4 min read · LW link
(newsletter.apartresearch.com)

AI Safety Camp, Virtual Edition 2023

Linda Linsefors · 6 Jan 2023 11:09 UTC
40 points
10 comments · 3 min read · LW link
(aisafety.camp)

Kakistocuriosity

LVSN · 6 Jan 2023 7:38 UTC
7 points
3 comments · 1 min read · LW link

AI Safety Camp: Machine Learning for Scientific Discovery

Eleni Angelou · 6 Jan 2023 3:21 UTC
3 points
0 comments · 1 min read · LW link

Metaculus Year in Review: 2022

ChristianWilliams · 6 Jan 2023 1:23 UTC
6 points
0 comments · 1 min read · LW link

UDASSA

Jacob Falkovich · 6 Jan 2023 1:07 UTC
21 points
8 comments · 10 min read · LW link

The Involuntary Pacifists

Capybasilisk · 6 Jan 2023 0:28 UTC
11 points
3 comments · 2 min read · LW link