The ‘Neglected Approaches’ Approach: AE Studio’s Alignment Agenda

18 Dec 2023 20:35 UTC
168 points
21 comments, 12 min read, LW link

The Shortest Path Between Scylla and Charybdis

Thane Ruthenis, 18 Dec 2023 20:08 UTC
50 points
8 comments, 5 min read, LW link

OpenAI: Preparedness framework

Zach Stein-Perlman, 18 Dec 2023 18:30 UTC
70 points
23 comments, 4 min read, LW link
(openai.com)

[Valence series] 5. “Valence Disorders” in Mental Health & Personality

Steven Byrnes, 18 Dec 2023 15:26 UTC
43 points
12 comments, 13 min read, LW link

Discussion: Challenges with Unsupervised LLM Knowledge Discovery

18 Dec 2023 11:58 UTC
147 points
21 comments, 10 min read, LW link

Interpreting the Learning of Deceit

RogerDearnaley, 18 Dec 2023 8:12 UTC
30 points
14 comments, 9 min read, LW link

Talk: “AI Would Be A Lot Less Alarming If We Understood Agents”

johnswentworth, 17 Dec 2023 23:46 UTC
58 points
3 comments, 1 min read, LW link
(www.youtube.com)

∀: a story

Richard_Ngo, 17 Dec 2023 22:42 UTC
37 points
1 comment, 8 min read, LW link
(www.narrativeark.xyz)

Reviving a 2015 MacBook

jefftk, 17 Dec 2023 21:00 UTC
11 points
0 comments, 1 min read, LW link
(www.jefftk.com)

A Common-Sense Case For Mutually-Misaligned AGIs Allying Against Humans

Thane Ruthenis, 17 Dec 2023 20:28 UTC
29 points
7 comments, 11 min read, LW link

The Limits of Artificial Consciousness: A Biology-Based Critique of Chalmers’ Fading Qualia Argument

Štěpán Los, 17 Dec 2023 19:11 UTC
−6 points
9 comments, 17 min read, LW link

What makes teaching math special

Viliam, 17 Dec 2023 14:15 UTC
41 points
27 comments, 11 min read, LW link

The predictive power of dissipative adaptation

dr_s, 17 Dec 2023 14:01 UTC
56 points
14 comments, 19 min read, LW link

Linkpost: Francesca v Harvard

Linch, 17 Dec 2023 6:18 UTC
5 points
5 comments, 2 min read, LW link
(www.francesca-v-harvard.org)

Lessons from massaging myself, others, dogs, and cats

Chipmonk, 17 Dec 2023 4:28 UTC
2 points
27 comments, 5 min read, LW link
(chipmonk.blog)

The Serendipity of Density

jefftk, 17 Dec 2023 3:50 UTC
40 points
4 comments, 1 min read, LW link
(www.jefftk.com)

Bounty: Diverse hard tasks for LLM agents

17 Dec 2023 1:04 UTC
49 points
31 comments, 16 min read, LW link

2022 (and All Time) Posts by Pingback Count

Raemon, 16 Dec 2023 21:17 UTC
53 points
14 comments, 6 min read, LW link

“Humanity vs. AGI” Will Never Look Like “Humanity vs. AGI” to Humanity

Thane Ruthenis, 16 Dec 2023 20:08 UTC
189 points
34 comments, 5 min read, LW link

A visual analogy for text generation by LLMs?

Bill Benzon, 16 Dec 2023 17:58 UTC
3 points
0 comments, 1 min read, LW link

Upgrading the AI Safety Community

16 Dec 2023 15:34 UTC
42 points
9 comments, 42 min read, LW link

cold aluminum for medicine

bhauth, 16 Dec 2023 14:38 UTC
42 points
4 comments, 4 min read, LW link
(www.bhauth.com)

Scalable Oversight and Weak-to-Strong Generalization: Compatible approaches to the same problem

16 Dec 2023 5:49 UTC
73 points
3 comments, 6 min read, LW link

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision

leogao, 16 Dec 2023 5:39 UTC
55 points
5 comments, 1 min read, LW link

Pope Francis shares thoughts on responsible AI development

corruptedCatapillar, 16 Dec 2023 3:49 UTC
15 points
4 comments, 1 min read, LW link
(www.vatican.va)

Current AIs Provide Nearly No Data Relevant to AGI Alignment

Thane Ruthenis, 15 Dec 2023 20:16 UTC
124 points
157 comments, 8 min read, LW link, 1 review

Agglomeration of ‘Ought’

DavidAndresBloom, 15 Dec 2023 19:07 UTC
1 point
1 comment, 11 min read, LW link

Predicting the future with the power of the Internet (and pissing off Rob Miles)

Writer, 15 Dec 2023 17:37 UTC
23 points
9 comments, 4 min read, LW link
(youtu.be)

Progress links digest, 2023-12-15: Vitalik on d/acc, $100M+ in prizes, and more

jasoncrawford, 15 Dec 2023 15:52 UTC
20 points
0 comments, 12 min read, LW link
(rootsofprogress.org)

“AI Alignment” is a Dangerously Overloaded Term

Roko, 15 Dec 2023 14:34 UTC
108 points
100 comments, 3 min read, LW link

[Valence series] 4. Valence & Social Status (deprecated)

Steven Byrnes, 15 Dec 2023 14:24 UTC
35 points
19 comments, 11 min read, LW link

Contra Scott on Abolishing the FDA

Maxwell Tabarrok, 15 Dec 2023 14:00 UTC
46 points
3 comments, 6 min read, LW link
(maximumprogress.substack.com)

[Paper] Trajectories through semantic spaces in schizophrenia and the relationship to ripple bursts

bvbvbvbvbvbvbvbvbvbvbv, 15 Dec 2023 13:37 UTC
3 points
0 comments, 1 min read, LW link
(www.pnas.org)

Takeaways from a Mechanistic Interpretability project on “Forbidden Facts”

15 Dec 2023 11:05 UTC
33 points
8 comments, 10 min read, LW link

Refinement of Active Inference agency ontology

Roman Leventov, 15 Dec 2023 9:31 UTC
16 points
0 comments, 5 min read, LW link
(arxiv.org)

EU policymakers reach an agreement on the AI Act

tlevin, 15 Dec 2023 6:02 UTC
78 points
7 comments, 7 min read, LW link

Where Does Adversarial Pressure Come From?

quetzal_rainbow, 14 Dec 2023 22:31 UTC
16 points
1 comment, 2 min read, LW link

Epoch wise critical periods, and singular learning theory

Garrett Baker, 14 Dec 2023 20:55 UTC
9 points
1 comment, 5 min read, LW link

OpenAI Superalignment: Weak-to-strong generalization

Dalmert, 14 Dec 2023 19:47 UTC
25 points
3 comments, 1 min read, LW link
(openai.com)

Applications for EA Global are still open!

Eli_Nathan, 14 Dec 2023 19:10 UTC
1 point
0 comments, 1 min read, LW link

Personal Development System: Winning Repeatedly and Growing Effectively With The BIG4

Paul Rohde, 14 Dec 2023 18:49 UTC
13 points
0 comments, 33 min read, LW link
(blog.paul-rohde.com)

Introducing The ‘From Big Ideas To Real-World Results’: A Series for Effective Personal Development

Paul Rohde, 14 Dec 2023 18:49 UTC
13 points
1 comment, 8 min read, LW link
(blog.paul-rohde.com)

Talking With People Who Speak to Congressional Staffers about AI risk

Eneasz, 14 Dec 2023 17:55 UTC
32 points
0 comments, 1 min read, LW link
(www.thebayesianconspiracy.com)

Bayesian Injustice

Kevin Dorst, 14 Dec 2023 15:44 UTC
124 points
10 comments, 6 min read, LW link
(kevindorst.substack.com)

AI #42: The Wrong Answer

Zvi, 14 Dec 2023 14:50 UTC
67 points
6 comments, 54 min read, LW link
(thezvi.wordpress.com)

Some for-profit AI alignment org ideas

Eric Ho, 14 Dec 2023 14:23 UTC
86 points
19 comments, 9 min read, LW link

Mapping the semantic void: Strange goings-on in GPT embedding spaces

mwatkins, 14 Dec 2023 13:10 UTC
114 points
31 comments, 14 min read, LW link

Categorical Organization in Memory: ChatGPT Organizes the 665 Topic Tags from My New Savanna Blog

Bill Benzon, 14 Dec 2023 13:02 UTC
0 points
6 comments, 2 min read, LW link

Moral Mountains

Adam Zerner, 14 Dec 2023 10:40 UTC
8 points
10 comments, 2 min read, LW link

Update on Chinese IQ-related gene panels

Lao Mein, 14 Dec 2023 10:12 UTC
70 points
7 comments, 1 min read, LW link