Executable philosophy as a failed totalizing meta-worldview

jessicata · 4 Sep 2024 22:50 UTC
93 points
40 comments · 4 min read · LW link
(unstableontology.com)

Against Explosive Growth

c.trout · 4 Sep 2024 21:45 UTC
14 points
1 comment · 5 min read · LW link

The Fragility of Life Hypothesis and the Evolution of Cooperation

KristianRonn · 4 Sep 2024 21:04 UTC
50 points
6 comments · 11 min read · LW link

Emotion-Informed Valuation Mechanism for Improved AI Alignment in Large Language Models

Javier Marin Valenzuela · 4 Sep 2024 17:00 UTC
2 points
4 comments · 6 min read · LW link

What happens if you present 500 people with an argument that AI is risky?

4 Sep 2024 16:40 UTC
102 points
7 comments · 3 min read · LW link
(blog.aiimpacts.org)

Automating LLM Auditing with Developmental Interpretability

4 Sep 2024 15:50 UTC
17 points
0 comments · 3 min read · LW link

Michael Dickens’ Caffeine Tolerance Research

niplav · 4 Sep 2024 15:41 UTC
46 points
3 comments · 2 min read · LW link
(mdickens.me)

[Question] Are UV-C Air purifiers so useful?

JohnBuridan · 4 Sep 2024 14:16 UTC
9 points
0 comments · 1 min read · LW link

AI and the Technological Richter Scale

Zvi · 4 Sep 2024 14:00 UTC
48 points
8 comments · 13 min read · LW link
(thezvi.wordpress.com)

[Question] Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception?

David Scott Krueger (formerly: capybaralet) · 4 Sep 2024 12:40 UTC
19 points
7 comments · 1 min read · LW link

A Comparison Between The Pragmatosphere And Less Wrong

Zero Contradictions · 4 Sep 2024 9:39 UTC
−18 points
10 comments · 2 min read · LW link
(zerocontradictions.net)

Announcing the Ultimate Jailbreaking Championship

InnerHufflepuff · 4 Sep 2024 0:35 UTC
15 points
1 comment · 1 min read · LW link

AI Safety at the Frontier: Paper Highlights, August ’24

gasteigerjo · 3 Sep 2024 19:17 UTC
28 points
0 comments · 6 min read · LW link
(aisafetyfrontier.substack.com)

The Checklist: What Succeeding at AI Safety Will Involve

Sam Bowman · 3 Sep 2024 18:18 UTC
142 points
49 comments · 22 min read · LW link
(sleepinyourhat.github.io)

Democracy beyond majoritarianism

Arturo Macias · 3 Sep 2024 15:10 UTC
5 points
2 comments · 4 min read · LW link

On the UBI Paper

Zvi · 3 Sep 2024 14:50 UTC
57 points
6 comments · 19 min read · LW link
(thezvi.wordpress.com)

An Opinionated Look at Inference Rules

Gianluca Calcagni · 3 Sep 2024 13:32 UTC
−5 points
2 comments · 13 min read · LW link

Announcing the PIBBSS Symposium ’24!

3 Sep 2024 11:19 UTC
19 points
0 comments · 3 min read · LW link

Reducing global AI competition through the Commerce Control List and Immigration reform: a dual-pronged approach

Ben Smith · 3 Sep 2024 5:28 UTC
16 points
2 comments · 1 min read · LW link

How I got 4.2M YouTube views without making a single video

Closed Limelike Curves · 3 Sep 2024 3:52 UTC
376 points
36 comments · 1 min read · LW link

Duped: AI and the Making of a Global Suicide Cult

izzyness · 2 Sep 2024 18:51 UTC
−8 points
0 comments · 1 min read · LW link

A gentle introduction to sparse autoencoders

Nick Jiang · 2 Sep 2024 18:11 UTC
9 points
0 comments · 6 min read · LW link

What makes math problems hard for reinforcement learning: a case study

Anibal, Bartek, Sergei, Shehper and Piotr · 2 Sep 2024 18:11 UTC
1 point
0 comments · 2 min read · LW link
(arxiv.org)

Survey: How Do Elite Chinese Students Feel About the Risks of AI?

Nick Corvino · 2 Sep 2024 18:11 UTC
141 points
13 comments · 10 min read · LW link

Data-driven donations to help Democrats win federal elections: an update

Michael Cohn · 2 Sep 2024 16:32 UTC
−1 points
2 comments · 1 min read · LW link
(perplexedguide.net)

[Question] What are the effective utilitarian pros and cons of having children (in rich countries)?

SpectrumDT · 2 Sep 2024 10:01 UTC
2 points
4 comments · 1 min read · LW link

My decomposition of the alignment problem

Daniel C · 2 Sep 2024 0:21 UTC
20 points
22 comments · 13 min read · LW link

DC Forecasting & Prediction Markets Meetup

David Glidden · 2 Sep 2024 0:00 UTC
1 point
0 comments · 1 min read · LW link

A primer on the next generation of antibodies

Abhishaike Mahajan · 1 Sep 2024 22:37 UTC
25 points
0 comments · 19 min read · LW link
(www.owlposting.com)

[Question] Who looked into extreme nuclear meltdowns?

Remmelt · 1 Sep 2024 21:38 UTC
2 points
8 comments · 1 min read · LW link

Redundant Attention Heads in Large Language Models For In Context Learning

skunnavakkam · 1 Sep 2024 20:08 UTC
7 points
1 comment · 4 min read · LW link
(skunnavakkam.github.io)

The Role of Transparency and Explainability in Responsible NLP

RAMEBC78 · 1 Sep 2024 20:08 UTC
−3 points
1 comment · 5 min read · LW link

Book Review: What Even Is Gender?

Joey Marcellino · 1 Sep 2024 16:09 UTC
31 points
14 comments · 12 min read · LW link

Can a Bayesian Oracle Prevent Harm from an Agent? (Bengio et al. 2024)

mattmacdermott · 1 Sep 2024 7:46 UTC
26 points
0 comments · 5 min read · LW link
(yoshuabengio.org)

San Francisco ACX Meetup “First Saturday”

Nate Sternberg · 1 Sep 2024 4:48 UTC
2 points
1 comment · 1 min read · LW link

Forecasting One-Shot Games

Raemon · 31 Aug 2024 23:10 UTC
46 points
0 comments · 7 min read · LW link

On epistemic autonomy

sanyer · 31 Aug 2024 18:50 UTC
11 points
0 comments · 2 min read · LW link

Epistemic states as a potential benign prior

Tamsin Leake · 31 Aug 2024 18:26 UTC
31 points
2 comments · 8 min read · LW link
(carado.moe)

My Model of Epistemology

adamShimi · 31 Aug 2024 17:01 UTC
35 points
0 comments · 8 min read · LW link
(epistemologicalfascinations.substack.com)

Verification methods for international AI agreements

Akash · 31 Aug 2024 14:58 UTC
14 points
1 comment · 4 min read · LW link
(arxiv.org)

Fake Blog Posts as a Problem Solving Device

silentbob · 31 Aug 2024 9:22 UTC
7 points
0 comments · 2 min read · LW link

Actually Rational & Kind Sequences Reading Group

segfault · 31 Aug 2024 4:21 UTC
−55 points
1 comment · 1 min read · LW link

Anthropic is being sued for copying books to train Claude

Remmelt · 31 Aug 2024 2:57 UTC
20 points
4 comments · 2 min read · LW link
(fingfx.thomsonreuters.com)

Book review: On the Edge

PeterMcCluskey · 30 Aug 2024 22:18 UTC
34 points
0 comments · 9 min read · LW link
(bayesianinvestor.com)

Can Large Language Models effectively identify cybersecurity risks?

emile delcourt · 30 Aug 2024 20:20 UTC
18 points
0 comments · 11 min read · LW link

Singular learning theory: exercises

Zach Furman · 30 Aug 2024 20:00 UTC
88 points
5 comments · 14 min read · LW link

AI for Bio: State Of The Field

sarahconstantin · 30 Aug 2024 18:00 UTC
73 points
2 comments · 15 min read · LW link
(sarahconstantin.substack.com)

Multi-Tiered AI

Timothy Bruneau · 30 Aug 2024 17:46 UTC
1 point
0 comments · 2 min read · LW link

I uni­ver­sally try­ing to re­ject the Mind Pro­jec­tion Fal­lacy—consequences

YanLyutnev · 30 Aug 2024 17:42 UTC
−4 points
0 comments · 9 min read · LW link

AIS terminology proposal: standardize terms for probability ranges

eggsyntax · 30 Aug 2024 15:43 UTC
30 points
12 comments · 2 min read · LW link