Reliable Sources: The Story of David Gerard

TracingWoodgrains 10 Jul 2024 19:50 UTC
381 points
53 comments · 43 min read · LW link

Managing Emotional Potential Energy

adamShimi 10 Jul 2024 18:20 UTC
23 points
4 comments · 4 min read · LW link
(epistemologicalfascinations.substack.com)

[EAForum xpost] A breakdown of OpenAI’s revenue

10 Jul 2024 18:09 UTC
57 points
5 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Solving Pascal’s Wager using dynamic programming

Paul Wilczewski 10 Jul 2024 18:09 UTC
1 point
0 comments · 5 min read · LW link

Fluent, Cruxy Predictions

Raemon 10 Jul 2024 18:00 UTC
85 points
14 comments · 14 min read · LW link

Antitrust as Controlled Creative Destruction

Martin Sustrik 10 Jul 2024 16:40 UTC
14 points
2 comments · 2 min read · LW link
(250bpm.substack.com)

New page: Integrity

Zach Stein-Perlman 10 Jul 2024 15:00 UTC
91 points
3 comments · 1 min read · LW link

AirBnB Baking

jefftk 10 Jul 2024 12:50 UTC
7 points
1 comment · 1 min read · LW link
(www.jefftk.com)

DIY RLHF: A simple implementation for hands on experience

10 Jul 2024 12:07 UTC
28 points
0 comments · 6 min read · LW link

Usefulness grounds truth

invertedpassion 10 Jul 2024 7:58 UTC
0 points
0 comments · 4 min read · LW link

On passing Complete and Honest Ideological Turing Tests (CHITTs)

Aryeh Englander 10 Jul 2024 4:01 UTC
11 points
2 comments · 1 min read · LW link

[Question] Pondering how good or bad things will be in the AGI future

Sherrinford 9 Jul 2024 22:46 UTC
11 points
9 comments · 2 min read · LW link

Causal Graphs of GPT-2-Small’s Residual Stream

David Udell 9 Jul 2024 22:06 UTC
53 points
7 comments · 7 min read · LW link

[Question] If AI starts to end the world, is suicide a good idea?

IlluminateReality 9 Jul 2024 21:53 UTC
0 points
8 comments · 1 min read · LW link

Rationalist Purity Test

Gunnar_Zarncke 9 Jul 2024 20:30 UTC
−9 points
5 comments · 1 min read · LW link
(ratpuritytest.com)

That which can be destroyed by the truth, should be assumed to should be destroyed by it

Thac0 9 Jul 2024 19:39 UTC
5 points
0 comments · 3 min read · LW link

AISN #38: Supreme Court Decision Could Limit Federal Ability to Regulate AI Plus, “Circuit Breakers” for AI systems, and updates on China’s AI industry

9 Jul 2024 19:28 UTC
5 points
0 comments · 5 min read · LW link
(newsletter.safe.ai)

Summer Tour Stops

jefftk 9 Jul 2024 19:10 UTC
10 points
0 comments · 3 min read · LW link
(www.jefftk.com)

Fix simple mistakes in ARC-AGI, etc.

Oleg Trott 9 Jul 2024 17:46 UTC
9 points
9 comments · 1 min read · LW link

Paper Summary: The Effects of Communicating Uncertainty on Public Trust in Facts and Numbers

Jeffrey Heninger 9 Jul 2024 16:50 UTC
42 points
2 comments · 2 min read · LW link
(blog.aiimpacts.org)

UC Berkeley course on LLMs and ML Safety

Dan H 9 Jul 2024 15:40 UTC
36 points
1 comment · 1 min read · LW link
(rdi.berkeley.edu)

What and Why: Developmental Interpretability of Reinforcement Learning

Garrett Baker 9 Jul 2024 14:09 UTC
67 points
4 comments · 6 min read · LW link

Medical Roundup #3

Zvi 9 Jul 2024 13:10 UTC
39 points
4 comments · 19 min read · LW link
(thezvi.wordpress.com)

Consent across power differentials

Ramana Kumar 9 Jul 2024 11:42 UTC
50 points
12 comments · 3 min read · LW link

[Question] How bad would AI progress need to be for us to think general technological progress is also bad?

Jim Buhler 9 Jul 2024 10:43 UTC
9 points
5 comments · 1 min read · LW link

How LLMs Learn: What We Know, What We Don’t (Yet) Know, and What Comes Next

Jonasb 9 Jul 2024 9:58 UTC
2 points
0 comments · 16 min read · LW link
(www.denominations.io)

WTF is with the Infancy Gospel of Thomas?!? A deep dive into satire, philosophy, and more

kromem 9 Jul 2024 9:29 UTC
13 points
1 comment · 11 min read · LW link

Book Review: Safe Enough? A History of Nuclear Power and Accident Risk

ErickBall 9 Jul 2024 1:12 UTC
10 points
0 comments · 28 min read · LW link

Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs

8 Jul 2024 22:24 UTC
106 points
28 comments · 5 min read · LW link

Robin Hanson & Liron Shapira Debate AI X-Risk

Liron 8 Jul 2024 21:45 UTC
34 points
4 comments · 1 min read · LW link
(www.youtube.com)

“The Singularity Is Nearer” by Ray Kurzweil—Review

Lavender 8 Jul 2024 21:32 UTC
22 points
0 comments · 4 min read · LW link

Sample Prevalence vs Global Prevalence

jefftk 8 Jul 2024 21:00 UTC
11 points
0 comments · 2 min read · LW link
(www.jefftk.com)

Advice to junior AI governance researchers

Akash 8 Jul 2024 19:19 UTC
65 points
1 comment · 5 min read · LW link

Pantheon Interface

8 Jul 2024 19:03 UTC
126 points
22 comments · 6 min read · LW link

Launching the AI Forecasting Benchmark Series Q3 | $30k in Prizes

ChristianWilliams 8 Jul 2024 17:20 UTC
5 points
0 comments · 1 min read · LW link
(www.metaculus.com)

The Golden Mean of Scientific Virtues

adamShimi 8 Jul 2024 17:16 UTC
12 points
4 comments · 8 min read · LW link
(epistemologicalfascinations.substack.com)

Massapequa (Long Island), New York, USA – ACX Meetup

Gabriel Weil 8 Jul 2024 17:01 UTC
2 points
0 comments · 1 min read · LW link

Dialogue introduction to Singular Learning Theory

Olli Järviniemi 8 Jul 2024 16:58 UTC
97 points
14 comments · 8 min read · LW link

Announcing The Techno-Humanist Manifesto: A new philosophy of progress for the 21st century

jasoncrawford 8 Jul 2024 16:33 UTC
18 points
4 comments · 5 min read · LW link
(blog.rootsofprogress.org)

Response to Dileep George: AGI safety warrants planning ahead

Steven Byrnes 8 Jul 2024 15:27 UTC
27 points
7 comments · 27 min read · LW link

Why not parliamentarianism? [book by Tiago Ribeiro dos Santos]

Arturo Macias 8 Jul 2024 14:57 UTC
2 points
1 comment · 4 min read · LW link

Games of My Childhood: The Troops

Kaj_Sotala 8 Jul 2024 11:20 UTC
18 points
0 comments · 5 min read · LW link
(kajsotala.fi)

Towards shutdownable agents via stochastic choice

8 Jul 2024 10:14 UTC
59 points
12 comments · 23 min read · LW link
(arxiv.org)

On scalable oversight with weak LLMs judging strong LLMs

8 Jul 2024 8:59 UTC
49 points
18 comments · 7 min read · LW link
(arxiv.org)

Poker is a bad game for teaching epistemics. Figgie is a better one.

rossry 8 Jul 2024 6:05 UTC
104 points
47 comments · 11 min read · LW link
(blog.rossry.net)

Controlled Creative Destruction

Martin Sustrik 8 Jul 2024 4:36 UTC
11 points
0 comments · 2 min read · LW link

On saying “Thank you” instead of “I’m Sorry”

Michael Cohn 8 Jul 2024 3:13 UTC
132 points
16 comments · 3 min read · LW link

How can I get over my fear of becoming an emulated consciousness?

James Dowdell 7 Jul 2024 22:02 UTC
6 points
8 comments · 5 min read · LW link

An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers v2

Neel Nanda 7 Jul 2024 17:39 UTC
134 points
16 comments · 25 min read · LW link

Joint mandatory donation as a way to increase the number of donations

Crazy philosopher 7 Jul 2024 10:56 UTC
3 points
3 comments · 2 min read · LW link