HDBSCAN is Surprisingly Effective at Finding Interpretable Clusters of the SAE Decoder Matrix

11 Oct 2024 23:06 UTC
8 points
2 comments · 10 min read · LW link

Changing the Mind of an LLM

testingthewaters · 11 Oct 2024 22:25 UTC
2 points
0 comments · 5 min read · LW link

EIS XIV: Is mechanistic interpretability about to be practically useful?

scasper · 11 Oct 2024 22:13 UTC
68 points
4 comments · 7 min read · LW link

Dario Amodei — Machines of Loving Grace

Matrice Jacobine · 11 Oct 2024 21:43 UTC
62 points
26 comments · 1 min read · LW link
(darioamodei.com)

“Deep Galactic Chillout” a space to relax during SF tech week & meet wholesome, fun people

Jared Phillip Mantell · 11 Oct 2024 19:50 UTC
1 point
0 comments · 1 min read · LW link

Open letter to young EAs

Leif Wenar · 11 Oct 2024 19:49 UTC
9 points
10 comments · 1 min read · LW link

The Great Bootstrap

KristianRonn · 11 Oct 2024 19:46 UTC
11 points
0 comments · 15 min read · LW link

Embracing complexity when developing and evaluating AI responsibly

Aliya Amirova · 11 Oct 2024 17:46 UTC
2 points
9 comments · 9 min read · LW link

How much I’m paying for AI productivity software (and the future of AI use)

jacquesthibs · 11 Oct 2024 17:11 UTC
57 points
16 comments · 8 min read · LW link
(jacquesthibodeau.com)

AI: The Philosopher’s Stone of the 21st Century

HNX · 11 Oct 2024 16:55 UTC
0 points
2 comments · 29 min read · LW link

[Question] Who created the Less Wrong Gather Town?

Arepo · 11 Oct 2024 8:53 UTC
2 points
1 comment · 1 min read · LW link

A Heuristic Proof of Practical Aligned Superintelligence

Roko · 11 Oct 2024 5:05 UTC
7 points
6 comments · 1 min read · LW link
(transhumanaxiology.substack.com)

An AI crash is our best bet for restricting AI

Remmelt · 11 Oct 2024 2:12 UTC
27 points
3 comments · 1 min read · LW link

A Triple Decker for Elfland

jefftk · 11 Oct 2024 1:50 UTC
25 points
0 comments · 1 min read · LW link
(www.jefftk.com)

OODA your OODA Loop

Raemon · 11 Oct 2024 0:50 UTC
37 points
3 comments · 3 min read · LW link

Scaling prediction markets with meta-markets

Dentosal · 10 Oct 2024 21:17 UTC
1 point
0 comments · 2 min read · LW link

Startup Success Rates Are So Low Because the Rewards Are So Large

AppliedDivinityStudies · 10 Oct 2024 20:22 UTC
42 points
6 comments · 2 min read · LW link

Can AI Outpredict Humans? Results From Metaculus’s Q3 AI Forecasting Benchmark

ChristianWilliams · 10 Oct 2024 18:58 UTC
50 points
2 comments · 1 min read · LW link
(www.metaculus.com)

Rationality Quotes—Fall 2024

Screwtape · 10 Oct 2024 18:37 UTC
78 points
26 comments · 1 min read · LW link

[Question] why won’t this alignment plan work?

KvmanThinking · 10 Oct 2024 15:44 UTC
6 points
7 comments · 1 min read · LW link

AI #85: AI Wins the Nobel Prize

Zvi · 10 Oct 2024 13:40 UTC
30 points
6 comments · 31 min read · LW link
(thezvi.wordpress.com)

Behavioral red-teaming is unlikely to produce clear, strong evidence that models aren’t scheming

Buck · 10 Oct 2024 13:36 UTC
100 points
4 comments · 13 min read · LW link

Joshua Achiam Public Statement Analysis

Zvi · 10 Oct 2024 12:50 UTC
73 points
14 comments · 21 min read · LW link
(thezvi.wordpress.com)

Do you want to do a debate on youtube? I’m looking for polite, truth-seeking participants.

Nathan Young · 10 Oct 2024 9:32 UTC
12 points
0 comments · 1 min read · LW link

Rationalist Gnosticism

tailcalled · 10 Oct 2024 9:06 UTC
9 points
10 comments · 3 min read · LW link

The deepest atheist: Sam Altman

Trey Edwin · 10 Oct 2024 3:27 UTC
14 points
2 comments · 4 min read · LW link

Values Are Real Like Harry Potter

9 Oct 2024 23:42 UTC
81 points
17 comments · 5 min read · LW link

Momentum of Light in Glass

Ben · 9 Oct 2024 20:19 UTC
144 points
44 comments · 11 min read · LW link

vgillioz’s Shortform

vgillioz · 9 Oct 2024 19:31 UTC
1 point
2 comments · 1 min read · LW link

Hamiltonian Dynamics in AI: A Novel Approach to Optimizing Reasoning in Language Models

Javier Marin Valenzuela · 9 Oct 2024 19:14 UTC
3 points
0 comments · 10 min read · LW link

Triangulating My Interpretation of Methods: Black Boxes by Marco J. Nathan

adamShimi · 9 Oct 2024 19:13 UTC
8 points
0 comments · 6 min read · LW link
(formethods.substack.com)

Scaffolding for “Noticing Metacognition”

Raemon · 9 Oct 2024 17:54 UTC
80 points
4 comments · 17 min read · LW link

Safe Predictive Agents with Joint Scoring Rules

Rubi J. Hudson · 9 Oct 2024 16:38 UTC
55 points
10 comments · 17 min read · LW link

Demis Hassabis and Geoffrey Hinton Awarded Nobel Prizes

Anna Gajdova · 9 Oct 2024 12:56 UTC
48 points
14 comments · 1 min read · LW link

Humans are (mostly) metarational

Yair Halberstadt · 9 Oct 2024 5:51 UTC
14 points
6 comments · 3 min read · LW link

[Job Ad] MATS is hiring!

9 Oct 2024 2:17 UTC
10 points
0 comments · 5 min read · LW link

Palisade is hiring: Exec Assistant, Content Lead, Ops Lead, and Policy Lead

Charlie Rogers-Smith · 9 Oct 2024 0:04 UTC
11 points
0 comments · 4 min read · LW link

AGI & Consciousness—Joscha Bach

Rahul Chand · 8 Oct 2024 22:51 UTC
1 point
0 comments · 10 min read · LW link

Video and transcript of presentation on Otherness and control in the age of AGI

Joe Carlsmith · 8 Oct 2024 22:30 UTC
35 points
1 comment · 27 min read · LW link

From seeded complexity to consciousness—yes, it’s all the same.

eschatail · 8 Oct 2024 21:31 UTC
−23 points
0 comments · 2 min read · LW link

Limits of safe and aligned AI

Shivam · 8 Oct 2024 21:30 UTC
2 points
0 comments · 4 min read · LW link

[Question] What constitutes an infohazard?

K1r4d4rk.v1 · 8 Oct 2024 21:29 UTC
−4 points
8 comments · 1 min read · LW link

[Question] What makes one a “rationalist”?

mathyouf · 8 Oct 2024 20:25 UTC
7 points
5 comments · 3 min read · LW link

[Intuitive self-models] 4. Trance

Steven Byrnes · 8 Oct 2024 13:30 UTC
75 points
7 comments · 24 min read · LW link

Schelling game evaluations for AI control

Olli Järviniemi · 8 Oct 2024 12:01 UTC
65 points
5 comments · 11 min read · LW link

Thinking About a Pedalboard

jefftk · 8 Oct 2024 11:50 UTC
9 points
2 comments · 1 min read · LW link
(www.jefftk.com)

Overview of strong human intelligence amplification methods

TsviBT · 8 Oct 2024 8:37 UTC
270 points
141 comments · 10 min read · LW link

Near-death experiences

Declan Molony · 8 Oct 2024 6:34 UTC
3 points
1 comment · 3 min read · LW link

The unreasonable effectiveness of plasmid sequencing as a service

Abhishaike Mahajan · 8 Oct 2024 2:02 UTC
23 points
2 comments · 13 min read · LW link
(www.owlposting.com)

There is a globe in your LLM

jacob_drori · 8 Oct 2024 0:43 UTC
86 points
4 comments · 1 min read · LW link