Thane Ruthenis

Karma: 5,110

[Question] Are You More Real If You’re Really Forgetful?

Thane Ruthenis · Nov 24, 2024, 7:30 PM
39 points
25 comments · 5 min read · LW link

Towards the Operationalization of Philosophy & Wisdom

Thane Ruthenis · Oct 28, 2024, 7:45 PM
20 points
2 comments · 33 min read · LW link
(aiimpacts.org)

Thane Ruthenis’s Shortform

Thane Ruthenis · Sep 13, 2024, 8:52 PM
7 points
56 comments · 1 min read · LW link

A Crisper Explanation of Simulacrum Levels

Thane Ruthenis · Dec 23, 2023, 10:13 PM
90 points
13 comments · 13 min read · LW link

Idealized Agents Are Approximate Causal Mirrors (+ Radical Optimism on Agent Foundations)

Thane Ruthenis · Dec 22, 2023, 8:19 PM
74 points
14 comments · 6 min read · LW link

Most People Don’t Realize We Have No Idea How Our AIs Work

Thane Ruthenis · Dec 21, 2023, 8:02 PM
159 points
42 comments · 1 min read · LW link

How Would an Utopia-Maximizer Look Like?

Thane Ruthenis · Dec 20, 2023, 8:01 PM
31 points
23 comments · 10 min read · LW link

Don’t Share Information Exfohazardous on Others’ AI-Risk Models

Thane Ruthenis · Dec 19, 2023, 8:09 PM
67 points
11 comments · 1 min read · LW link

The Shortest Path Between Scylla and Charybdis

Thane Ruthenis · Dec 18, 2023, 8:08 PM
50 points
8 comments · 5 min read · LW link

A Common-Sense Case For Mutually-Misaligned AGIs Allying Against Humans

Thane Ruthenis · Dec 17, 2023, 8:28 PM
29 points
7 comments · 11 min read · LW link

“Humanity vs. AGI” Will Never Look Like “Humanity vs. AGI” to Humanity

Thane Ruthenis · Dec 16, 2023, 8:08 PM
189 points
34 comments · 5 min read · LW link

Current AIs Provide Nearly No Data Relevant to AGI Alignment

Thane Ruthenis · Dec 15, 2023, 8:16 PM
129 points
157 comments · 8 min read · LW link · 1 review

Hands-On Experience Is Not Magic

Thane Ruthenis · May 27, 2023, 4:57 PM
21 points
14 comments · 5 min read · LW link

A Case for the Least Forgiving Take On Alignment

Thane Ruthenis · May 2, 2023, 9:34 PM
100 points
84 comments · 22 min read · LW link

World-Model Interpretability Is All We Need

Thane Ruthenis · Jan 14, 2023, 7:37 PM
35 points
22 comments · 21 min read · LW link

Internal Interfaces Are a High-Priority Interpretability Target

Thane Ruthenis · Dec 29, 2022, 5:49 PM
26 points
6 comments · 7 min read · LW link

In Defense of Wrapper-Minds

Thane Ruthenis · Dec 28, 2022, 6:28 PM
24 points
38 comments · 3 min read · LW link

Accurate Models of AI Risk Are Hyperexistential Exfohazards

Thane Ruthenis · Dec 25, 2022, 4:50 PM
32 points
38 comments · 9 min read · LW link

Corrigibility Via Thought-Process Deference

Thane Ruthenis · Nov 24, 2022, 5:06 PM
17 points
5 comments · 9 min read · LW link

Value Formation: An Overarching Model

Thane Ruthenis · Nov 15, 2022, 5:16 PM
34 points
20 comments · 34 min read · LW link