Chris_Leong

Karma: 7,016

Linkpost: “Imagining and building wise machines: The centrality of AI metacognition” by Johnson, Karimi, Bengio, et al.

Chris_Leong 11 Nov 2024 16:13 UTC
25 points
6 comments 1 min read LW link
(arxiv.org)

Some Preliminary Notes on the Promise of a Wisdom Explosion

Chris_Leong 31 Oct 2024 9:21 UTC
2 points
0 comments 1 min read LW link
(aiimpacts.org)

Linkpost: Hypocrisy standoff

Chris_Leong 29 Sep 2024 14:27 UTC
5 points
1 comment 1 min read LW link
(x.com)

On the destruction of America’s best high school

Chris_Leong 12 Sep 2024 15:30 UTC
−6 points
7 comments 1 min read LW link
(scottaaronson.blog)

The Bar for Contributing to AI Safety is Lower than You Think

Chris_Leong 16 Aug 2024 15:20 UTC
20 points
1 comment 2 min read LW link

Michael Streamlines on Buddhism

Chris_Leong 9 Aug 2024 4:44 UTC
8 points
0 comments 1 min read LW link
(x.com)

[Question] Have people given up on iterated distillation and amplification?

Chris_Leong 19 Jul 2024 12:23 UTC
20 points
1 comment 1 min read LW link

Politics is the mind-killer, but maybe we should talk about it anyway

Chris_Leong 3 Jun 2024 6:37 UTC
14 points
33 comments 3 min read LW link

[Question] Does reducing the amount of RL for a given capability level make AI safer?

Chris_Leong 5 May 2024 17:04 UTC
43 points
22 comments 1 min read LW link

Link: Let’s Think Dot by Dot: Hidden Computation in Transformer Language Models by Jacob Pfau, William Merrill & Samuel R. Bowman

Chris_Leong 27 Apr 2024 13:22 UTC
12 points
0 comments 1 min read LW link
(twitter.com)

“You’re the most beautiful girl in the world” and Wittgensteinian Language Games

Chris_Leong 20 Apr 2024 14:54 UTC
5 points
18 comments 1 min read LW link

The argument for near-term human disempowerment through AI

Chris_Leong 16 Apr 2024 4:50 UTC
21 points
2 comments 1 min read LW link
(link.springer.com)

Reverse Regulatory Capture

Chris_Leong 11 Apr 2024 2:40 UTC
12 points
3 comments 1 min read LW link

On the Confusion between Inner and Outer Misalignment

Chris_Leong 25 Mar 2024 11:59 UTC
17 points
10 comments 1 min read LW link

The Best Essay (Paul Graham)

Chris_Leong 11 Mar 2024 19:25 UTC
25 points
2 comments 1 min read LW link
(paulgraham.com)

[Question] Can we get an AI to “do our alignment homework for us”?

Chris_Leong 26 Feb 2024 7:56 UTC
53 points
33 comments 1 min read LW link

[Question] What’s the theory of impact for activation vectors?

Chris_Leong 11 Feb 2024 7:34 UTC
57 points
12 comments 1 min read LW link

Notice When People Are Directionally Correct

Chris_Leong 14 Jan 2024 14:12 UTC
129 points
8 comments 2 min read LW link

Are Metaculus AI Timelines Inconsistent?

Chris_Leong 2 Jan 2024 6:47 UTC
16 points
7 comments 2 min read LW link

Random Musings on Theory of Impact for Activation Vectors

Chris_Leong 7 Dec 2023 13:07 UTC
8 points
0 comments 1 min read LW link