
Charbel-Raphaël

Karma: 2,143

Charbel-Raphael Segerie

https://crsegerie.github.io/

Living in Paris

🇫🇷 Announcing CeSIA: The French Center for AI Safety

Charbel-Raphaël · 20 Dec 2024 14:17 UTC
84 points
2 comments · 8 min read · LW link

Are we dropping the ball on Recommendation AIs?

Charbel-Raphaël · 23 Oct 2024 17:48 UTC
41 points
17 comments · 6 min read · LW link

[Question] We might be dropping the ball on Autonomous Replication and Adaptation.

31 May 2024 13:49 UTC
61 points
30 comments · 4 min read · LW link

AI Safety Strategies Landscape

Charbel-Raphaël · 9 May 2024 17:33 UTC
34 points
1 comment · 42 min read · LW link

Constructability: Plainly-coded AGIs may be feasible in the near future

27 Apr 2024 16:04 UTC
83 points
13 comments · 13 min read · LW link

[Question] What convincing warning shot could help prevent extinction from AI?

13 Apr 2024 18:09 UTC
105 points
18 comments · 2 min read · LW link

My intellectual journey to (dis)solve the hard problem of consciousness

Charbel-Raphaël · 6 Apr 2024 9:32 UTC
44 points
43 comments · 30 min read · LW link

AI Safety 101: Capabilities—Human Level AI, What? How? and When?

7 Mar 2024 17:29 UTC
46 points
8 comments · 54 min read · LW link

The case for training frontier AIs on Sumerian-only corpus

15 Jan 2024 16:40 UTC
130 points
15 comments · 3 min read · LW link

aisafety.info, the Table of Content

Charbel-Raphaël · 31 Dec 2023 13:57 UTC
23 points
1 comment · 11 min read · LW link

Results from the Turing Seminar hackathon

7 Dec 2023 14:50 UTC
29 points
1 comment · 6 min read · LW link

AI Safety 101 - Chapter 5.2 - Unrestricted Adversarial Training

Charbel-Raphaël · 31 Oct 2023 14:34 UTC
17 points
0 comments · 19 min read · LW link

AI Safety 101 - Chapter 5.1 - Debate

Charbel-Raphaël · 31 Oct 2023 14:29 UTC
15 points
0 comments · 13 min read · LW link

Charbel-Raphaël and Lucius discuss interpretability

30 Oct 2023 5:50 UTC
110 points
7 comments · 21 min read · LW link

Against Almost Every Theory of Impact of Interpretability

Charbel-Raphaël · 17 Aug 2023 18:44 UTC
328 points
90 comments · 26 min read · LW link · 2 reviews

AI Safety 101: Introduction to Vision Interpretability

28 Jul 2023 17:32 UTC
42 points
0 comments · 1 min read · LW link
(github.com)

AIS 101: Task decomposition for scalable oversight

Charbel-Raphaël · 25 Jul 2023 13:34 UTC
27 points
0 comments · 19 min read · LW link
(docs.google.com)

An Overview of AI risks—the Flyer

17 Jul 2023 12:03 UTC
20 points
0 comments · 1 min read · LW link
(docs.google.com)

Introducing EffiSciences’ AI Safety Unit

30 Jun 2023 7:44 UTC
68 points
0 comments · 12 min read · LW link

Improvement on MIRI’s Corrigibility

9 Jun 2023 16:10 UTC
54 points
8 comments · 13 min read · LW link