
Charbel-Raphaël

Karma: 2,206

Charbel-Raphael Segerie

https://crsegerie.github.io/

Living in Paris

Charbel-Raphaël’s Shortform

Charbel-Raphaël, Apr 21, 2025, 8:49 PM
6 points
5 comments

Understanding Benchmarks and motivating Evaluations

Feb 6, 2025, 1:32 AM
9 points
0 comments, 11 min read
(ai-safety-atlas.com)

🇫🇷 Announcing CeSIA: The French Center for AI Safety

Charbel-Raphaël, Dec 20, 2024, 2:17 PM
88 points
2 comments, 8 min read

Are we dropping the ball on Recommendation AIs?

Charbel-Raphaël, Oct 23, 2024, 5:48 PM
41 points
17 comments, 6 min read

[Question] We might be dropping the ball on Autonomous Replication and Adaptation.

May 31, 2024, 1:49 PM
61 points
30 comments, 4 min read

AI Safety Strategies Landscape

Charbel-Raphaël, May 9, 2024, 5:33 PM
34 points
1 comment, 42 min read

Constructability: Plainly-coded AGIs may be feasible in the near future

Apr 27, 2024, 4:04 PM
85 points
13 comments, 13 min read

[Question] What convincing warning shot could help prevent extinction from AI?

Apr 13, 2024, 6:09 PM
106 points
22 comments, 2 min read

My intellectual journey to (dis)solve the hard problem of consciousness

Charbel-Raphaël, Apr 6, 2024, 9:32 AM
49 points
44 comments, 30 min read

AI Safety 101: Capabilities—Human Level AI, What? How? and When?

Mar 7, 2024, 5:29 PM
46 points
8 comments, 54 min read

The case for training frontier AIs on Sumerian-only corpus

Jan 15, 2024, 4:40 PM
130 points
16 comments, 3 min read

aisafety.info, the Table of Content

Charbel-Raphaël, Dec 31, 2023, 1:57 PM
23 points
1 comment, 11 min read

AI Safety 101 - Chapter 5.2 - Unrestricted Adversarial Training

Charbel-Raphaël, Oct 31, 2023, 2:34 PM
17 points
0 comments, 19 min read

AI Safety 101 - Chapter 5.1 - Debate

Charbel-Raphaël, Oct 31, 2023, 2:29 PM
15 points
0 comments, 13 min read

Charbel-Raphaël and Lucius discuss interpretability

Oct 30, 2023, 5:50 AM
111 points
7 comments, 21 min read

Against Almost Every Theory of Impact of Interpretability

Charbel-Raphaël, Aug 17, 2023, 6:44 PM
329 points
90 comments, 26 min read, 2 reviews

AIS 101: Task decomposition for scalable oversight

Charbel-Raphaël, Jul 25, 2023, 1:34 PM
27 points
0 comments, 19 min read
(docs.google.com)

An Overview of AI risks—the Flyer

Jul 17, 2023, 12:03 PM
20 points
0 comments, 1 min read
(docs.google.com)

Introducing EffiSciences’ AI Safety Unit

Jun 30, 2023, 7:44 AM
68 points
0 comments, 12 min read

Improvement on MIRI’s Corrigibility

Jun 9, 2023, 4:10 PM
54 points
8 comments, 13 min read