Guaranteed Safe AI

Tag · Last edit: Aug 9, 2024, 11:22 PM by Ben Goldhaber

AXRP Episode 40 - Jason Gross on Compact Proofs and Interpretability

DanielFilan · Mar 28, 2025, 6:40 PM
23 points
0 comments · 89 min read · LW link

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Gunnar_Zarncke · May 16, 2024, 1:09 PM
51 points
20 comments · 1 min read · LW link
(arxiv.org)

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

Joar Skalse · May 17, 2024, 7:13 PM
67 points
10 comments · 2 min read · LW link

November-December 2024 Progress in Guaranteed Safe AI

Quinn · Jan 22, 2025, 1:20 AM
17 points
0 comments · 4 min read · LW link
(gsai.substack.com)

In response to critiques of Guaranteed Safe AI

Nora_Ammann · Jan 31, 2025, 1:43 AM
44 points
14 comments · 26 min read · LW link

Topological Debate Framework

lunatic_at_large · Jan 16, 2025, 5:19 PM
10 points
5 comments · 9 min read · LW link

Can a Bayesian Oracle Prevent Harm from an Agent? (Bengio et al. 2024)

mattmacdermott · Sep 1, 2024, 7:46 AM
26 points
0 comments · 5 min read · LW link
(yoshuabengio.org)

Davidad’s Provably Safe AI Architecture—ARIA’s Programme Thesis

simeon_c · Feb 1, 2024, 9:30 PM
69 points
17 comments · 1 min read · LW link
(www.aria.org.uk)

Provably Safe AI

PeterMcCluskey · Oct 5, 2023, 10:18 PM
35 points
15 comments · 4 min read · LW link
(bayesianinvestor.com)

Limitations on Formal Verification for AI Safety

Andrew Dickson · Aug 19, 2024, 11:03 PM
134 points
60 comments · 23 min read · LW link

Provably Safe AI: Worldview and Projects

Aug 9, 2024, 11:21 PM
54 points
44 comments · 7 min read · LW link