
AI Safety Mentors and Mentees Program

Last edit: May 10, 2023, 10:55 AM by Magdalena Wache

The AI Safety Mentors and Mentees program aims to facilitate mentoring in AI safety.

Announcing AI safety Mentors and Mentees

Marius Hobbhahn · Nov 23, 2022, 3:21 PM
62 points
7 comments · 10 min read · LW link

Launching Applications for the Global AI Safety Fellowship 2025!

Aditya_SK · Nov 30, 2024, 2:02 PM
11 points
5 comments · 1 min read · LW link

AISC Project: Modelling Trajectories of Language Models

NickyP · Nov 13, 2023, 2:33 PM
27 points
0 comments · 12 min read · LW link

What Discovering Latent Knowledge Did and Did Not Find

Fabien Roger · Mar 13, 2023, 7:29 PM
166 points
17 comments · 11 min read · LW link

The Translucent Thoughts Hypotheses and Their Implications

Fabien Roger · Mar 9, 2023, 4:30 PM
142 points
7 comments · 19 min read · LW link

If Wentworth is right about natural abstractions, it would be bad for alignment

Wuschel Schulz · Dec 8, 2022, 3:19 PM
29 points
5 comments · 4 min read · LW link

[Hebbian Natural Abstractions] Introduction

Nov 21, 2022, 8:34 PM
34 points
3 comments · 4 min read · LW link
(www.snellessen.com)

[Hebbian Natural Abstractions] Mathematical Foundations

Dec 25, 2022, 8:58 PM
15 points
2 comments · 6 min read · LW link
(www.snellessen.com)

How Do Induction Heads Actually Work in Transformers With Finite Capacity?

Fabien Roger · Mar 23, 2023, 9:09 AM
27 points
0 comments · 5 min read · LW link

I made an AI safety fellowship. What I wish I knew.

Ruben Castaing · Jun 8, 2024, 3:23 PM
12 points
0 comments · 2 min read · LW link

The Inter-Agent Facet of AI Alignment

Michael Oesterle · Sep 18, 2022, 8:39 PM
12 points
1 comment · 5 min read · LW link