Rohin Shah

Karma: 15,620

Research Scientist at Google DeepMind. Creator of the Alignment Newsletter. http://rohinshah.com/

Google DeepMind: An Approach to Technical AGI Safety and Security

Rohin Shah · Apr 5, 2025, 10:00 PM
73 points
12 comments · 18 min read · LW link
(arxiv.org)

Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)

Mar 26, 2025, 7:07 PM
113 points
15 comments · 29 min read · LW link
(deepmindsafetyresearch.medium.com)