RSS

Geoffrey Irving

Karma: 483

Chief Scientist at the UK AI Safety Institute (AISI). Previously, DeepMind, OpenAI, Google Brain, etc.

Prospects for Align­ment Au­toma­tion: In­ter­pretabil­ity Case Study

Mar 21, 2025, 2:05 PM
30 points
4 comments8 min readLW link

A sketch of an AI con­trol safety case

Jan 30, 2025, 5:28 PM
57 points
0 comments5 min readLW link

Elic­it­ing bad contexts

Jan 24, 2025, 10:39 AM
31 points
8 comments3 min readLW link

Au­toma­tion collapse

Oct 21, 2024, 2:50 PM
72 points
9 comments7 min readLW link

De­bate, Or­a­cles, and Obfus­cated Arguments

Jun 20, 2024, 11:14 PM
44 points
4 comments21 min readLW link