RSS

Geoffrey Irving

Karma: 456

Chief Scientist at the UK AI Safety Institute (AISI). Previously, DeepMind, OpenAI, Google Brain, etc.

A sketch of an AI con­trol safety case

30 Jan 2025 17:28 UTC
58 points
0 comments5 min readLW link

Elic­it­ing bad contexts

24 Jan 2025 10:39 UTC
31 points
8 comments3 min readLW link

Au­toma­tion collapse

21 Oct 2024 14:50 UTC
72 points
9 comments7 min readLW link

De­bate, Or­a­cles, and Obfus­cated Arguments

20 Jun 2024 23:14 UTC
41 points
4 comments21 min readLW link

Does Cir­cuit Anal­y­sis In­ter­pretabil­ity Scale? Ev­i­dence from Mul­ti­ple Choice Ca­pa­bil­ities in Chinchilla

20 Jul 2023 10:50 UTC
44 points
3 comments2 min readLW link
(arxiv.org)