Garrett Baker

Independent alignment researcher

I have signed no contracts or agreements whose existence I cannot mention.

What and Why: Developmental Interpretability of Reinforcement Learning

Garrett Baker 9 Jul 2024 14:09 UTC
On Complexity Science

Garrett Baker 5 Apr 2024 2:24 UTC
So You Created a Sociopath - New Book Announcement!

Garrett Baker 1 Apr 2024 18:02 UTC
Announcing Suffering For Good

Garrett Baker 1 Apr 2024 17:08 UTC
Neuroscience and Alignment

Garrett Baker 18 Mar 2024 21:09 UTC
Epoch wise critical periods, and singular learning theory

Garrett Baker 14 Dec 2023 20:55 UTC
A bet on critical periods in neural networks

6 Nov 2023 23:21 UTC
When and why should you use the Kelly criterion?

5 Nov 2023 23:26 UTC
Singular learning theory and bridging from ML to brain emulations

1 Nov 2023 21:31 UTC
My hopes for alignment: Singular learning theory and whole brain emulation

Garrett Baker 25 Oct 2023 18:31 UTC
AI presidents discuss AI alignment agendas

9 Sep 2023 18:55 UTC
Activation additions in a small residual network

Garrett Baker 22 May 2023 20:28 UTC
Collective Identity

18 May 2023 9:00 UTC
Activation additions in a simple MNIST network

Garrett Baker 18 May 2023 2:49 UTC
Value drift threat models

Garrett Baker 12 May 2023 23:03 UTC
[Question] What constraints does deep learning place on alignment plans?

Garrett Baker 3 May 2023 20:40 UTC
Pessimistic Shard Theory

Garrett Baker 25 Jan 2023 0:59 UTC
Performing an SVD on a time-series matrix of gradient updates on an MNIST network produces 92.5 singular values

Garrett Baker 21 Dec 2022 0:44 UTC
Don’t design agents which exploit adversarial inputs

18 Nov 2022 1:48 UTC
A framework and open questions for game theoretic shard modeling

Garrett Baker 21 Oct 2022 21:40 UTC
