The Theoretical Foundations of Reward Learning

In this sequence I provide an overview of the theoretical reward learning research agenda, including its motivating assumptions, several core results, and starting points for contributing to it.

The Theoretical Reward Learning Research Agenda: Introduction and Motivation

Partial Identifiability in Reward Learning

Misspecification in Inverse Reinforcement Learning

STARC: A General Framework For Quantifying Differences Between Reward Functions

Misspecification in Inverse Reinforcement Learning—Part II

Defining and Characterising Reward Hacking

Other Papers About the Theory of Reward Learning

How to Contribute to Theoretical Reward Learning Research