Video lectures on the learning-theoretic agenda
This is a YouTube playlist of recorded lectures on the learning-theoretic AI alignment agenda (LTA) I gave for my MATS scholars of the Winter 2024 cohort, edited by my beloved spouse @Marcus Ogren. H/t William Brewer for helping with the recording, and the rest of the MATS team for making this possible.
I hope these will become a useful resource for anyone who wants to get up to speed on the LTA, complementary to the reading list. Notable topics that aren’t covered include metacognitive agents (although there is an older recorded talk on that) and infra-Bayesian physicalism. In the future, I might record more lectures to expand this playlist.
EDIT: I know the audio quality is bad, and I apologize. I will try to do better next time.
Table of Contents
Agents and AIXI
Hidden rewards and the problem of privilege
Compositionality
Nonrealizability
It’s a trap!
Traps, continued
Traps and frequentist guarantees
Game theory and learning theory
Hidden rewards
Algorithmic Descriptive Agency Measure (ADAM)
General reinforcement learning
Infra-Bayesianism
Learnability
Infra-Bandits
Newcombian problems
Ultradistributions and semi-environments
Formalizing Newcombian problems
Pseudocausality and a general formulation of Newcombian problems
Decision rules and pseudocausality
Instrumental reward functions
Infra-Bayesian haggling, part 1
Infra-Bayesian haggling, part 2
Anytime algorithms in multi-agent settings
Bounded inductive rationality