Some that come to mind (note: I work at MIRI):
2016: Open Philanthropy Project, Anonymized Reviews of Three Recent Papers from MIRI’s Agent Foundations Research Agenda (separate reply from Nate Soares, and comments by Eliezer Yudkowsky)
2017: Daniel Dewey, My current thoughts on MIRI’s “highly reliable agent design” work (replies from Nate Soares in the comments)
2018: Richard Ngo, Realism about rationality
2018: Wolfgang Schwarz, On Functional Decision Theory
2019: Will MacAskill, A Critique of Functional Decision Theory (replies from Abram Demski in the comments)
I’d also include arguments of the form ‘we don’t need to solve agent foundations problems, because we can achieve good outcomes from AI via alternative method X, and it’s easier to just do X’. E.g., Paul Christiano’s Abstract Approval-Direction (2015).
Also, some overviews that aren’t trying to argue against agent foundations may still provide useful maps of where people disagree (though I don’t think e.g. Nate would 100% endorse any of these), like:
2016: Jessica Taylor, My current take on the Paul-MIRI disagreement on alignability of messy AI
2017: Jessica Taylor, On motivations for MIRI’s highly reliable agent design research
2020: Issa Rice, Plausible cases for HRAD work, and locating the crux in the “realism about rationality” debate