Some that come to mind (note: I work at MIRI):
2016: Open Philanthropy Project, Anonymized Reviews of Three Recent Papers from MIRI’s Agent Foundations Research Agenda (separate reply from Nate Soares, and comments by Eliezer Yudkowsky)
2017: Daniel Dewey, My current thoughts on MIRI’s “highly reliable agent design” work (replies from Nate Soares in the comments)
2018: Richard Ngo, Realism about rationality
2018: Wolfgang Schwarz, On Functional Decision Theory
2019: Will MacAskill, A Critique of Functional Decision Theory (replies from Abram Demski in the comments)
I’d also include arguments of the form ‘we don’t need to solve agent foundations problems, because we can achieve good outcomes from AI via alternative method X, and it’s easier to just do X’. E.g., Paul Christiano’s Abstract Approval-Direction (2015).
Also, some overviews that aren’t trying to argue against agent foundations may still provide useful maps of where people disagree (though I don’t think e.g. Nate would 100% endorse any of these), like:
2016: Jessica Taylor, My current take on the Paul-MIRI disagreement on alignability of messy AI
2017: Jessica Taylor, On motivations for MIRI’s highly reliable agent design research
2020: Issa Rice, Plausible cases for HRAD work, and locating the crux in the “realism about rationality” debate