Planned summary for the Alignment Newsletter:

When making the case for work on AI x-risk to other ML researchers, what should we focus on? This post suggests arguing for three core claims:
1. Due to Goodhart’s law, instrumental goals, and safety-performance trade-offs, the development of advanced AI non-trivially increases the risk of human extinction.
2. To mitigate this x-risk, we need to know how to build safe systems, know that we know how to build safe systems, and prevent people from building unsafe systems.
3. So, we should mitigate AI x-risk, as it is impactful, neglected, and challenging but tractable.
Planned opinion:
This is a nice, concise case to make, but I think the bulk of the work is in splitting the first claim into subclaims; that is the part that is usually the sticking point.