Rohin Shah comments on Misalignment and misuse: whose values are manifest?

Rohin Shah 13 Nov 2020 20:42 UTC
LW: 8 AF: 7
- misalignment means the bad outcomes were wanted by AI (and not by its human creators), and
- accident means that the bad outcomes were not wanted by those in power but happened anyway due to error.
My impression was that accident just meant “the AI system’s operator didn’t want the bad thing to happen”, which would make accident a superset of misalignment.
Though I agree with the broader point that in realistic scenarios there is usually no single root cause that would allow this sort of categorization.