What I meant by ‘is the NAH true for ethics?’ is ‘do sufficiently intelligent agents tend to converge on the same goals?’, which, now that I think about it, is just the negation of the orthogonality thesis.
Ah, got it, that makes sense. The reason I was confused is that NAH applied to ethics would only say that the AI system has a concept of ethics similar to the ones humans have; it wouldn’t claim that the AI system would be motivated by that concept of ethics.