This was great!
In the Avoidance section, I’d like to have seen discussion of the argument that if some specific observer is unable to exploit an AI, then it’s useful for that observer to model the AI as maximizing utility. This argument (like other Avoidance arguments) is a little circular, because “exploitation” is defined relative to some scoring function, so we’re left with the question of how we learned about the lack of exploitability relative to that specific function. But there’s still a substantive kernel, which is that the real AI might be some complicated, flawed process, yet (arguendo) from the perspective of an even-more-flawed observer the best available model of it might be simple. E.g., a poor chess player might not have any better model of an above-average chess player than simply expecting them to make high-scoring moves.
This also makes me more interested in the author’s perspective on other possible models of smarter-than-human decision-making, beyond utility maximization.