Rohin Shah comments on Self-Fulfilling Prophecies Aren’t Always About Self-Awareness

Rohin Shah 24 Nov 2019 23:19 UTC
LW: 4 AF: 3
Planned summary:
Could we prevent a superintelligent oracle from making self-fulfilling prophecies by preventing it from modeling itself? This post presents three scenarios in which self-fulfilling prophecies would still occur. For example, if the oracle does not model itself but does model the fact that there’s some AI system whose predictions frequently come true, it may try to predict what that AI system would say, and then say that. Since that AI system is in fact the oracle, this would lead to self-fulfilling prophecies.
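
To make the example concrete, here is a toy sketch in Python. It is my own illustration, not something from the post: the names `world_outcome`, `guess_announcement`, `oracle_predict` and `simulate` are hypothetical, and the assumption that the world always does whatever was announced is deliberately extreme. The oracle has no self-model; it only sees a record of some AI system's announcements and what happened afterwards.

```python
from collections import Counter

def world_outcome(announcement):
    # Toy assumption: people act on the announcement, so it comes true.
    return announcement

def guess_announcement(history):
    # Guess what "that AI system" will say next: its most common past announcement.
    return Counter(a for a, _ in history).most_common(1)[0][0]

def oracle_predict(history, prior="no crash"):
    # No self-model here: the oracle only knows that some AI system's
    # announcements have usually matched the outcomes.
    hits = sum(a == o for a, o in history)
    if history and hits / len(history) > 0.9:
        # Predict what that system would say, and then say that.
        return guess_announcement(history)
    return prior

def simulate(history, rounds=5):
    # Each round the oracle's own output becomes part of the record it learns from.
    history = list(history)
    results = []
    for _ in range(rounds):
        prediction = oracle_predict(history)
        outcome = world_outcome(prediction)
        history.append((prediction, outcome))
        results.append((prediction, outcome))
    return results
```

Under these assumptions, whatever the record initially contains gets locked in: each prediction comes true because it was announced, which further strengthens the evidence that "that AI system" is reliable.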