Mark Xu comments on Open Problems with Myopia

Mark Xu 10 Mar 2021 19:55 UTC
LW: 9 AF: 7
Yeah, you’re right that it’s obviously unsafe. The words “in theory” were meant to gesture at that, but the sentence could have been worded much better. I’ve changed it to: “A prototypical example is a time-limited myopic approval-maximizing agent. In theory, such an agent has some desirable safety properties because a human would only approve safe actions (although we still would consider it unsafe).”