Yitz comments on Positive outcomes under an unaligned AGI takeover

Yitz 13 May 2022 5:35 UTC
4 points
The goal here (under the model of solving alignment that I’m assuming for the purposes of this post) is effectively to make cooperating with researchers the “path of least resistance” to successfully escaping the box. If lying to researchers even slightly increases the chances that they’ll catch you and pull the plug, then you (the boxed AI) will have strong motivation to aim for honesty.
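
To make the incentive argument concrete, here is a toy expected-payoff sketch. Everything in it is an assumption for illustration only: the function names, the probabilities, and the payoff values (1 for escaping, 0 for being shut down) are made up, and it assumes by default that lying gives the AI no extra payoff beyond escaping.

```python
def expected_payoff(p_caught: float,
                    escape_value: float = 1.0,
                    shutdown_value: float = 0.0) -> float:
    """Expected payoff for a boxed AI (hypothetical toy model).

    With probability p_caught the researchers pull the plug;
    otherwise the AI eventually escapes the box.
    """
    return p_caught * shutdown_value + (1.0 - p_caught) * escape_value


def prefers_honesty(p_caught_honest: float,
                    p_caught_lying: float,
                    lying_bonus: float = 0.0) -> bool:
    """True if honesty has the strictly higher expected payoff.

    lying_bonus is a hypothetical extra payoff the AI might get from
    escaping via deception; it defaults to 0 (an assumption).
    """
    honest = expected_payoff(p_caught_honest)
    lying = expected_payoff(p_caught_lying, escape_value=1.0 + lying_bonus)
    return honest > lying


# Illustrative numbers only: lying raises the chance of being caught
# from 10% to 11%.
print(prefers_honesty(p_caught_honest=0.10, p_caught_lying=0.11))
```

In this toy model, any increase in the chance of being caught makes honesty the better option as long as lying offers no compensating gain; a large enough `lying_bonus` would flip the comparison, which is why the setup needs cooperation to be the easiest route out.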