The biggest disconnect is that this post is not a proposal for how to solve corrigibility. I’m just thinking about what corrigibility is/should be, and this seems like a shard of it, but only a shard. I’ll edit the post to better communicate that.
So, your points are good, but they run skew to what I was thinking about while writing the post.