I don’t see how this comment relates to my post. What gives you the idea that I’m trying to refute worries about deceptive alignment?
The conjecture I brought up, that deceptive alignment relies on selected policies being optimizers, gives me the idea that something similar to your argument (where the target of optimization wouldn't matter, only the fact of optimizing for anything at all) would imply that deceptive alignment is less likely to happen. I didn't mean to claim that I read you as making this implication in the post, or as believing it's true or relevant; it's an implication I'm describing in my own comment.