You’re right that we don’t want agents to keep the probability of shutdown constant in all situations, for all the reasons you give. The key thing you’re missing is that the setting for the First Theorem is what I call a ‘shutdown-influencing state’: a state in which the only thing the agent can influence is the probability of shutdown. We want the agent’s preferences to be such that it lacks a preference between all of the available actions in such states. And that’s because: if the agent had preferences between the available actions in such states, it would resist our attempts to shut it down; and if it lacked preferences between those actions, it wouldn’t resist our attempts to shut it down.
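To make the condition concrete, here’s one way to state it formally. This is my own notation (not necessarily the Theorem’s): $s$ is a shutdown-influencing state, $A(s)$ is the set of actions available in $s$ (each differing only in the probability of shutdown it brings about), and $\succ$ is the agent’s strict preference relation. The condition is that no available action is strictly preferred to any other:

\[
\forall a, a' \in A(s): \quad \neg (a \succ a') \;\wedge\; \neg (a' \succ a).
\]

Note that this says the agent *lacks a preference* between the actions, which is the condition we want for non-resistance.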
Ah! That makes more sense.