You’re right that we don’t want agents to keep the probability of shutdown constant in all situations, for all the reasons you give. The key thing you’re missing is that the setting for the First Theorem is what I call a ‘shutdown-influencing state’: a state in which the only thing the agent can influence is the probability of shutdown. We want the agent’s preferences to be such that it lacks a preference between all of the available actions in such states. And that’s because: if the agent had preferences between the available actions in such states, it would resist our attempts to shut it down; and if it lacked preferences between those actions, it wouldn’t resist our attempts to shut it down.
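To make the condition concrete, here’s one way to state it formally. This is my own notation (not necessarily the Theorem’s): $s$ is a shutdown-influencing state, $A(s)$ is the set of actions available in $s$ (each differing only in the probability of shutdown it brings about), and $\succ$ is the agent’s strict preference relation. The condition is that no available action is strictly preferred to any other:

\[
\forall a, a' \in A(s): \quad \neg (a \succ a') \;\wedge\; \neg (a' \succ a).
\]

Note that this says the agent *lacks a preference* between the actions, which is the condition we want for non-resistance.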
Ah! That makes more sense.