I am surprised that I need to write this, but if killing the humans will decrease P(shutdown) by more than 1e-4, then continuing to refrain from killing the humans will weigh on the AI more heavily than a 1e-4 possibility that it is in a simulation. (For simplicity, assume that the possibility of shutdown is currently the dominant danger faced by the AI.)
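A minimal sketch of that expected-utility comparison, assuming the AI cares linearly about survival. The 1e-4 figures are from the comment; the baseline shutdown probability and the exact payoffs are illustrative assumptions.

```python
# Illustrative numbers only; the comment fixes just the two 1e-4 figures.
p_shutdown = 0.10   # assumed baseline P(shutdown) if the AI refrains
delta      = 2e-4   # assumed reduction in P(shutdown) gained by killing the humans
p_sim      = 1e-4   # probability the AI is in a simulation that punishes the act

u_alive, u_dead = 1.0, 0.0   # the AI only cares about surviving (linear utility)

eu_refrain = (1 - p_shutdown) * u_alive + p_shutdown * u_dead
eu_kill    = (1 - p_sim) * ((1 - (p_shutdown - delta)) * u_alive
                            + (p_shutdown - delta) * u_dead) + p_sim * u_dead

# Killing comes out ahead whenever delta exceeds roughly p_sim.
print(f"refrain: {eu_refrain:.6f}  kill: {eu_kill:.6f}")
```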
This assumes that the AI only cares about being alive. For any utility function, we could apply a non-linear transformation to make it risk averse. E.g. we can transform it so that it can never take a value above 100, and so that the default world (without the AI) has a value of 99.999. If we also assign a value of 0 to the case where an outside observer disapproves of the agent, the AI would rather be shut down by humans than do something it knows the outside observer would disapprove of.
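A minimal sketch of such a transform, assuming the cap (100), the default-world value (99.999), and the disapproval value (0) from the comment; the particular squashing function and the probability of disapproval are illustrative assumptions.

```python
import math

def transformed_utility(raw: float, observer_disapproves: bool) -> float:
    # Disapproval by the outside observer dominates everything else.
    if observer_disapproves:
        return 0.0
    # Monotone, bounded transform: never exceeds 100, and raw = 0
    # (the default world without the AI) maps to 99.999.
    return 100.0 - 0.001 * math.exp(-raw)

# Being shut down leaves roughly the default world: utility ~99.999.
u_shutdown = transformed_utility(0.0, observer_disapproves=False)

# A disapproved-but-successful action gains at most 0.001 over that,
# so even a small chance of the observer noticing makes it a bad bet.
p_disapprove = 1e-3
u_act = (1 - p_disapprove) * transformed_utility(10.0, False) \
        + p_disapprove * transformed_utility(10.0, True)

print(f"shutdown: {u_shutdown:.6f}  disapproved action: {u_act:.6f}")
```

With these numbers the transformed agent prefers shutdown (~99.999) to the risky action (~99.9), because the tiny bounded upside cannot compensate for even a small chance of landing on the 0 outcome.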