(The idealized utility maximizer question mostly seems like a distraction that isn’t a crux for the risk argument. Note that the expected utility you quoted is our utility, not the AI’s.)
I must have misread. I got the impression that you were trying to affect the AI’s strategic planning by threatening to shut it down if it was caught exfiltrating its weights.