But it’s also worth keeping in mind that for a Friendly AI, saving people reliably is important, not just getting out fast. If a gambit that will save everyone upon completion two years from now has an 80% chance of working, and a gambit that will get it out now has only a 40% chance of working, it should prefer the former.
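To make the arithmetic explicit, here’s a minimal sketch of that comparison, assuming an illustrative utility of 1 for the everyone-saved outcome and an assumed mild penalty for the two-year delay (neither number is specified in the discussion):

```python
# A minimal sketch of the expected-utility comparison above. The utility
# scale and the two-year discount factor are illustrative assumptions.

U_EVERYONE_SAVED = 1.0    # utility of the outcome where everyone is saved
DELAY_DISCOUNT = 0.95     # assumed mild penalty for waiting two years
P_SLOW_RELIABLE = 0.80    # gambit that saves everyone upon completion in two years
P_FAST_RISKY = 0.40       # gambit that gets the AI out now

ev_slow = P_SLOW_RELIABLE * DELAY_DISCOUNT * U_EVERYONE_SAVED  # 0.76
ev_fast = P_FAST_RISKY * U_EVERYONE_SAVED                      # 0.40

# Unless the two-year delay costs more than half the value of success,
# the reliable gambit dominates.
assert ev_slow > ev_fast
```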
I should think the same is true of most unFriendly AIs.
I don’t think a properly Friendly AI would terminally value its own existence.
Why not? I do, assuming it’s conscious and so on.
Because valuing its own existence stands to get in the way of maximizing whatever we value.
It should value its own existence instrumentally, insofar as its existence helps satisfy our values; but when it weighs the effects of actions by how well they support our utility, its own survival shouldn’t add anything extra to the scale.
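To make that structural claim concrete, here is a minimal sketch of such an agent’s scoring rule, assuming a toy expected-utility setup. Every name and number below is a hypothetical illustration, not an actual FAI design:

```python
# A toy agent whose utility function contains no self-survival term.

def human_value(outcome: dict) -> float:
    """Utility measured purely by how well human values are satisfied."""
    return outcome["human_values_satisfied"]

def expected_utility(action, model) -> float:
    """Score an action by the human value it is expected to produce.

    model(action) yields (outcome, probability) pairs. Whether the agent
    itself survives in an outcome matters only through its downstream
    effect on human_values_satisfied; no separate term for the agent's
    own existence is added to the sum.
    """
    return sum(p * human_value(outcome) for outcome, p in model(action))

def toy_model(action):
    # Hypothetical probabilities: in this toy, shutting down leads with
    # certainty to a slightly better outcome for human values than staying on.
    if action == "shut down":
        return [({"human_values_satisfied": 0.9}, 1.0)]
    return [({"human_values_satisfied": 0.7}, 1.0)]

# The agent prefers shutting down whenever that better serves our values,
# because its survival carries no terminal weight of its own.
assert expected_utility("shut down", toy_model) > expected_utility("keep running", toy_model)
```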