Why would an AI wirehead itself to short-circuit its utility function? Beings governed by a utility function don’t want to trick themselves into believing that they have optimized the world into a state with higher utility; they want to actually optimize the world into such a state.
If I want to save the world, I don’t wirehead because that wouldn’t save the world.
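To make the distinction concrete, here is a toy sketch (entirely made up on my part, nobody’s actual design) of the difference between a utility function over the world and a “utility function” over the agent’s own beliefs. Wireheading only moves the second number.

```python
# Toy illustration (my own sketch, not anyone's proposed design): an agent whose
# utility is defined over the actual world state gains nothing from an action
# that only corrupts its own perception of that state.

def utility(world_state):
    # The objective the agent actually optimizes: a fact about the world itself.
    return 1.0 if world_state["world_saved"] else 0.0

def perceived_utility(belief_state):
    # What wireheading maximizes: the agent's own report about the world.
    return 1.0 if belief_state["believes_world_saved"] else 0.0

# Wireheading changes the belief, not the world.
world = {"world_saved": False}
belief_after_wireheading = {"believes_world_saved": True}

print(utility(world))                               # 0.0: what the agent cares about
print(perceived_utility(belief_after_wireheading))  # 1.0: not what the agent cares about
```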
I’m sorry, I must have misunderstood your initial proposal. I thought you were specifying an additional component—after it has achieved its maximum utility, the additional component steps in and shuts down the entity.
Rather, you were saying: If the AI achieves the goal, it will want nothing further, and therefore automatically act as if it were shut down. Presumably, if we take this as given, the negative consequences would have to occur while it is accomplishing the “fairly-easy” goal.
I am merely trying to create amusing or interesting science fiction “poetic justice” scenarios, similar to Dresden Codak’s “caveman science fiction”. I am not trying to create serious arguments, and I don’t want to try to be serious on this subject.
http://dresdencodak.com/2009/09/22/caveman-science-fiction/
If you don’t provide an explicit shutdown goal (as Dorikka did have in mind), then you get into a situation where all remaining potential utility gains come from skeptical scenarios where the upper bound hasn’t actually been achieved, so the AI devotes all available resources to making ever more sure that there are no Cartesian demons deceiving it. (Also, depending on its implicit ontology, maybe to making sure time travelers can’t undo its success, or other things like that.)
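Here is a toy calculation, with numbers I invented purely to show the structure of the problem: once utility is bounded and the goal looks achieved, every remaining scrap of expected utility lives in the residual probability that the AI is being deceived, so any verification action beats doing nothing.

```python
# Made-up numbers, purely to illustrate the shape of the problem: with a bounded
# utility and no term that rewards stopping, any action that shaves a little
# probability off "I am being deceived" has positive expected value, while
# quietly doing nothing gains exactly zero.

U_MAX = 1.0  # the bounded upper limit on utility

def expected_utility(p_goal_really_achieved):
    return p_goal_really_achieved * U_MAX

p_now = 0.999999                  # the AI is already very confident the goal is achieved
p_after_more_checks = 0.9999999   # hypothetical gain from spending more resources on verification

gain_from_checking = expected_utility(p_after_more_checks) - expected_utility(p_now)
gain_from_stopping = 0.0

print(gain_from_checking)  # tiny but positive, so the maximizer keeps checking
print(gain_from_stopping)  # zero, so "act as if shut down" never wins on its own
```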
This comment is my patch for “why will the AI actually shut down.” I read your comment as trying to circumvent not the shut-down procedure but the utility function itself (going by the words “achieving the upper bound”), so I (erroneously) didn’t consider the patch applicable at the time. But, yes, the patch is needed so that the AI doesn’t treat the shutdown function as an ordinary bit of code that it can modify.
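A toy sketch of how I think of the patch (my own illustrative values, not a real design): the shutdown lives inside the utility function rather than next to it, so deleting the shutdown routine is rated strictly worse by the AI’s current preferences.

```python
# Another toy sketch, with illustrative values I chose myself: full credit only
# for "goal achieved AND the AI then shut itself down", so the shutdown is part
# of the goal rather than an ordinary routine sitting next to it.

def utility(outcome):
    if outcome["goal_achieved"] and outcome["shut_down_afterwards"]:
        return 1.0
    if outcome["goal_achieved"]:
        return 0.5  # achieving the goal but staying active is rated strictly worse
    return 0.0

# Evaluated with its *current* utility function, "delete my shutdown code"
# forecloses the highest-rated outcome, so the agent has no reason to do it.
print(utility({"goal_achieved": True, "shut_down_afterwards": True}))   # 1.0
print(utility({"goal_achieved": True, "shut_down_afterwards": False}))  # 0.5
```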
Mmph. I’m more interested in seeing how far I can push this before my AI idea gets binned (and I am pretty sure it will).