Ok, so basically, we could make an AI that wants to maximize a variable called Utility and that AI might edit its code, but we probably would figure out a way to write it so that it always evaluates the decision on whether to modify its utility function according to its current utility function, so it never would—is that what you’re saying?
Also, maybe I’m conflating unrelated ideas here (I’m not in the AI field), but I think I recall there being a tiling problem of trying to prove that an agent that makes a copy of itself wouldn’t change its utility function. If no VNM-rational agent would want to change its utility function, does that mean the question is just whether the AI would make a mistake when creating its successor?
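To check that I’m picturing the first part right, here’s a toy sketch of the evaluation rule I mean: the agent scores any proposed change to itself (including a change to its own utility function) with its current utility function. Everything here (the `Agent` class, `predict_outcome`, the toy world model) is a made-up illustration, not how a real system would be written:

```python
import random
from dataclasses import dataclass
from typing import Callable

WorldState = dict  # toy stand-in for "a state of the world"


@dataclass
class Agent:
    # The agent's current goal, used for ALL evaluations, including decisions
    # about whether to replace this very function.
    utility: Callable[[WorldState], float]

    def predict_outcome(self, modification: Callable[["Agent"], "Agent"]) -> WorldState:
        """Toy world model: predict the state the (possibly modified) agent would bring about."""
        successor = modification(self)
        # A successor maximizing some utility steers toward whatever scores highly under it.
        candidates = [{"paperclips": random.random(), "staples": random.random()}
                      for _ in range(100)]
        return max(candidates, key=successor.utility)

    def should_adopt(self, modification: Callable[["Agent"], "Agent"]) -> bool:
        """Judge the modification with the CURRENT utility function, not the proposed one."""
        status_quo = self.predict_outcome(lambda a: a)   # keep the current goal
        changed = self.predict_outcome(modification)     # adopt the new goal
        return self.utility(changed) > self.utility(status_quo)


paperclip_agent = Agent(utility=lambda s: s["paperclips"])
swap_goal = lambda a: Agent(utility=lambda s: s["staples"])

# Almost always False: a world optimized for staples scores poorly on paperclips,
# so the agent, judging by its current utility, declines to change its goal.
print(paperclip_agent.should_adopt(swap_goal))
```

The point I’m trying to capture is that `should_adopt` only ever consults `self.utility` to make the judgment; the candidate utility function only shows up in predicting what the modified agent would do.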
so basically, we could make an AI that wants to maximize a variable called Utility
Oh, maybe this is the confusion. It’s not a variable called Utility. It’s the actual true goal of the agent. We call it “utility” when analyzing decisions, and VNM-rational agents act as if they have a utility function over states of the world, but it doesn’t have to be external or programmable.
I’d taken your pseudocode as a shorthand for “design the rational agent such that what it wants is …”. It’s not literally a variable, nor a simple piece of code that non-simple code could change.
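To put your first paragraph in those terms: the choice between keeping U and adopting some other U′ is itself just an expected-utility comparison made with the current U. Roughly (a sketch, with s ranging over the world states the agent’s model predicts):

```latex
% Sketch: the decision to keep or replace the utility function is itself
% evaluated under the current utility function U.
\[
  \text{keep } U
  \quad\Longleftrightarrow\quad
  \mathbb{E}\big[\,U(s)\;\big|\;\text{successor maximizes } U\,\big]
  \;\ge\;
  \mathbb{E}\big[\,U(s)\;\big|\;\text{successor maximizes } U'\,\big]
\]
```

A U′-maximizer generally steers toward states that score lower under U, so the right-hand side loses and the agent prefers to leave its utility function alone, which is the “so it never would” part.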