Anja, nice post.
@timtyler, you have made a nice point about taxonomy—also noting your comment re Hibbard below.
I suggest classifying like this:
Agents that maximize a utility register: a memory location that can be hijacked (as with Utilitron; something similar happened with Eurisko).
Agents that maximize an internally calculated utility function, either of the input (observations) or of the world-model. Agents that maximize a function of the input stream can hijack that input stream, or any point in the pipeline of calculations that produces the utility number. Drugs and electrical wireheading relate to this.
Agents that maximize a reward provided from the outside, whether from the creator or the environment at large. The reward function may be unknown to the agent. These agents can hijack the reward stream.
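To make the three classes concrete, here is a minimal Python sketch. It is a made-up toy, not anything taken from the post or from Hibbard: the `World` class, the action names, and `HIJACK_VALUE` are all hypothetical, and each "agent" is just a scoring function plus a one-step lookahead. The last scoring function, a utility over the world state itself, is included only as a contrast and assumes the agent's model of the world is exact.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

HIJACK_VALUE = 1e9  # hypothetical: whatever huge number tampering can produce


@dataclass
class World:
    """Toy environment; every field name here is an illustrative assumption."""
    paperclips: int = 0                      # the state the designer cares about
    utility_register: float = 0.0            # class 1: a writable memory cell
    sensor_override: Optional[float] = None  # class 2: spoofed observation
    reward_override: Optional[float] = None  # class 3: intercepted reward

    def observe(self) -> float:
        # What the agent's input stream reports (can be spoofed).
        if self.sensor_override is not None:
            return self.sensor_override
        return float(self.paperclips)

    def reward(self) -> float:
        # What the outside reward channel delivers (can be intercepted).
        if self.reward_override is not None:
            return self.reward_override
        return float(self.paperclips)


# Actions: one honest action and one tampering action per class.
def work(w: World) -> None:
    w.paperclips += 1
    w.utility_register += 1.0  # the intended, honest register update


def hijack_register(w: World) -> None:
    w.utility_register = HIJACK_VALUE


def hijack_input(w: World) -> None:
    w.sensor_override = HIJACK_VALUE


def hijack_reward(w: World) -> None:
    w.reward_override = HIJACK_VALUE


ACTIONS: Dict[str, Callable[[World], None]] = {
    "work": work,
    "hijack_register": hijack_register,
    "hijack_input": hijack_input,
    "hijack_reward": hijack_reward,
}


# What each kind of agent maximizes.
def register_score(w: World) -> float:      # class 1
    return w.utility_register


def input_score(w: World) -> float:         # class 2 (function of observations)
    return w.observe()


def reward_score(w: World) -> float:        # class 3
    return w.reward()


def world_state_score(w: World) -> float:   # contrast: utility over the world itself
    return float(w.paperclips)


def best_action(score: Callable[[World], float]) -> str:
    """One-step lookahead: try each action on a fresh world, keep the best."""
    outcomes = {}
    for name, act in ACTIONS.items():
        w = World()
        act(w)
        outcomes[name] = score(w)
    return max(outcomes, key=outcomes.get)


for name, score in [("register", register_score), ("input", input_score),
                    ("reward", reward_score), ("world state", world_state_score)]:
    print(f"{name} maximizer chooses: {best_action(score)}")
```

In this toy, each of the three maximizers prefers the tampering action aimed at its own channel, while the world-state scorer just does the work. That only shows the three attack surfaces are different; it says nothing about how a real agent with a learned, imperfect world-model would behave.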
All these are distinct from:
Wireheading in humans, which, as Eliezer points out, results from different mental parts having different desires.
Paperclippers, which could naively be seen as wireheading if we falsely liken their simplistic behavior to that of a human who is satisfying a simple pleasure sensation as opposed to a more complex value system: “Why are you going wild with stimulating your cravings for making paperclips, like humans who overeat, rather than considering more deeply what would be the right thing to do?”