If your resources are limited, you cannot pursue certain goals. If your goal is to compute at least 1000 digits of Chaitin’s constant, sucks to be computable. I think no agent with only a polynomial amount of memory can act on a utility function vulnerable to Pascal’s Mugging.
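For reference, the standard definition (assuming a prefix-free universal machine $U$): Chaitin’s constant is the halting probability

$$\Omega_U \;=\; \sum_{p \,:\, U(p)\ \text{halts}} 2^{-|p|},$$

and knowing its first $n$ bits is enough to decide the halting problem for every program of length at most $n$, which is why no computable agent, whatever its resources, can produce 1000 correct digits of it.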
This raises a general issue of how to distinguish an agent that wants X but fails to get it from one that wants to avoid X.
An agent’s purpose is, in principle, quite easy to detect. That is, there are no issues of philosophy, only of practicality. Or to put that another way, it is no longer philosophy, but science, which is what philosophy that works is called.
Here is a program that can read your mind and tell you your purpose!
FWIW, I tried the program. So far it’s batting 0⁄3.
I think it’s not very well tuned. I’ve seen another version of the demo that was very quick to spot which perception the user was controlling. One reason is that this version tries to make it difficult for a human onlooker to see at once which of the cartoon heads you’re controlling, by keeping the general variability of the motion of each one the same. It may take 10 or 20 seconds for Mr. Burns to show up. And of course, you have to play your part in the demo as well as you can; the point of it is what happens when you do.
Nice demonstration.
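I don’t know how that demo is implemented internally, so treat the following as a sketch of the detection idea rather than its actual code (the names and the lowest-correlation rule are my own assumptions): every cursor responds to the same mouse input plus its own random disturbance, and the controlled cursor is the one whose motion has stopped tracking its disturbance, because the user’s corrections cancel it out.

```python
import numpy as np

def guess_controlled_cursor(positions, disturbances):
    """Guess which cursor the user is controlling.

    Each cursor's position is the shared mouse input plus that cursor's
    own independent disturbance.  An uncontrolled cursor tracks its
    disturbance; for the controlled one, the user's corrections cancel
    the disturbance, so the position/disturbance correlation collapses.
    Return the index of the cursor with the smallest |correlation|.
    """
    scores = [abs(np.corrcoef(p, d)[0, 1])
              for p, d in zip(positions, disturbances)]
    return int(np.argmin(scores))

# Toy run: three cursors, white-noise disturbances (the real demo
# presumably uses smooth ones), and a user who imperfectly controls
# cursor 2.
rng = np.random.default_rng(0)
disturbances = [rng.normal(size=600) for _ in range(3)]
mouse = -disturbances[2] + rng.normal(scale=1.0, size=600)  # noisy cancellation
positions = [mouse + d for d in disturbances]
print(guess_controlled_cursor(positions, disturbances))     # prints 2
```

Even if every cursor is given the same overall variability, as described above, a correlation test like this still separates them; it just needs enough samples, which may be part of why it takes a while for Mr. Burns to show up.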
I think the correct answer is going to separate different notions of ‘goal’ (I think Aristotle might have done this; someone more erudite than I is welcome to pull that in).
One possible notion is the ‘design’ goal: in the case of a man-made machine, the designer’s intent; in the case of a standard machine learner, the training function; in the case of a biological entity, reproductive fitness. There’s also a sense in which the behavior itself can be thought of as the goal; that is, an entity’s goal is to produce the outputs that it in fact produces.
There can also be internal structures that we might call ‘deliberate goals’; this is what human self-help materials tell you to set. I’m not sure if there’s a good general definition of this that’s not parochial to human intelligence.
I’m not sure if there’s a fourth kind, but I have an inkling that there might be: an approximate goal. If we say “Intelligence A maximizes function X”, we can quantify how much simpler this is than the true description of A and how much error it introduces into our predictions. If the gain in simplicity is large and the added error is small, it might make sense to call X an approximate goal of A.
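One hedged way to make that precise (my own formalization, nothing standard): let $\pi_A$ be the agent’s actual policy and $\pi_X$ the policy of an ideal $X$-maximizer, and call $X$ an approximate goal of $A$ when

$$K(X) \ll K(\pi_A) \qquad \text{and} \qquad \Pr_{s \sim \mathcal{D}}\!\big[\pi_X(s) \neq \pi_A(s)\big] \le \varepsilon,$$

i.e. the model “A maximizes X” is much shorter to write down than A’s full description, yet mispredicts A’s behaviour on only an $\varepsilon$ fraction of situations drawn from some reference distribution $\mathcal{D}$. The choice of $\mathcal{D}$, of the description-length measure $K$, and of the tolerance $\varepsilon$ are all left open here.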