I found the idea of an AI that is not goal-directed very enticing. It seemed the perfect antidote to Omohundro et al. on universal instrumental goals, because those arguments rely on a utility function, something that even human beings arguably don’t have. A utility function is a crisp mathematical idealization of the concept of goal-direction. (I’ll just assert that without argument, and hope it rings true.) If human beings, the sine qua non example of intelligence, don’t exactly have utility functions, might it not be possible to make other forms of intelligence that are even further from goal-directed behavior?
Unfortunately, Paul Christiano has convinced me that I was probably mistaken:
We might try to write a program that doesn’t pursue a goal, and fail.
Issue [2] sounds pretty strange—it’s not the kind of bug most software has. But when you are programming with gradient descent, strange things can happen.