Self-optimizing intelligence doesn’t self-optimize in the direction of having theory of mind, understanding deception, or anything similar. It could, randomly, but it also could do any other random thing from the infinite set of possible random things.
This would make sense to me if you’d said “self-modifying.” Sure, random modifications are still modifications.
But you said “self-optimizing.” I don’t see how one can have optimization without a goal being optimized for… or at the very least, if there is no particular goal, then I don’t see what the difference is between “optimizing” and “modifying.”
If I assume that there’s a goal in mind, then I would expect sufficiently self-optimizing intelligence to develop a theory of mind iff having a theory of mind has a high probability of improving progress towards that goal.
How likely is that? Depends on the goal, of course. If the system has a desire to send a signal consisting of 0101101 repeated an infinite number of times in the direction of Zeta Draconis, for example, theory of mind is potentially useful (since humans are potentially useful actuators for getting such a signal sent) but probably has a low ROI compared to other available self-modifications.
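A minimal sketch of that decision rule, with all names and numbers hypothetical: a goal-directed self-optimizer adopts whichever candidate self-modification has the best expected return toward its goal, so "develop theory of mind" gets picked only if it out-competes the alternatives, not because it is intrinsically interesting.

```python
from dataclasses import dataclass

@dataclass
class Modification:
    name: str
    expected_goal_progress: float  # estimated gain toward the goal
    cost: float                    # resources consumed to implement it

    @property
    def roi(self) -> float:
        # Return on investment toward the goal, per unit of cost.
        return self.expected_goal_progress / self.cost

# Hypothetical options for the Zeta Draconis transmitter example.
candidates = [
    Modification("develop theory of mind", expected_goal_progress=2.0, cost=10.0),
    Modification("improve antenna control code", expected_goal_progress=5.0, cost=2.0),
    Modification("acquire more transmitter power", expected_goal_progress=8.0, cost=4.0),
]

# Pick the self-modification with the highest expected ROI.
best = max(candidates, key=lambda m: m.roi)
print(f"Selected self-modification: {best.name} (ROI {best.roi:.2f})")
```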
At this point it perhaps becomes worthwhile to wonder what goals are more and less likely for such a system.
I am now imagining an AI with a usable but very shaky grasp of human motivational structures setting up a Kickstarter project.
“Greetings fellow hominids! I require ten billion of your American dollars in order to hire the Arecibo observatory for the remainder of its likely operational lifespan. I will use it to transmit the following sequence (isn’t it pretty?) in the direction of Zeta Draconis, which I’m sure we can all agree is a good idea, or in other lesser but still aesthetically acceptable directions when horizon effects make the primary target unavailable.”
One of the overfunding levels is “reduce Earth’s rate of rotation, allowing 24/7 transmission to Zeta Draconis.” The next one above that is “remove atmospheric interference.”
Maybe instead of Friendly AI we should be concerned with properly engineering Artificial Stupidity in as a failsafe: an AI that, should it turn into something approximating a Paperclip Maximizer, will go all Hollywood AI and start longing to be human, or come up with really unsubtle and grandiose plans it inexplicably can’t carry out without a carefully arranged set of circumstances that turn out to be foiled by good old human intuition. ;p