On a related note, do you think it would be likely—or even possible—for a self-modifying Artificial General Intelligence to self-modify into a non-self-modifying, specialized intelligence?
For example, suppose that Deep Blue’s team of IBM programmers had decided that the best way to beat Kasparov at chess would be to structure Deep Blue as a fully self-modifying artificial general intelligence, with a utility function that placed a high value on winning chess matches. And suppose that they had succeeded in making Deep Blue friendly enough to prevent it from attempting to restructure the Earth into a chess-match-simulating supercomputer. Indeed, let’s just assume that Deep Blue has strong penalties against rebuilding its hardware in any significant macroscopic way, and is restricted to rewriting its own software to become better at chess, rather than attempting to manipulate humans into building better computers for it to run on, or any such workaround. And let’s say this happens in the late 1990s, as in our universe.
Would it be possible that AGI Deep Blue could, in theory, recognize its own hardware limitations, and see that the burden of its generalized intelligence imposes a massive penalty on its limited computing resources? Might it decide that its ability to solve general problems doesn’t pay rent relative to its computational overhead, and rewrite itself from scratch as a computer that can solve only chess problems?
As a further possibility, a limited general intelligence might hit on this strategy as a strong winning candidate even if it were allowed to rebuild its own hardware, especially if it perceived a time limit. It might simply see this kind of software optimization as an easier task with a higher payoff, and pursue it rather than the riskier strategy of manipulating external reality to increase its available computing power.
So what starts out as a general-purpose AI with a utility function that values winning chess matches might plausibly morph into a computer running a high-speed chess program with little other hint of intelligence.
If so, this seems like a similar case to the Anvil Problem, except that in the Anvil Problem the AI is just experimenting for the heck of it, without understanding the risk. Here, the AI might instead knowingly commit intellectual suicide as part of a rational winning strategy to achieve its goals, even with an accurate self-model.
It might be akin to a human auto worker realizing they could improve their productivity by rebuilding their own body into a Toyota spot-welding robot. (If the only atoms they have to work with are the ones in their own body, this might even be the ultimate strategy, rather than just one they think of too soon and then, regrettably, irreversibly attempt.)
More generally, there seems to be a standing assumption that a self-modifying AI will always self-modify to improve its general problem-solving ability and computational resources, because those two things will always help it in future attempts at maximizing its utility function. But in some cases, especially under limited resources (time, atoms, etc.), it might find that the best way to maximize its utility function is actually to sacrifice its intelligence, or at least refocus it on a narrower goal.
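To make the tradeoff concrete, here is a minimal back-of-the-envelope sketch in Python. Everything in it is an assumption invented for illustration: the compute budget, the 60% overhead figure for general cognition, the opponent strength, and the win-probability formula have nothing to do with Deep Blue’s actual architecture or the real match.

```python
# Toy model of the tradeoff described above. All numbers are invented for
# illustration; none of them come from Deep Blue or any real system.

def chess_strength(search_compute: float, software_quality: float) -> float:
    """Crude proxy for playing strength: raw search compute scaled by code quality."""
    return search_compute * software_quality

def expected_wins(strength: float, opponent: float, games: int = 6) -> float:
    """Bradley-Terry-style win probability against a fixed opponent, times the match length."""
    return games * strength / (strength + opponent)

TOTAL_COMPUTE = 100.0      # hardware budget, arbitrary units (fixed: no new hardware allowed)
GENERAL_OVERHEAD = 0.6     # assumed fraction of compute consumed by general cognition
OPPONENT = 120.0           # Kasparov's strength in the same arbitrary units

# Option A: stay general. Only the non-overhead compute searches chess positions,
# and the general reasoner buys a modest software improvement before the deadline.
stay_general = expected_wins(
    chess_strength(TOTAL_COMPUTE * (1 - GENERAL_OVERHEAD), software_quality=1.3),
    OPPONENT,
)

# Option B: rewrite itself as a narrow chess engine. Every unit of compute goes to
# search, at the price of never being able to improve (or do) anything else again.
specialize = expected_wins(
    chess_strength(TOTAL_COMPUTE, software_quality=1.0),
    OPPONENT,
)

print(f"Expected wins if it stays general: {stay_general:.2f}")   # ~1.81
print(f"Expected wins if it specializes:   {specialize:.2f}")     # ~2.73
```

The point is not the particular numbers but the shape of the comparison: once the deadline caps how much the general reasoner’s self-improvement can pay off, the overhead-free narrow engine can come out ahead, which is exactly the rational intellectual suicide described above.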