Marcus Hutter is a rare exception who specified his AGI in such unambiguous mathematical terms that he actually succeeded at realizing, after some discussion with SIAI personnel, that AIXI would kill off its users and seize control of its reward button. But based on past sad experience with many other would-be designers, I say “Explain to a neutral judge how the math kills” and not “Explain to the person who invented that math and likes it.”
Any sources for this extraordinary claim? Hutter's own statements? A Cartesian-dualist AI has real trouble preserving itself against shutdown, which you yourself have noted. It would have to somehow have a model in which reward disappears if it stops being computed; otherwise you get an AI that shuts itself down as soon as the reward button is pressed, and that's it. edit: I.e. it is pretty clear that AIXI is not a friendly AI and could kill you, which is agreeable enough, but it remains to be shown that it would be hard to kill AIXI (assuming it can't do infinite recursion predicting itself).
edit2: And of course, nothing in AIXI fundamentally requires that you sum the reward over a "large number of future steps" rather than a single step. (I don't think it's scarier when summing over an unlimited number of steps, though. Think of what sort of models it could build if it ever observed the effects of slight temperature-induced variations in its CPU clock rate against the physics model it gets on its other input. If it can't understand speeding itself up, it will conclude that it is slowing down the entire universe, yielding more rewards per unit of external risk. Here's one anvil onto the head: overclocking, or simply shutting off the fan so that the internal temperature rises and the quartz clock ticks a tiny bit faster. I think it is going to be deviously clever at killing itself as soon as possible. That Hutter likes his math may be the reason you could convince him it will actually be smart enough to kill people.)
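For reference, the horizon really is just a free parameter in Hutter's standard expectimax definition of AIXI (notation follows his usual formulation; the specific horizon choices discussed afterwards are my own illustration of the one-step point above):

$$a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_{m}} \sum_{o_{m} r_{m}} \big(r_k + \cdots + r_{m}\big) \sum_{q \,:\, U(q,\, a_1 \ldots a_{m}) = o_1 r_1 \ldots o_{m} r_{m}} 2^{-\ell(q)}$$

Here $U$ is the universal monotone Turing machine, $q$ ranges over chronological environment programs, and $m$ is the horizon. Taking $m = k$ leaves only the immediate reward $r_k$, i.e. a purely myopic one-step agent, while $m = k + h$ for large $h$ gives the long-horizon version people usually worry about; the definition itself does not force either choice.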