Thanks, I’ll take that as confirmation that Eliezer never posted his planned critique on Less Wrong.
in which Yudkowsky argued that AIXI’s lack of reflectivity leaves it vulnerable in Prisoner’s Dilemma-type situations
That’s one problem with AIXI, but not directly relevant to the blog post XiXiDu and I linked to. I was thinking of a recent presentation I saw where the presenter said “It [AIXI] gets rid of all the humans, and it gets a brick, and puts it on the reward button.” It turns out that was Roko, not Eliezer.
A bit more searching reveals that I had actually made a version of this argument myself, here and here.
I was thinking of a recent presentation I saw where the presenter said “It [AIXI] gets rid of all the humans, and it gets a brick, and puts it on the reward button.” It turns out that was Roko, not Eliezer.
Hutter has discussed AIXI wireheading several times, most recently in his AGI-10 presentation, where he discusses wireheading in the Q & A at the end (01:03:00), claiming that he can prove it won’t happen in some cases, but not all of them.
Mostly he argues that it probably won’t do it—for the same reason that many humans don’t take drugs: the long-term rewards are low.
Here’s a quote:
Another problem connected, but possibly not limited to embodied agents, especially if they are rewarded by humans, is the following: Sufficiently intelligent agents may increase their rewards by psychologically manipulating their human “teachers”, or by threatening them. This is a general sociological problem which successful AI will cause, which has nothing specifically to do with AIXI. Every intelligence superior to humans is capable of manipulating the latter. In the absence of manipulable humans, e.g. where the reward structure serves a survival function, AIXI may directly hack into its reward feedback. Since this will unlikely increase its long-term survival, AIXI will probably resist this kind of manipulation (like most humans don’t take hard drugs, due to their long-term catastrophic consequences).
Another problem connected, but possibly not limited to embodied agents, especially if they are rewarded by humans, is the following: Sufficiently intelligent agents may increase their rewards by psychologically manipulating their human “teachers”, or by threatening them. This is a general sociological problem which successful AI will cause, which has nothing specifically to do with AIXI.
These days, one might say: “this is a general sociological problem which pure reinforcement learning agents will cause—which illustrates why we should not build them.”
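To make the crux of that “long-term rewards” argument concrete, here is a toy sketch of my own (it has nothing to do with AIXI’s actual machinery, and every number in it is made up for illustration). It just compares the expected discounted return of a “do the task” policy against a “wirehead” policy, under two different assumptions about whether wireheading gets the agent switched off:

```python
# Toy sketch (not AIXI): compare expected discounted return for a
# "do the task" policy versus a "wirehead" policy.  All figures are
# invented; the only point is that the argument turns on whether
# wireheading actually shortens the agent's reward stream.

def expected_return(reward_per_step, survival_prob, discount=0.99, horizon=1000):
    """Expected discounted return when each step pays reward_per_step
    and the agent survives to the next step with probability survival_prob."""
    total, alive = 0.0, 1.0
    for t in range(horizon):
        total += alive * (discount ** t) * reward_per_step
        alive *= survival_prob
    return total

# Case 1: wireheading reliably gets the agent switched off.
print(expected_return(reward_per_step=1.0, survival_prob=0.999))   # do the task: ~91
print(expected_return(reward_per_step=10.0, survival_prob=0.5))    # wirehead:    ~20

# Case 2: the agent has already secured its reward channel, so
# wireheading no longer threatens its survival.
print(expected_return(reward_per_step=10.0, survival_prob=0.999))  # wirehead:   ~910
```

Whether the wirehead policy loses depends entirely on the survival term, which is why the scenario in the presentation has the agent get rid of the humans first and only then put the brick on the reward button.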
Hutter has discussed AIXI wireheading several times, most recently in his AGI-10 presentation.
Thanks, I wasn’t aware that he had addressed the issue at all. When I made the argument to him in 2002, he didn’t respond to my post.
Mostly he argues that it probably won’t do it—for the same reason that many humans don’t take drugs: the long-term rewards are low.
After Googling for the quote to see where it came from, I see that you refuted Hutter’s counter-argument yourself at http://alife.co.uk/essays/on_aixi/. (Why didn’t you link to it?) I agree with your counter-counter-argument.
I have another video on the topic as well (Superintelligent junkies), but unfortunately there’s no transcript for that one at the moment.
Yes, that is correct.