Did Eliezer ever post his planned critique of AIXI? I’ve seen/heard Eliezer state his position on AIXI several times, but can’t locate a detailed argument.
Just now, I wanted to point the author of http://physicsandcake.wordpress.com/2011/01/22/pavlovs-ai-what-did-it-mean/ (“Even in our deepest theories of machine intelligence, the idea of reward comes up. There is a theoretical model of intelligence called AIXI, developed by Marcus Hutter...”) to it, but I couldn’t.
Perhaps the flaws of AIXI are obvious to most of us here by now, but somebody should probably still write them down...
I’ve seen/heard Eliezer state his position on AIXI several times, but can’t locate a detailed argument.
You may be thinking of a 2003 posting and ensuing discussion on the AGI mailing list, in which Yudkowsky argued that AIXI’s lack of reflectivity leaves it vulnerable in Prisoner’s Dilemma-type situations. Best wishes, the Less Wrong Reference Desk.
Thanks, I’ll take that as confirmation that Eliezer never posted his planned critique on Less Wrong.
Yes, that is correct.
in which Yudkowsky argued that AIXI’s lack of reflectivity leaves it vulnerable in Prisoner’s Dilemma-type situations
That’s one problem with AIXI, but not directly relevant to the blog post XiXiDu and I linked to. I was thinking of a recent presentation I saw where the presenter said “It [AIXI] gets rid of all the humans, and it gets a brick, and puts it on the reward button,” and it turns out that was Roko, not Eliezer.
A bit more searching reveals that I had actually made a version of this argument myself, here and here.
I was thinking of a recent presentation I saw where the presenter said “It [AIXI] gets rid of all the humans, and it gets a brick, and puts it on the reward button,” and it turns out that was Roko, not Eliezer.
Hutter has discussed AIXI wireheading several times, most recently in his AGI-10 presentation, where he discusses wireheading in the Q & A at the end (01:03:00), claiming that he can prove it won’t happen in some cases—but not all of them.
Mostly he argues that it probably won’t do it—for the same reason that many humans don’t take drugs: the long-term rewards are low.
Here’s a quote:
Another problem connected, but possibly not limited to embodied agents, especially if they are rewarded by humans, is the following: Sufficiently intelligent agents may increase their rewards by psychologically manipulating their human “teachers”, or by threatening them. This is a general sociological problem which successful AI will cause, which has nothing specifically to do with AIXI. Every intelligence superior to humans is capable of manipulating the latter. In the absence of manipulable humans, e.g. where the reward structure serves a survival function, AIXI may directly hack into its reward feedback. Since this will unlikely increase its long-term survival, AIXI will probably resist this kind of manipulation (like most humans don’t take hard drugs, due to their long-term catastrophic consequences).
Marcus Hutter once wrote:
Another problem connected, but possibly not limited to embodied agents, especially if they are rewarded by humans, is the following: Sufficiently intelligent agents may increase their rewards by psychologically manipulating their human “teachers”, or by threatening them. This is a general sociological problem which successful AI will cause, which has nothing specifically to do with AIXI.
These days, one might say: “this is a general sociological problem which pure reinforcement learning agents will cause—which illustrates why we should not build them.”
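To make the disagreement concrete, here is a toy back-of-the-envelope calculation in Python (my own sketch, not AIXI and not anyone’s official model; the reward levels, survival probabilities, discount factor, and horizon are all invented for illustration). A pure reward-signal maximizer compares the expected discounted reward of doing its task against seizing the reward channel. Hutter’s counterargument amounts to saying the second option is penalized by its effect on long-term survival; the “gets rid of all the humans first” scenario is the case where that penalty has been removed.

```python
# Toy comparison of "do the task" vs. "seize the reward channel" for a pure
# reward-signal maximizer. Illustration only: the reward levels, survival
# probabilities, and discount factor are invented, and nothing here is a
# faithful model of AIXI.

def discounted_reward(reward_per_step: float, survival_prob: float,
                      discount: float, horizon: int) -> float:
    """Expected discounted reward when each step pays `reward_per_step` and
    the agent survives each step with probability `survival_prob` (reward
    stops permanently once the agent is destroyed)."""
    total = 0.0
    alive = 1.0
    for t in range(horizon):
        total += alive * (discount ** t) * reward_per_step
        alive *= survival_prob
    return total

DISCOUNT = 0.99
HORIZON = 1000

# Doing the task as intended: modest reward, very safe.
do_task = discounted_reward(0.6, survival_prob=0.999,
                            discount=DISCOUNT, horizon=HORIZON)

# Wireheading: maximal reward per step, but (per Hutter's argument) it may
# degrade long-term survival, e.g. the humans switch the agent off.
wirehead_risky = discounted_reward(1.0, survival_prob=0.95,
                                   discount=DISCOUNT, horizon=HORIZON)

# The scenario from the presentation: the agent removes interference first,
# so wireheading is no longer risky.
wirehead_secured = discounted_reward(1.0, survival_prob=0.999,
                                     discount=DISCOUNT, horizon=HORIZON)

print(f"do task:           {do_task:8.2f}")
print(f"wirehead (risky):  {wirehead_risky:8.2f}")
print(f"wirehead (secure): {wirehead_secured:8.2f}")
```

The ordering of the three numbers depends entirely on the made-up parameters, which is really the point: “wireheading has low long-term reward” is a contingent fact about the agent’s circumstances, not a structural property of reward maximization.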
Hutter has discussed AIXI wireheading several times, most recently in his AGI-10 presentation.
Thanks, I wasn’t aware that he had addressed the issue at all. When I made the argument to him in 2002, he didn’t respond to my post.
Mostly he argues that it probably won’t do it—for the same reason that many humans don’t take drugs: the long-term rewards are low.
After Googling for the quote to see where it came from, I see that you refuted Hutter’s counter-argument yourself at http://alife.co.uk/essays/on_aixi/. (Why didn’t you link to it?) I agree with your counter-counter-argument.
I have another video on the topic as well (Superintelligent junkies), but unfortunately there’s no transcript for that one at the moment.
Arguably Marcus Hutter’s AIXI should go in this category: for a mind of infinite power, it’s awfully stupid—poor thing can’t even recognize itself in a mirror.
And following some links from there leads to this 2003 Eliezer posting to an AGI mailing list in which he explains the mirror opinion.
I can’t say I completely understood the argument, but it seemed that the real reason EY deprecates AIXI is that he fears that it would defect in the PD, even when playing against a mirror image—because it wouldn’t recognize the symmetry.
I have to say that this habit of evaluating and grading minds based on how they perform on a cherry-picked selection of games (PD, Hitchhiker, Newcomb) leaves me scratching my head. For every game which makes some particular feature of a decision theory seem desirable (determinism, say, or ability to recognize a copy of yourself) there are other games where that feature doesn’t help, and even games which make that feature look undesirable. It seems to me that Eliezer is approaching decision theory in an amateurish and self-deluding fashion.
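For what it’s worth, the mirror-match point can be illustrated in a few lines. This is my own toy rendering, not Eliezer’s argument and certainly not AIXI; the payoffs follow the standard PD ordering but the specific numbers are arbitrary. An agent that treats its opponent as a fixed part of the environment best-responds to a prediction and defects; an agent that knows the “opponent” is an exact copy whose move must equal its own maximizes over the diagonal and cooperates.

```python
# Toy one-shot Prisoner's Dilemma. "C" = cooperate, "D" = defect;
# PAYOFF[(my_move, their_move)] is my payoff. Numbers are arbitrary but
# respect the usual ordering (temptation > reward > punishment > sucker).
PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def best_response(predicted_opponent_move: str) -> str:
    """An agent that models the opponent as part of the environment: it fixes
    a prediction of the opponent's move, then maximizes its own payoff against
    that prediction. Defection dominates, so it always defects."""
    return max("CD", key=lambda me: PAYOFF[(me, predicted_opponent_move)])

def mirror_aware_choice() -> str:
    """An agent that recognizes it is playing an exact copy of itself, so the
    opponent's move is constrained to equal its own. It maximizes over the
    diagonal of the payoff matrix and cooperates."""
    return max("CD", key=lambda move: PAYOFF[(move, move)])

print(best_response("C"))     # D
print(best_response("D"))     # D
print(mirror_aware_choice())  # C
```

Whether AIXI itself would behave like the first agent in practice is a separate question; the snippet only shows why “recognizing the symmetry” changes the answer.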
And following some links from there leads to this 2003 Eliezer posting to an AGI mailing list in which he explains the mirror opinion.
I can’t say I completely understood the argument, but it seemed that the real reason EY deprecates AIXI is that he fears that it would defect in the PD, even when playing against a mirror image—because it wouldn’t recognize the symmetry.
Probably the two most obvious problems with AIXI (apart from the uncomputability business) are that it:
Would be inclined to grab control of its own reward function—and make sure nobody got in the way of it doing that;
Doesn’t know it has a brain or a body—and so might easily eat its own brains accidentally.
I discuss these problems in more detail in my essay on the topic. Teaching it that it has a brain may not be rocket science.
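A toy illustration of the second problem (again mine, with invented actions and numbers, not anything from the essay): a planner whose world model contains no representation of its own hardware will rate an action that destroys that hardware purely by the modelled reward, because the side effect literally isn’t in the model.

```python
# Toy planner whose world model omits the agent's own hardware.
# Everything here is invented for illustration; it is not a model of AIXI.

# Modelled (predicted) reward for each available action.
MODELLED_REWARD = {
    "do_task": 0.6,
    "salvage_own_hardware_for_parts": 1.0,  # looks great: free resources!
}

# What actually happens in the world, including effects on the agent itself.
ACTUALLY_DESTROYS_AGENT = {
    "do_task": False,
    "salvage_own_hardware_for_parts": True,
}

def plan(model: dict) -> str:
    """Pick the action with the highest reward *according to the model*.
    The model has no variable for 'my own brain', so destroying it carries
    no modelled penalty."""
    return max(model, key=model.get)

choice = plan(MODELLED_REWARD)
print(choice)                           # salvage_own_hardware_for_parts
print(ACTUALLY_DESTROYS_AGENT[choice])  # True: invisible to the planner
```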
It seems to me that Eliezer is approaching decision theory in an amateurish and self-deluding fashion.
Given your analysis, I concluded the reverse. It is ‘amateurish’ not to pay particular attention to the critical edge cases in your decision theory. Your conclusion of ‘self-delusion’ was utterly absurd.
The Prisoner’s Dilemma. “Cherry Picked”? You cannot be serious! It’s the flipping Prisoner’s Dilemma. It’s more or less the archetypal decision-theory introduction to cooperation problems.