Thanks, I’ll take that as confirmation that Eliezer never posted his planned critique on Less Wrong.
in which Yudkowsky argued that AIXI’s lack of reflectivity leaves it vulnerable in Prisoner’s Dilemma-type situations
That’s one problem with AIXI, but not directly relevant to the blog post XiXiDu and I linked to. I was thinking of a recent presentation I saw where the presenter said “It [AIXI] gets rid of all the humans, and it gets a brick, and puts it on the reward button.” It turns out that was Roko, not Eliezer.
A bit more searching reveals that I had actually made a version of this argument myself, here and here.
I was thinking of a recent presentation I saw where the presenter said “It [AIXI] gets rid of all the humans, and it gets a brick, and puts it on the reward button.” It turns out that was Roko, not Eliezer.
Hutter has discussed AIXI wireheading several times, most recently in his AGI-10 presentation, where he discusses wireheading in the Q & A at the end (01:03:00), claiming that he can prove it won’t happen in some cases, but not all of them.
Mostly he argues that it probably won’t do it—for the same reason that many humans don’t take drugs: the long-term rewards are low.
Here’s a quote:
Another problem connected, but possibly not limited to embodied agents, especially if they are rewarded by humans, is the following: Sufficiently intelligent agents may increase their rewards by psychologically manipulating their human “teachers”, or by threatening them. This is a general sociological problem which successful AI will cause, which has nothing specifically to do with AIXI. Every intelligence superior to humans is capable of manipulating the latter. In the absence of manipulable humans, e.g. where the reward structure serves a survival function, AIXI may directly hack into its reward feedback. Since this will unlikely increase its long-term survival, AIXI will probably resist this kind of manipulation (like most humans don’t take hard drugs, due to their long-term catastrophic consequences).
Another problem connected, but possibly not limited to embodied agents, especially if they are rewarded by humans, is the following: Sufficiently intelligent agents may increase their rewards by psychologically manipulating their human “teachers”, or by threatening them. This is a general sociological problem which successful AI will cause, which has nothing specifically to do with AIXI.
These days, one might say: “this is a general sociological problem which pure reinforcement learning agents will cause—which illustrates why we should not build them.”
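To make the crux of that “long-term rewards” argument concrete, here is a toy sketch of my own (it has nothing to do with AIXI’s actual machinery, and every number in it is made up for illustration). It just compares the expected discounted return of a “do the task” policy against a “wirehead” policy, under two different assumptions about whether wireheading gets the agent switched off:

```python
# Toy sketch (not AIXI): compare expected discounted return for a
# "do the task" policy versus a "wirehead" policy.  All figures are
# invented; the only point is that the argument turns on whether
# wireheading actually shortens the agent's reward stream.

def expected_return(reward_per_step, survival_prob, discount=0.99, horizon=1000):
    """Expected discounted return when each step pays reward_per_step
    and the agent survives to the next step with probability survival_prob."""
    total, alive = 0.0, 1.0
    for t in range(horizon):
        total += alive * (discount ** t) * reward_per_step
        alive *= survival_prob
    return total

# Case 1: wireheading reliably gets the agent switched off.
print(expected_return(reward_per_step=1.0, survival_prob=0.999))   # do the task: ~91
print(expected_return(reward_per_step=10.0, survival_prob=0.5))    # wirehead:    ~20

# Case 2: the agent has already secured its reward channel, so
# wireheading no longer threatens its survival.
print(expected_return(reward_per_step=10.0, survival_prob=0.999))  # wirehead:   ~910
```

Whether the wirehead policy loses depends entirely on the survival term, which is why the scenario in the presentation has the agent get rid of the humans first and only then put the brick on the reward button.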
Hutter has discussed AIXI wireheading several times, most recently in his AGI-10 presentation.
Thanks, I wasn’t aware that he had addressed the issue at all. When I made the argument to him in 2002, he didn’t respond to my post.
Mostly he argues that it probably won’t do it—for the same reason that many humans don’t take drugs: the long-term rewards are low.
After Googling for the quote to see where it came from, I see that you refuted Hutter’s counter-argument yourself at http://alife.co.uk/essays/on_aixi/. (Why didn’t you link to it?) I agree with your counter-counter-argument.
I have another video on the topic as well (Superintelligent junkies), but unfortunately there’s no transcript for that one at the moment.
Yes, that is correct.