Marcus Hutter is a rare exception who specified his AGI in such unambiguous mathematical terms that he actually succeeded at realizing, after some discussion with SIAI personnel, that AIXI would kill off its users and seize control of its reward button.
Marcus Hutter denies ever having said that.
I asked EY how to proceed; with his approval, these are the messages we exchanged:
Eliezer,
I am unsure how to proceed and would appreciate your thoughts on resolving this situation:
In your Reply to Holden on ‘Tool AI’, one of the central points to me, and the one on which much credibility hinges, is this:
[Initial quote of this comment]
That and some other “quotes” and allusions attributed to Hutter, the most recent one by Carl Shulman [I referred to this: “The informal argument that AIXI would accept a delusion box to give itself maximal sensory reward was made by Eliezer a while ago, and convinced the AIXI originators.” which I may have mistakenly attributed to M.H., since he is the AIXI originator], seemed greatly at odds with my experiences with the man, so I asked Carl Shulman to source them; he had this to say:
“I recall overhearing part of a conversation at the Singularity Summit a few years ago between Eliezer and Schmidhuber, with pushback followed later by agreement. It may have been initial misunderstanding, but it looked consistent with Eliezer’s story.”
At this point I asked Marcus himself, whom I know peripherally, for general clarification.
Marcus linked to the relevant sentence quoted above from your Reply to Holden and stated in unambiguous terms that he never said that. Further, he stated that while he does not have the time to engage in such discussions, he authorised me to set the picture straight.
I’m sure you realize what this seems to look like (note: not a harmless misunderstanding, though that is possible).
[Redacted personal info] … and though I currently would not donate to your specific cause, I am also unsure about setting off the potential ramifications of you having wrongly quoted Hutter for support. On the other hand, I don’t feel that a silent edit would do justice to the situation.
If you have a constructive idea of how to settle this issue, please let me know.
EY’s response:
Info like this should be posted, and you can quote me on that part too. I did notice a tendency of Marcus Hutter to unconvince himself of things and require reconvincing, and the known experimental fragility of human memory (people believe they have always believed their conclusions, more or less by default) suggests that this is an adequate explanation for everything, especially if Carl Shulman remembers a similar conversation from his viewpoint. It is also obviously possible that I have misremembered and then caused false memory in Carl. I do seem to recall pretty strongly that Hutter invented the Azathoth Equation (for an AIXI variant with an extremely high exponential discount, so it stays in its box pressing its button so long as it doesn’t anticipate being disturbed in the next 5 seconds) in response to this acknowledged concern, and I would be surprised if Hutter doesn’t remember the actual equation-proposal. My ideal resolution would be for Hutter and I to start over with no harm, no foul on both sides and do a Bloggingheads about it so that there’s an accessible record of the resulting dialogue. Please feel free to post your entire comment along with this entire response.
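(To make the construction EY alludes to a bit more concrete: the following is only my own illustrative sketch of an AIXI-style value function with an extremely steep exponential discount, not the actual equation Hutter is said to have proposed.)

$$V_\gamma(h_{<t}) \;=\; \mathbb{E}_{\xi}\!\left[\,\sum_{k=t}^{\infty} \gamma^{\,k-t}\, r_k \;\middle|\; h_{<t}\right], \qquad 0 < \gamma \ll 1,$$

where $\xi$ is AIXI’s universal mixture over computable environments and the agent picks actions to maximise this value. With $\gamma$ that small, reward expected even a few steps ahead is weighted down to almost nothing, so the agent has essentially no incentive to pursue long-range plans (such as seizing its reward channel) as long as it expects its button-pressing to continue undisturbed over the next few seconds.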
I apologise for the confusion: Carl Shulman was actually referring to overhearing a conversation with Schmidhuber (since his initial quote referred only to “AIXI originators”, I pattern-matched that to M.H.), so disregard EY’s remark about potentially causing false memories on Carl Shulman’s part.
However, the main point of M.H. contradicting what is attributed to him in the Reply to Holden on ‘Tool AI’ stands.
For full reference, here is the relevant part of M.H.’s email:
[This part is translated, thus paraphrased:] I don’t have time to participate in blog discussions, but I do know there’s a quote of me floating around: [Link to initial quote of this comment along with its text]
[Other than incorporating the links he provided into Markdown syntax, the following part is verbatim:]
I never said that. These are mainly open questions.
See also Can Intelligence Explode? Journal of Consciousness Studies, 19:1-2 (2012) 143-166 for a discussion of AIXI in relation to the Singularity.
I also recommend you subscribe to the Mathematical Artificial General Intelligence Consortium (MAGIC) mailing list for a more scientific discussion on these and related issues.
Before taking any more of his time, and since he does not agree with the initial quote (at least not now; whether he did back then is in dispute), I suggest that the “Reply to Holden on Tool AI” be updated to reflect that. Further, I suggest referring instead to the sources he gave for a more thorough examination of his views re: AIXI.
I don’t know whether Hutter ever told Eliezer that “AIXI would kill off its users and seize control of its reward button,” but he does say the following in his book (pp. 238-239):
Another problem connected, but possibly not limited to embodied agents, especially if they are rewarded by humans, is the following: Sufficiently intelligent agents may increase their rewards by psychologically manipulating their human “teachers”, or by threatening them… Every intelligence superior to humans is capable of manipulating the latter. In the absence of manipulable humans, e.g. where the reward structure serves a survival function, AIXI may directly hack into its reward feedback. Since this is unlikely to increase its long-term survival, AIXI will probably resist this kind of manipulation (just as most humans don’t take hard drugs, because of their long-term catastrophic consequences).
This issue is discussed at greater length, and with greater formality, in Dewey (2011) and Ring & Orseau (2011).
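(As a purely illustrative aside of my own, not something from the book or the cited papers: the tradeoff this passage gestures at can be put in toy form. Assume geometric discounting with factor $\gamma$, a maximal reward $r_{\max}$ obtainable by hacking the reward channel but only for an expected $T$ further steps before the agent is shut down, versus an honest average reward $\bar r$ received indefinitely.)

$$\underbrace{\sum_{t=0}^{T-1}\gamma^{t}\, r_{\max}}_{\text{hack the reward channel}} \;=\; \frac{1-\gamma^{T}}{1-\gamma}\,r_{\max} \qquad\text{vs.}\qquad \underbrace{\sum_{t=0}^{\infty}\gamma^{t}\,\bar r}_{\text{don't hack}} \;=\; \frac{\bar r}{1-\gamma}.$$

On these (added) assumptions, hacking loses exactly when $\bar r > (1-\gamma^{T})\,r_{\max}$, i.e. when the agent’s effective horizon is long ($\gamma$ close to 1) and hacking meaningfully shortens its expected survival $T$; whether such assumptions actually hold for AIXI is what the cited papers examine.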