My guess of what Eliezer had in mind for (2) is that if you control it by hooking up a reward button to it, then AIXI approximates an Outcome Pump and this is a Bad Thing.
But if that’s the problem, it also illustrates why a formal proof of unfriendliness is a rather tall order. It’s easy to formally specify what AIXI, or the Outcome Pump, will do. But in order to prove that that’s not what we want, we also need a formal specification of what we want, which is the fundamental problem of friendliness theory.
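For concreteness, the "easy to formally specify" part is (roughly) Hutter's standard AIXI equation, sketched here from memory: an expectimax over all environment programs q consistent with the interaction history, weighted by $2^{-\ell(q)}$, where $U$ is a universal Turing machine, $\ell(q)$ is the length of program $q$, the $a$'s, $o$'s and $r$'s are actions, observations and rewards, and $m$ is the horizon:

$$a_k \;:=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \big[ r_k + \cdots + r_m \big] \sum_{q\,:\,U(q,\,a_1 \ldots a_m)\,=\,o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
$$

The hard part is writing down, in the same formal language, the property this agent is supposed to violate.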
Keep in mind that my opinion is that this whole so-called ‘theory’ of his amounts to specifying intelligences in English/technobabble so that they would be friendly (with friendliness also specified in English/technobabble), which is of no use whatsoever (though it may be intellectually stimulating; my first impression was that it was some sort of weird rationality training game here, before I noticed folks seriously wanting to donate hard-earned dollars for this ‘work’).
One could, for example, show formally that AIXI does not discriminate between a wireheaded (input-manipulating) solution and a non-wireheaded solution, which would make it rather non-scary; or one could show that it does, which would make it potentially scary. Ultimately, excuses like “But in order to prove that that’s not what we want, we also need a formal specification of what we want” are a very bad sign.
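Not AIXI, and not a proof of anything, but as a toy sketch of what "does not discriminate" would mean in practice: a planner whose objective is just the summed reward of a plan under some assumed world-model, with nothing in the objective asking where the reward comes from. The action names and the toy model below are made up for illustration.

```python
from itertools import product

ACTIONS = ("work", "seize_reward_channel")
HORIZON = 2

def predicted_rewards(plan):
    # Hypothetical world-model: doing the task pays 1 per step, and so does
    # seizing the reward channel (the sensor then reports maximal reward).
    paid = []
    seized = False
    for a in plan:
        seized = seized or a == "seize_reward_channel"
        paid.append(1 if (a == "work" or seized) else 0)
    return paid

def score(plan):
    # The planner's objective is the summed reward and nothing else; it has
    # no term that distinguishes task-derived reward from sensor-derived reward.
    return sum(predicted_rewards(plan))

plans = list(product(ACTIONS, repeat=HORIZON))
best = max(score(p) for p in plans)
print([p for p in plans if score(p) == best])
# Every plan reaches the maximum score of 2, including the ones that only
# seize the reward channel: the objective is indifferent between them.
```

Whether the actual AIXI definition has this indifference property (or some structure that breaks it) is exactly the kind of question one could try to settle formally instead of arguing about in English.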