rwallace comments on The problem of pseudofriendliness

rwallace 16 Mar 2010 23:35 UTC
1 point
Like other people, I care not only about the outcome, but that it was not reached by unethical means; and am prepared to accept that I don’t have a unique ranking order for all outcomes, and that I may be mistaken in some of my preferences, and that I should be more tentative in areas where I am more likely to be mistaken.

Could we aim, ultimately, to build an AGI with such properties? Yes indeed, and if we ever set out to build a self-willed AGI, that is how we should do it—precisely because it would have properties very different from those of the monomaniac utilitarian AGI postulated in most of what’s been written about friendly AI so far.
- Vladimir_Nesov 16 Mar 2010 23:38 UTC
  0 points
  Parent
  
  Yes indeed, and if we ever set out to build a self-willed AGI, that is how we should do it—precisely because it would have properties very different from those of the monomaniac utilitarian AGI postulated in most of what’s been written about friendly AI so far.
  
  Please pin it down: what are you talking about on both accounts (“how we should do it” and “the monomaniac utilitarian AGI”), and where do you place your interpretation of my concept of preference.
  - rwallace 16 Mar 2010 23:43 UTC
    0 points
    Parent
    I can have a go at that, but a comment box in a thread buried multiple hidden layers down is a pretty cramped place to do it. Figure it’s appropriate for a top-level post? Or we could take it to one of the AGI mailing lists.
    - Vladimir_Nesov 16 Mar 2010 23:54 UTC
      0 points
      Parent
      I meant to ask for a short indication of what you meant, long description will be a mistake, since you’ll misinterpret a lot of what I meant, given how little of the assumed ideas you agree with or understand the way they are intended.
      
      Signal to humbug ratio on AGI mailing lists is too low.
      - rwallace 17 Mar 2010 2:45 UTC
        0 points
        Parent
        Well, I had been attempting to give short indications of what I meant already, but I’ll try. Basically, a pure utilitarian (if you could build such an entity of high intelligence, which you can’t) would be a monomaniac, willing to commit any crime in the service of its utility function. That means a ridiculous amount of weight goes onto writing the perfect utility function (which is impossible), and then in an attempt to get around that you end up with lunacy like CEV (which is, very fortunately, impossible), and the whole thing goes off the rails. What I’m proposing is that if anything like a self-willed AGI is ever built, it will have to be done in stages with what it does co-developed with how it does it, which means that by the time it’s being trusted with the capability to do something in the external world, it will already have all sorts of built-in constraints on what it does and how it does it, that will necessarily have been developed along with and be an integral part of the system. That’s the only way it can work (unless we stick to purely smart tool AI, which is also an option), and it means we don’t have to take an exponentially unlikely gamble on writing the perfect utility function.
        Nick_Tarleton 17 Mar 2010 15:45 UTC
        0 points
        Parent
        
        a pure utilitarian (if you could build such an entity of high intelligence, which you can’t)
        
        CEV (which is … impossible)
        
        Citations needed.
        Vladimir_Nesov 17 Mar 2010 10:04 UTC
        0 points
        Parent
        Well, I feel unable to effectively communicate with you on this topic (the fact that I persisted for so long is due to unusual mood, and isn’t typical—I’ve been answering all comments directed to me for the last day). Good luck, maybe you’ll see the light one day.