To quote Aella from https://aella.substack.com/p/my-attempts-to-sensemake-ai-risk (emphasis mine):

If you’re granting a superintelligent AGI and you still think it won’t be able to get out of the researcher’s box (like, it’s on a computer disconnected from the internet and wants you to connect it to the internet, or something), then I don’t think you’re properly imagining superintelligence. Maybe this is a bit silly, but for my own calibration I’ve often imagined a bunch of five-year-olds who’ve been strictly instructed not to pass the key through your prison door slot, and you have to convince them to do it. The intelligence gap between you and five-year-olds is probably much smaller than the gap between you and an AGI, but probably you could convince the five-year-olds to let you out. People arguing they just wouldn’t let an AGI take any sort of control of anything strike me as being as silly as the five-year-olds swearing they won’t let the adult out no matter what.

Most other arguments around human beings controlling the AGI in any way once it happens feel equally silly. You just can’t properly comprehend a thing vastly smarter than you!
If an AGI with a goal of escaping emerges, there is nothing you can do about it. It may take a bit longer if it is disconnected from everything by some “failsafe”, but a human idea of “disconnected from everything” is pathetically misguided compared to something many times smarter. Just… drop the reliance on boxing an AI at all. As johnswentworth said, you might as well do it anyway, but it should not factor into your safety calculations.
I think we lose a lot of nuance by automatically assuming the boxed AGI will have godlike capabilities (though it certainly might). Attempting to contain such a superintelligence is (probably) impossible, but I suspect there’s still a fair bit of merit to the idea of trying to box near-human-level AGIs.
The best cyber criminal in the world could probably get into my bank account, but I’m also not using that as an excuse to go around with no passwords.
Again, it’s good to have a box a human could not get in or out of, as a matter of course. Such a box should not appreciably change any serious safety considerations.