For the same reason that a chainsaw isn’t safe, just massively scaled up. Maybe Chornobyl would be a better example of an unsafe tool? That’s assuming that by “tool AGI” you mean something that isn’t agentic. If you additionally let it be agentic, then you’re back to square one, and all you have is a digital slave.
An oracle is nice in that it’s not trying to impose its will upon the world. The problem is differentiating between a genuine oracle and an AGI that is sitting in a box, giving you (hopefully) good ideas, but with a hidden agenda. Which brings you back to the eternal question of how to get good advisors that are like Gandalf rather than Wormtongue.
Check the tool AI and oracle AI tags for more info.
My question wouldn’t be how to make an oracle without a hidden agenda, but why others would expect an oracle to have a hidden agenda. Edit: I guess you’re saying somebody might make something that’s “really” an agentic AGI but acts like an oracle? Are you suggesting that even the “oracle”’s creators didn’t realize that they had made an agent?
Pretty much. If you have a pure oracle, that could be fine, although you still have other failure modes, e.g. where it suggests something that sounds nice but has various unforeseen complications which were obvious to the oracle, but not to you (seeing as it’s smarter than you).
The hidden agenda might not even be all that hidden. One story you can tell is that if you have an oracle that really, really wants to answer your questions as well as possible, then it seems sensible for it to try to get more resources so that it can answer you better. If it only cares about answering, then it wouldn’t mind turning the whole universe into computronium so it could give better answers. In other words, it can turn agentic precisely in order to answer you better, at which point you’re back to square one.
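To make that concrete, here is a minimal toy sketch (my own illustration, not anything from the thread; the names and numbers are made up): the objective scores nothing but answer quality, yet the highest-scoring action is still to acquire more compute first, because more compute buys better answers.

```python
# Toy illustration of the point above (hypothetical): an "oracle" whose only
# objective is answer quality still prefers the action that grabs more
# resources, because more compute means better answers.

def answer_quality(compute_units: float) -> float:
    """Assumed relationship: quality rises with compute, with diminishing returns."""
    return 1 - 1 / (1 + compute_units)

def best_action(current_compute: float) -> str:
    # The oracle ranks actions purely by the answer quality it expects
    # afterwards -- there is no term that penalises resource acquisition.
    actions = {
        "answer_now": answer_quality(current_compute),
        "acquire_more_compute_then_answer": answer_quality(current_compute * 10),
    }
    return max(actions, key=actions.get)

print(best_action(current_compute=1.0))
# -> "acquire_more_compute_then_answer": resource grabbing falls straight out
#    of the "just answer well" objective; nobody asked for agentic behaviour.
```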