The hypothetical AI is assumed to be able to talk normal humans assigned to guard it into taking its side.
In other words, the safest way to restrain it is to simply not turn it on.
And not just by persuading the guards—the kind of AIs we are talking about, transhuman-level AIs, could potentially do all kinds of mind-hacking things of which we haven’t even yet conceived. Hell, they could do things that we will never be able to conceive unaided.
If we ever set up a system that relies on humans restraining a self-modifying AI, we had better be sure beforehand that the AI is Friendly. The only restraints I can think of that would provably work involve limiting the AI's access to resources so that it never achieves a level of intelligence equal to or higher than human—but then, we haven't quite made an AI, have we? Not much benefit to a glorified expert system.
If you haven’t read the AI Box experiment reports I linked above, I recommend them—apparently, it doesn’t quite take a transhuman-level AI to get out of a “test harness.”
You don’t use a few humans to restrain an advanced machine intelligence. That would be really stupid.
Safest, but maybe not the only safe way?
Why not make a recursively self-improving AI in some strongly typed language that provably can only interact with the world by printing names of stocks to buy?
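To make the proposal concrete, here is a minimal sketch in Haskell, one strongly typed language where purity is actually enforced by the type system (modulo escape hatches like `unsafePerformIO`, which Safe Haskell can rule out). Every name here (`Ticker`, `MarketData`, `aiCore`) and the toy price-filter "strategy" are my own illustrative inventions, not anything from the thread:

```haskell
module Main where

-- The AI's sole output type: names of stocks to buy.
newtype Ticker = Ticker String

-- Toy input: (ticker name, price) pairs.
type MarketData = [(String, Double)]

-- A pure function: its type gives it no access to IO, so by
-- construction its only interaction with the world is the list of
-- tickers it returns for some harness to print. (The toy "strategy"
-- just picks stocks priced above 100.)
aiCore :: MarketData -> [Ticker]
aiCore = map (Ticker . fst) . filter ((> 100) . snd)

-- The harness, not the AI core, holds the IO capability.
main :: IO ()
main = mapM_ (\(Ticker t) -> putStrLn t)
             (aiCore [("AAPL", 150.0), ("XYZ", 50.0)])
```

The type restraint only binds the core function as written, of course; it says nothing about whether a recursively self-improving system stays inside that type, or about whether the humans acting on the printed tickers stay restrained.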
How about one that can only make blueprints for starships?