Companies are the self-improving systems of today—see Google, for example.
They don’t hack the human brain much—but they don’t need to. Brains are not perfect—but they can have their inputs preprocessed, their outputs post-processed, and they can be replaced entirely by computers—via the well-known process of automation.
Do the folk at Google proceed without logical proofs? Of course they do! Only the slowest and most tentative programmers try to prove the correctness of their programs before they deploy them. Instead, most programmers rely extensively on testing. Testing is the mantra of modern programmers. Test, test, test! That way they get their products to market before the sun explodes.
As Eliezer has already shown, “test, test, test”-ing AIs that aren’t provably Friendly (that is, for which there is no proof that their recursive self-modification leads to Friendly results) can have disastrous consequences.
I’d rather wait until the sun explodes than deploy an unFriendly AI by accident.
The consequences of failing to adopt rapid development methods when building intelligent machines should be pretty obvious—the effect is to pass the baton to another team with a different development philosophy.
Waiting until the sun explodes is not one of the realistic options.
The box experiments seem irrelevant to the case of testing machine intelligence. When testing prototypes in a harness, you would use powerful restraints—not human gatekeepers.
What powerful restraints would you suggest that would not require human judgment or human-designed decision algorithms to remove?
Turn it off, encase it in nanofabricated diamond, and bury it in a deep pit. Destroy the experimental records, retaining only enough information to help future, wiser generations to one day take up again the challenge of building a Friendly AI. Scatter the knowledge in fragments, hidden in durable artifacts, scatter even the knowledge of how to find the knowledge likewise, and arrange a secret brotherhood to pass down through the centuries the ultimate keys to the Book That Does Not Permit Itself To Be Read.
Tens of thousands of years later, when civilisation has (alas) fallen and risen several times over, a collect-all-the-plot-coupons fantasy novel takes place.
Want to restrain a man?
Use a facility designed by the government with multiple guards and built with vastly more resources than the imprisoned man can muster.
Want to restrain a machine?
Use the same strategy. Or use drugs, or build in a test harness. Whatever—however you look at it, it doesn’t seem like a problem.
We can restrain individuals pretty securely today—and there is no indication that future developments are going to change that.
What’s with the question about removing restraints? That isn’t a problem either. You are suggesting that the imprisoned agent contacts and manipulates humans “on the outside”—and they attempt a jail-break? That is a strategy available to other prisoners as well. It has a low success rate. Those few that do escape are typically hunted down and then imprisoned again.
If you are particularly paranoid about escaped prisoners, then build a higher security prison. Typically, you can have whatever security level you are prepared to pay for.
The hypothetical AI is assumed to be able to talk normal humans assigned to guard it into taking its side.
In other words, the safest way to restrain it is to simply not turn it on.
And not just by persuading the guards—the kind of AIs we are talking about, transhuman-level AIs, could potentially do all sorts of mind-hacking things we haven’t even conceived of yet. Hell, they could do things we will never be able to conceive of unaided.
If we ever set up a system that relies on humans restraining a self-modifying AI, we had better be sure beforehand that the AI is Friendly. The only restraints I can think of that would provably work involve limiting the AI’s access to resources so that it never achieves a level of intelligence equal to or higher than a human’s—but then, we haven’t quite made an AI, have we? Not much benefit to a glorified expert system.
If you haven’t read the AI Box experiment reports I linked above, I recommend them—apparently, it doesn’t quite take a transhuman-level AI to get out of a “test harness.”
You don’t use a few humans to restrain an advanced machine intelligence. That would be really stupid.
Safest, but maybe not the only safe way?
Why not make a recursively improving AI in some strongly typed language that provably can only interact with the world by printing the names of stocks to buy?
How about one that can only make blueprints for starships?
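To make the idea concrete, here is a toy sketch in Haskell (chosen purely as an example of a strongly typed language). The names StockTicker, Agent and runAgent are invented for illustration; all the sketch shows is that the agent's type admits no output channel other than a list of tickers, which is the sort of restriction I have in mind.

```haskell
module Main where

-- The only kind of value the agent can emit: an opaque stock symbol.
-- In a real design the constructor would be hidden behind a module
-- boundary so tickers can only be built via the validating function below.
newtype StockTicker = StockTicker String
  deriving (Show, Eq)

-- Accept only short, upper-case alphabetic symbols.
ticker :: String -> Maybe StockTicker
ticker s
  | not (null s), length s <= 5, all (`elem` ['A' .. 'Z']) s = Just (StockTicker s)
  | otherwise = Nothing

-- Market observations the agent is shown: (symbol, price) pairs.
type Observation = [(String, Double)]

-- The agent is a pure function.  Its type admits no IO at all, so however
-- clever the reasoning inside it becomes, the only thing that can come out
-- is a list of tickers to buy.
newtype Agent = Agent (Observation -> [StockTicker])

-- A toy agent: recommend anything priced under 10.
exampleAgent :: Agent
exampleAgent =
  Agent (\obs -> [t | (sym, price) <- obs, price < 10, Just t <- [ticker sym]])

-- The harness owns all IO; it prints the recommendations and nothing else.
runAgent :: Agent -> Observation -> IO ()
runAgent (Agent act) obs = mapM_ print (act obs)

main :: IO ()
main = runAgent exampleAgent [("FROB", 7.5), ("WIDG", 42.0)]
```

The whole restraint here is purity: the harness owns the IO, and the agent's type simply cannot express any other way of acting on the world. Whether that kind of restriction holds up against a transhuman optimizer is, of course, the question being argued above.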