Re: “However clever your algorithm, at that level, something’s bound to confuse it. Gimme FAI with checks and balances every time.”
I agree that a mature Friendly Artificial Intelligence should defer to something like humanity’s volition.
However, before it can figure out what humanity’s volition is and how to accomplish it, an FAI first needs to:
1. self-improve into trans-human intelligence while retaining humanity’s core goals, and
2. avoid UnFriendly Behavior (for example, murdering people to free up their resources) in the process of doing step (1).
If the AI falls prey to a paradox early on in the process of self-improvement, the FAI has failed and has to be shut down or patched.
Why is that a problem? Because if the AI falls prey to a paradox later on in the process of self-improvement, when the computer can outsmart human beings, the result could be catastrophic. (As Eliezer keeps pointing out: a rational AI might not agree to be patched, just as Gandhi would not agree to have his brain modified to make him a psychopath, and Hitler would not agree to have his brain modified to make him an egalitarian. All else being equal, rational agents will try to block any action that would prevent them from accomplishing their current goals.)
So you want to create an elegant (to the point, ideally, of being “provably correct”) structure that doesn’t need patches or hacks. If you have to constantly patch or hack early on in the process, that increases the chances that you’ve missed something fundamental, and that the AI will fail later on, when it’s too late to patch.