Is there any chance that we (a) CAN'T restrict AI to be friendly per se, but (b) (conditional on this impossibility) CAN restrict it enough to keep it from blowing up in our faces? If Friendly AI is in fact not possible, then a first-generation AI may recognize this fact and decline to build a successor that would destroy it in an act of unfriendliness.
It seems to me that the worst case would be that Friendly AI is in fact possible... but that we aren't the first to discover it, in which case the AI would happily perpetuate itself. But what are the best- and worst-case scenarios conditional on Friendly AI being IMpossible?
Has this been addressed before? As a disclaimer, I haven't thought much about this, and I suspect that I'm dressing up the problem in a way that only sounds different to me because I don't fully understand the implications.