“I suppose one potential failure mode which falls into the grey territory is building an AI that just executes peoples’ current volition without trying to extrapolate”
i.e. the device has to judge usefulness by some metric and then decide whether or not to execute someone’s volition.
That’s exactly my issue with trying to define a utility function for the AI: you can’t. And since some people will have their utility functions denied by the AI, who is to choose whose gets executed?
I’d prefer to shoot for a NOT(UFAI) and then trade with it.
Here’s a thought experiment:
Does a cure for cancer maximize everyone’s utility function?
On average, yes, we all win.
BUT
Companies currently making drugs to treat the symptoms of cancer, along with their employees, would be put out of business.
Which utility function should be executed: creating better drugs to treat the symptoms and letting the companies sell them, or putting those companies out of business and curing cancer?
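To make the tension concrete, here’s a toy sketch in Python (the groups, utilities, and population sizes are all made up for illustration) showing how a policy can raise average utility while still lowering it for some people:

```python
# Toy illustration with made-up numbers: a policy that raises average
# utility can still leave some individuals worse off.

# Utility of a typical member of each group before the cure exists.
baseline = {"patients": 10, "bystanders": 50, "drug_employees": 60}
population = {"patients": 1_000, "bystanders": 8_900, "drug_employees": 100}

# After the cure: patients gain enormously, bystanders gain a little,
# employees of symptom-treatment companies lose their jobs and some utility.
after_cure = {"patients": 90, "bystanders": 55, "drug_employees": 30}

def average_utility(utilities: dict) -> float:
    """Population-weighted average utility across all groups."""
    total_people = sum(population.values())
    return sum(utilities[g] * population[g] for g in utilities) / total_people

print(f"average before cure: {average_utility(baseline):.2f}")    # 46.10
print(f"average after cure:  {average_utility(after_cure):.2f}")  # 58.25
# The average goes up, yet drug_employees drop from 60 to 30:
# "we all win on average" is not the same as "everyone wins".
```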
Well, that’s an easy question: if you’ve worked sixteen-hour days for the last forty years, you’re just six months away from curing cancer completely, you know you’re going to get the Nobel and be fabulously wealthy, and an alien shows up and offers you a cure for cancer on a plate, you take it, because a lot of people will die in those six months. This isn’t even different from how the world currently works: if I invented a cure for cancer, it would be detrimental to all those others who were trying to (and who only cared about getting there first). What difference does it make if an FAI helps me? If someone really wants to murder me but I don’t want them to, and they are stopped by the police, that’s clearly an example of the government taking the side of my utility function over the murderer’s. But so what? The murderer was in the wrong.
Anyway, have you read Eliezer’s paper on CEV? I’m not sure that I agree with him, but he does deal with the problem you bring up.