‘It seems to me like the simplest way to solve friendliness is: “Ok AI, I’m friendly so do what I tell you to do and confirm with me before taking any action.” It is much simpler to program a goal system that responds to direct commands than to somehow try to infuse friendliness into the AI.’
As was pointed out, this might not have the consequences one wants. However, even if that weren’t true, I’d still be leery of this option—it would effectively give one human unlimited power.
Would you expect all the AIs to work together under one person’s direction? Wouldn’t they group up different ways, and work with different people?
In that case, the problem of how to get AIs to be nice and avoid doing things that hurt people boils down to the old problem of how to get people to be nice and avoid doing things that hurt people. The people might have AIs to tell them about some of the consequences of their behavior, which would be an improvement. “But you never told us that humans don’t want small quantities of plutonium in their food. This changes everything.”
But if it just turns into multiple people having large amounts of power, then we need those people not to declare war on each other, not to crush the defenseless, and so on. Just like now, except they’d have AIs working for them.
Would it help if we designed Friendly People? We’d need to design society so that Friendly People outcompeted Unfriendly People....