What you’re describing is an evil AI, not just an unFriendly one—unFriendly AI doesn’t care about your values. Wouldn’t an evil AI be even harder to achieve than a Friendly one?
An unFriendly AI doesn’t necessarily care about human values, but I can’t see why, if it were based on human neural architecture, it couldn’t exhibit good old-fashioned human values like empathy, or for that matter sadism.
I’m not saying that AI would have to be based on human uploads, but it seems like a credible path to superhuman AI.
Why do you think that an evil AI would be harder to achieve than a Friendly one?
Agreed, an AI based on a human upload gives no guarantee about its values… actually, right now I have no idea how the Friendliness of such an AI could be ensured.
Maybe not harder, but less probable: ‘paperclipping’ seems to be a more likely failure of Friendliness than an AI wanting to torture humans forever.
I have to admit I haven’t thought much about this, though.
Paperclipping is a relatively simple failure. The difference between paperclipping and evil is mainly just that: a matter of complexity. Evil is complex; turning the universe into tuna is decidedly not.
Ironically, on the scale of Friendliness, I see an “evil” failure (meaning, among other things, that we’re still in some sense around to notice it being evil) becoming more likely as Friendliness increases. As we try to implement our own values, failures become more complex and less likely to be total, which lets us stick around to see them.
“Where in this code do I need to put this ‘-ve’ sign again?”
The two are approximately equal in difficulty, assuming equivalent flexibility in how “Evil” or “Friendly” it would have to be to meet the definition.
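To make the “-ve” sign quip concrete: here is a minimal Python sketch, with entirely hypothetical names, under the (counterfactual) assumption that the hard part, actually specifying human values as a utility function, were already solved. Flipping the optimization target from Friendly to evil is then a one-character edit.

```python
# Hypothetical illustration of the "-ve sign" quip; these names are made up
# for the example and don't come from any real codebase.

def friendly_utility(world_state: dict) -> float:
    # Stand-in for the genuinely hard, unsolved part: scoring an outcome
    # by human values. Here it just reads one toy number.
    return world_state.get("human_flourishing", 0.0)

def evil_utility(world_state: dict) -> float:
    # The "-ve" sign: same value specification, opposite optimization target.
    return -friendly_utility(world_state)

if __name__ == "__main__":
    state = {"human_flourishing": 1.0}
    print(friendly_utility(state))  # 1.0
    print(evil_utility(state))      # -1.0
```

All of the difficulty lives in friendly_utility itself; a maximizer doesn’t care which sign it is handed, which is one way of reading the “approximately equal in difficulty” point above.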