Agreed, an AI based on a human upload gives no guarantee about its values… actually, right now I have no idea how the Friendliness of such an AI could be ensured.
Why do you think that an evil AI would be harder to achieve than a Friendly one?
Maybe not harder, but less probable: ‘paperclipping’ seems a more likely failure of Friendliness than an AI wanting to torture humans forever.
I have to admit I haven’t thought much about this, though.
Paperclipping is a relatively simple failure. The difference between paperclipping and evil is mainly just that—a matter of complexity. Evil is complex; turning the universe into tuna is decidedly not.
On the scale of Friendliness, ironically, I see an “evil” failure (meaning, among other things, that we’re still in some sense around to notice it being evil) becoming more likely as Friendliness increases. As we try to implement our own values, failures become more complex and less likely to be total—thus letting us stick around to see them.