So “good” creatures have a mechanism that simulates the thoughts and feelings of others, so that they come to have similar thoughts and feelings themselves, whether pleasant or unpleasant. (Well, we have a “but this is the Enemy” mode; some others could have a “but now it’s time to begin making paperclips at last” mode...)
To me, feeling the same seems much more important than thinking the same. (See dogs, infants...) So, thinking in AI terms, there must be a coupling between the creature’s utility function and ours: it wants us to be happy in order to be happy itself. (Wireheading us is not sufficient, because the model of us in its head, which is unchanged in the process, would feel bad about it… it’s some weak form of CEV.)
So is an AI sympathetic if it has this coupling in its utility function? And with whose utilities? Humans? Sentient beings? Anything with a utility function? Chess machines? (Losing makes them really, really sad...) Or what about rocks? Utility functions are just a way to predict some parts of the world, after all...
My point is that a definition of sympathy also needs a function to determine who or what to feel sympathy for. For us, this seems to be “everyone who looks like a living creature or acts like one”, but it’s complicated in the same way as our values. Accepting “sympathy” and “personlike” as the definition of “friendly” could easily be turtles all the way down.
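To make the point concrete, here is a toy sketch of the coupling, not a serious proposal. Everything in it is my own assumption for illustration: the names `Being`, `sympathetic_utility`, `humans_only` and `anything_goes`, the idea that utilities are plain numbers that can be added, and the example values. The interesting part is the `weight` argument: the coupling itself is one line, and all the difficulty hides in who gets a nonzero weight.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Being:
    """A modeled other. All fields are hypothetical, for illustration only."""
    name: str
    kind: str       # e.g. "human", "machine", "rock"
    utility: float  # how well things are going for it, by the agent's own model


def sympathetic_utility(
    own: float,
    others: Sequence[Being],
    weight: Callable[[Being], float],
    coupling: float = 1.0,
) -> float:
    """Own utility plus the weighted, modeled utilities of others.

    `weight` is the "who or what to feel sympathy for" function.
    """
    return own + coupling * sum(weight(b) * b.utility for b in others)


def humans_only(b: Being) -> float:
    # One possible (assumed) choice: only humans count.
    return 1.0 if b.kind == "human" else 0.0


def anything_goes(b: Being) -> float:
    # Another (assumed) choice: anything with a utility function counts.
    return 1.0


others = [
    Being("Alice", "human", 2.0),
    Being("chess machine that just lost", "machine", -5.0),
    Being("rock", "rock", 0.0),
]

narrow = sympathetic_utility(1.0, others, humans_only)
broad = sympathetic_utility(1.0, others, anything_goes)
```

With these made-up numbers, the same agent in the same world is better off under `humans_only` and worse off under `anything_goes`, only because the sad chess machine counts in one case and not the other. Choosing `weight` is the “personlike” problem all over again.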