We do have real-world examples of things which do not themselves have anything humans would typically consider values, but do determine the values of the rest of some system. Evolution determining human values is a good example: evolution does not itself care about anything, yet it produced human values. Of course, if we just evolve some system, we don’t expect it to robustly end up with Good values; the Babyeaters (from Three Worlds Collide), for example, are a plausible outcome as well. That a value-less system produces values does not mean that the values produced are Good.
This example generalizes: we have some subsystem which does not itself contain anything we’d consider values. It determines the values of the rest of the system. But then, what reason do we have to expect that the values produced will be Good? The most common reason to believe such a thing is to predict that the subsystem will produce values similar to our own moral intuitions. But if that’s the case, then we’re using our own moral intuitions as the source-of-truth to begin with, which is exactly the opposite of moral realism.
To reiterate: the core issue with this setup is why we expect the value-less subsystem to produce something Good. How could we possibly know that, without using some other source-of-truth about Goodness to figure it out?
“How the physical world works” seems, to me, a plausible source-of-truth. In other words: I take some features of the environment (e.g. consciousness) to be a reason to believe that some AI systems might end up caring about a common set of things, after they’ve spent some time gathering knowledge about the world and reasoning. Our (human) moral intuitions might also differ from this set.