Why do we need moral facts to tell us how our preferences should be extrapolated? Can’t we want our preferences to be extrapolated in a certain way?
I guess the relevant difference between “I want my preferences to be extrapolated in a certain way” and something more mundane, like “I want to drive to the store”, is that the latter is a more informed preference. But neither is perfectly informed, because driving to the store can have unintended effects too. So ideally we want to get to the point where launching an AI feels like a reasonably informed decision, like driving to the store. That will take a lot of research, but I’m pretty sure it doesn’t require the existence of moral facts, any more than driving to the store requires moral facts.
The trouble is that “moral facts” does not refer only to realist notions of moral facts; it also includes anti-realist and non-cognitivist notions of moral propositions, like “do what happens when you extrapolate my preferences in a certain way”. In fact, the closer I look, the more it seems that moral facts are a different sort of confusion than the one I originally thought they were.
And yes, we may be able to build AGI we consider aligned without fully resolving issues of meta-ethical uncertainty (in fact, I’m counting on it, because meta-ethical uncertainty cannot be fully resolved!), but I’d like to address your claim that driving to the store does not require moral facts. There is a weak sense in which it doesn’t, because you can just drive to the store and do whatever you want in doing so, but by the same token we could build an AGI that just does what it wants, without need of moral facts. I doubt we would call such an AGI aligned, though, because its behavior is not constrained by the wants of others.
We might say “well, let’s just build an AGI that does what we want, not what it wants”, but then we have to decide what we want and, importantly, figure out what to do where what we want comes into conflict with itself. Because we’re not trying to build an AGI that is aligned with just a single human, but with all of humanity (and maybe with all moral agents), we’re forced to figure out how to resolve those conflicts in a way that is independent of any individual, and this is solidly in the realm of what people consider ethics.
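To make that concrete, here is a toy sketch (the names, the numbers, and the two aggregation rules are all invented for illustration, not anyone’s actual proposal): even with two people whose preferences are fully known, the outcome depends on which resolution rule we pick, and picking the rule is itself an ethical commitment.

```python
# Toy illustration only: two made-up people, three options, scores on an
# invented common scale. The point is that the aggregation rule, not either
# individual, decides the outcome when their preferences conflict.

preferences = {
    "alice": {"A": 10, "B": 4, "C": 0},
    "bob":   {"A": 0,  "B": 5, "C": 9},
}

def utilitarian(prefs):
    """Pick the option with the highest total score across people."""
    options = next(iter(prefs.values())).keys()
    return max(options, key=lambda o: sum(p[o] for p in prefs.values()))

def maximin(prefs):
    """Pick the option whose worst-off person is best off."""
    options = next(iter(prefs.values())).keys()
    return max(options, key=lambda o: min(p[o] for p in prefs.values()))

print(utilitarian(preferences))  # "A" (total 10 beats 9 and 9)
print(maximin(preferences))      # "B" (worst-off score 4 beats 0 and 0)
```

A utilitarian sum says A and a maximin rule says B; neither answer can be read off Alice’s or Bob’s preferences alone, which is the sense in which the resolution has to come from somewhere independent of any individual.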
Perhaps we can approach this in a new way, but we’d be foolish not to notice and appreciate the many ways people have already tried to address it, and those approaches are grounded in the notion of moral facts, even if moral facts are a polymorphic category whose metaphysics differs greatly depending on your ethics.
Yeah, that makes sense. I think resolving any conflict is a judgment call one way or the other, and our judgment calls when launching the AI should be as informed as possible. Maybe some of them can be delegated to humans after the AI is launched, though preventing the AI from influencing them is a problem of its own.