Overall, I think the question “which AIs are good successors?” is both neglected and time-sensitive, and is my best guess for the highest impact question in moral philosophy right now.
Interesting… my model of Paul didn’t assign any work in moral philosophy high priority.
I agree this is high impact. My idea of the kind of work to do here is mostly trying to solve the hardish problem of consciousness so that we can have some more informed guess as to the quantity and valence of experience that different possible futures generate.
Interesting… my model of Paul didn’t assign any work in moral philosophy high priority.
That makes it easier for any particular question to top the list.
so that we can have some more informed guess as to the quantity and valence of experience that different possible futures generate
It seems like the preferences of the AI you build are way more important than its experience (not sure if that’s what you mean).
I have the further view that if you aren’t intrinsically happy with it getting what it wants, you probably won’t be happy because the goals happen to overlap enough (e.g. if it wants X’s to exist, and it turns out that X’s are conscious and have valuable experiences, you probably still aren’t going to get a morally-relevant amount of morally valuable experience this way, because no one is optimizing for it).
Interesting… my model of Paul didn’t assign any work in moral philosophy high priority.
That makes it easier for any particular question to top the list.
To confirm my understanding (and to clarify for others), this is because you think most questions in moral philosophy can be deferred until after we solve AI alignment, whereas the particular question in the OP can’t be deferred this way? If this is correct, what about my idea here which also can’t be deferred (without losing a lot of value as time goes on) and potentially buys a lot more than reducing the AI risk in this universe?
I agree that literal total utilitarianism doesn’t care about any worlds at all except infinite worlds (and for infinite worlds its preferences are undefined). I think it is an unappealing moral theory for a number of reasons (as are analogs with arbitrary but large bounds), and so it doesn’t have much weight in my moral calculus. In particular, I don’t think that literal total utilitarianism is the main component of the moral parliament that cares about astronomical waste.
(To the extent it was, it would still advocate getting “normal” kinds of influence in our universe, which are probably dominated by astronomical waste, in order to engage in trade, so it also doesn’t seem to me like this argument would change our actions too much, unless we are making a general inference about the “market price” of astronomical resources across a broad basket of value systems.)
Is your more general point that we might need to make moral trades now, from behind the veil of ignorance?
I agree that some value is lost that way. I tend to think it’s not that large, since:
I don’t see particular ways we are losing large amounts of value.
My own moral intuitions are relatively strong regarding “make the trades you would have made from behind the veil of ignorance”; I don’t think that I literally need to remain behind the veil. I expect most people have, or would have, similar views. (I agree this isn’t 100%.)
It seems like we can restore most of the gains with acausal trade at any rate, though I agree not all of them.
If your point is that we should figure out what fraction of our resources to allocate towards being selfish in this world: I agree there is some value lost here, but again it seems pretty minor to me given:
The difficulty of doing such trades early in history (e.g. the parts of me that care about my own short-term welfare are not effective at making such trades based on abstract reasoning, since their behavior is driven by what works empirically). Even though I think this will be easy eventually, it doesn’t seem easy now.
The actual gains from being more selfish are not large. (I allocate my resources roughly 50/50 between impartial and self-interested action. I could perhaps make my life 10-20% better by allocating everything to self-interested action, which implies that I’m effectively paying a 5x penalty to spend more resources in this world; see the sketch after this list.)
Selfish values are still heavily influenced by what happens in simulations, by the way that my conduct is evaluated by our society after AI is developed, etc.
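A minimal sketch of where the “5x penalty” figure in the second point could come from, assuming the intended comparison is marginal versus average personal value per unit of resources (the comment itself doesn’t spell this out):

```latex
% Sketch of the "5x penalty" arithmetic, under the assumption stated above.
% Let W(s) = personal welfare when a fraction s of resources goes to self-interested action.
% Doubling the selfish fraction (0.5 -> 1.0) adds only about 10-20% to welfare:
%   W(1.0) - W(0.5) ~ (0.1 to 0.2) * W(0.5),
% while the first half of the resources bought all of W(0.5).
\[
  \frac{\text{value of marginal selfish resources}}{\text{value of selfish resources already spent}}
  \;\approx\; \frac{0.2\,W(0.5)}{W(0.5)} \;=\; \frac{1}{5},
\]
% i.e. an effective penalty of roughly 5x (closer to 10x if the gain is only 10%).
```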
It seems like the preferences of the AI you build are way more important than its experience (not sure if that’s what you mean).
This is because the AI’s preferences are going to have a much larger downstream impact?
I’d agree, but with the caveat that there may be likely futures which don’t involve the creation of hyper-rational AIs with well-defined preferences, but rather artificial life with messy, incomplete, inconsistent preferences yet morally valuable experiences. More generally, the future of the light cone could be determined by societal/evolutionary factors rather than any particular agent or agent-y process.
I found your 2nd paragraph unclear...
the goals happen to overlap enough
Is this referring to the goals of having “AIs that have good preferences” and “AIs that have lots of morally valuable experience”?