The most important reason for our view is that we are optimistic about the following:
The following action is quite natural and hence salient to many different agents: commit to henceforth doing your best to benefit the aggregate values of the agents you do ECL with.
Commitment of this type is possible.
All agents are in a reasonably similar situation to each other when it comes to deciding whether to make this abstract commitment.
We’ve discussed this before, but I want to flag the following, both because I’m curious how much other readers share my reaction to the above and because I want to elaborate a bit on my position:
The above seems to be a huge crux for how common and relevant to us ECL is. I’m glad you’ve made this claim explicit! (Credit to Em Cooper for making me aware of it originally.) And I’m also puzzled why it hasn’t been emphasized more in ECL-keen writings (as if it’s obvious?).
While I think this claim isn’t totally implausible (it’s an update in favor of ECL for me, overall), I’m unconvinced because:
I think genuinely intending to do X isn’t the same as making my future self do X. Now, of course my future self can just do X; it might feel very counterintuitive, but if a solid argument suggests this is the right decision, I like to think he’ll take that argument seriously. But we have to be careful here about what “X” my future self is doing:
Let’s say my future self finds himself in a concrete situation where he can take some action A that is much better for [broad range of values] than for his values.
If he does A, is he making it the case that current-me is committed to [help a broad range of values] (and therefore acausally making it the case that others in current-me’s situation act according to such a commitment)?
It’s not clear to me that he is. This is philosophically confusing, so I’m not confident in the following, but: I think the more plausible model of the situation is that future-me decides to do A in that concrete situation, and so others who make decisions like him in that concrete situation will do their analogue of A. His knowledge of the fact that his decision to do A wasn’t the output of argmax E(U_{broad range of values}) screens off the influence on current-me. (So your third bullet point wouldn’t hold.)
In principle I can make cruder nudges that incline my future self toward helping different values, like immersing myself in communities with different values. But:
I’d want to be very wary about making irreversible value changes based on an argument that seems so philosophically complex, with various cruxes I might drastically change my mind on (including my poorly informed guesses about the values of others in my situation). An idealized agent could make a fancy conditional commitment like “change my values, but revert back to the old ones if I come to realize the argument in favor of this change was confused”; unfortunately I’m not such an agent.
I’d worry that the more concrete we get in specifying the decision of what crude nudges to make, the more idiosyncratic my decision situation becomes, such that, again, your third bullet point would no longer hold.
These crude nudges might be quite far from the full commitment we wanted in the first place.