or more centrally, long after I finish the course of action.
I don’t understand why the more central thing is “long after I finish the course of action” as opposed to “in ways that are clearly ‘external to’ the process called ‘me’, that I used to take the actions.”
Hmm, yeah that too. What I had in mind was the idea that “consequentialist” usually has a connotation of “long-term consequentialist”, e.g. taking multiple actions over time that consistently lead to something happening.
For example:
Instrumental convergence doesn’t bite very hard if your goals are 15 seconds in the future.
If an AI acts to maximize long-term paperclips at 4:30pm, to minimize long-term paperclips at 4:31pm, to maximize them at 4:32pm, to minimize them at 4:33pm, and so on, then we wouldn’t intuitively think of that AI as a consequentialist rational agent, even if it is technically a consequentialist rational agent at each moment.
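To make the oscillating-objective scenario concrete, here is a minimal toy sketch (the function names and numbers are my own illustration, not anything from the discussion above): an agent that greedily optimizes a paperclip objective whose sign flips each minute. Each step is "consequentialist" in isolation, but the steps don't add up to consistent long-term steering of the world.

```python
# Toy illustration (hypothetical): an agent whose objective over long-term
# paperclips flips sign every minute. At each moment it optimizes a consequence,
# but across time its actions cancel out rather than consistently pushing the
# world toward anything.

def paperclip_utility(minute: int, paperclips: float) -> float:
    """Sign-flipping objective: maximize paperclips on even minutes, minimize on odd ones."""
    sign = 1 if minute % 2 == 0 else -1
    return sign * paperclips

def choose_action(minute: int) -> float:
    """Greedy one-step choice: +1 builds a paperclip, -1 destroys one."""
    return max((-1.0, 1.0), key=lambda a: paperclip_utility(minute, a))

paperclips = 0.0
for minute in range(30, 34):  # 4:30pm .. 4:33pm
    action = choose_action(minute)
    paperclips += action
    print(f"4:{minute}pm  action={action:+.0f}  total paperclips={paperclips:.0f}")
# Each step optimizes its momentary objective, but the long-run count just oscillates.
```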