There is a big difference between having time-inconsistent preferences and having time-inconsistent strategies that arise from the strategic incentives of the game you are playing.
I can see why a human would have time-inconsistent strategies, because of inconsistent preferences between their past and future selves, hyperbolic discounting functions, that sort of thing. I am quite at a loss to understand why an agent with a constant, external utility function should exhibit inconsistent strategies under any circumstance, regardless of strategic incentives. Expected utility lets us add up conflicting incentives and reduce them to a single preference: a multiplicity of strategic incentives is no excuse for inconsistency.
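To make that last claim concrete, here is a minimal sketch of how a fixed utility function collapses several conflicting incentives into a single preference ordering over actions. The payoff numbers, outcome probabilities, and the particular weighting in `utility` are invented purely for illustration and are not part of the exchange.

```python
# A minimal sketch (payoffs, probabilities, and the utility weighting are
# invented for illustration) of how one fixed utility function turns
# several conflicting incentives into a single preference ordering.

def expected_utility(outcome_probs, utility):
    """Return the sum over outcomes of P(outcome | action) * U(outcome)."""
    return sum(p * utility(outcome) for outcome, p in outcome_probs.items())

def utility(outcome):
    # Two conflicting incentives, money and trust, weighed by one fixed function.
    money, trust = outcome
    return money + 500 * trust

# P(outcome | action), with outcomes written as (money, trust) pairs.
actions = {
    "defect":    {(1000, 0): 0.9, (1000, 1): 0.1},
    "cooperate": {(200, 1): 0.8, (200, 0): 0.2},
}

# However many incentives feed into U, the agent ends up with one ranking.
ranking = sorted(actions, key=lambda a: expected_utility(actions[a], utility),
                 reverse=True)
print(ranking)  # ['defect', 'cooperate'] with these made-up numbers
```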
I am a Bayesian; I don’t believe in probability calculations that come out different ways when you do them using different valid derivations. Why should I believe in decisional calculations that come out in different ways at different times?
I’m not sure that even a causal decision theorist would agree with you that strategic inconsistency is acceptable. They would just insist that there is an important difference between deciding to take only box B at 7:00am versus 7:10am, if Omega chooses at 7:05am, because in the former case you cause Omega’s action and in the latter case you do not. In other words, they would insist that the two situations are importantly different, not that time inconsistency is okay.
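A toy calculation spells out the timing distinction the causal decision theorist is drawing. It assumes the standard Newcomb payoffs, $1,000 always in box A and $1,000,000 in box B only if Omega predicts one-boxing; the dialogue itself does not state these figures.

```python
# A toy rendering of the timing point, assuming the standard Newcomb payoffs
# ($1,000 always in box A, $1,000,000 in box B iff Omega predicts one-boxing).

def payoff(choice, box_b_full):
    box_b = 1_000_000 if box_b_full else 0
    box_a = 1_000 if choice == "two-box" else 0  # two-boxers also take box A
    return box_b + box_a

choices = ("one-box", "two-box")

# Deciding at 7:00am, before Omega acts at 7:05am: on the causal decision
# theorist's reading, your decision determines what Omega puts in box B.
before = {c: payoff(c, box_b_full=(c == "one-box")) for c in choices}

# Deciding at 7:10am, after the boxes are filled: the contents are fixed,
# and two-boxing gains $1,000 whichever way they were fixed.
after_full = {c: payoff(c, box_b_full=True) for c in choices}
after_empty = {c: payoff(c, box_b_full=False) for c in choices}

print(before)       # {'one-box': 1000000, 'two-box': 1000}    -> take only box B
print(after_full)   # {'one-box': 1000000, 'two-box': 1001000} -> two-boxing dominates
print(after_empty)  # {'one-box': 0, 'two-box': 1000}          -> two-boxing dominates
```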
And I observe again that a self-modifying AI which finds itself with time-inconsistent preferences, strategies, what-have-you, will not stay in this situation for long—it’s not a world I can live in, professionally speaking.
Trying to find a set of preferences that avoids all strategic conflicts between your different actions seems a fool’s errand.
I guess I completed the fool’s errand, then...
Do you at least agree that self-modifying AIs tend not to contain time-inconsistent strategies for very long?