This seems better described as a variant of the traditional paradox of hedonism. That is, some goals (e.g. long-term happiness) are best achieved by agents who do not explicitly aim only at this goal, and who can instead be trusted to keep to their commitments even if it turns out that they’d benefit from defecting.
That doesn’t really sound like a paradox, just more evidence that people are very suboptimal optimizers. If the goal is long-term happiness, and some actions are more conducive to that than the actions most people come up with when aiming for long-term happiness, then that only indicates we’re bad at reasoning about long-term goals.
Hmm, I think you’ve missed something if you can’t tell this apart from the general phenomenon of being “very suboptimal optimizers”. The problem isn’t that bad consequences result from our seeking pleasure ineptly. It’s instead that bad consequences result from our seeking pleasure (even if all our means-end calculations are perfectly accurate).
I agree that it’s a rather loose use of the term ‘paradox’, but this is the standard term for the phenomenon, dating back more than a century now. For more background, see the Stanford Encyclopedia of Philosophy and Wikipedia. (Parfit’s ‘rational irrationality’ is also related.)
It’s instead that bad consequences result from our seeking pleasure (even if all our means-end calculations are perfectly accurate).
That sounds like a contradiction. If you’re perfect at doing means-end calculations, and the best way to attain pleasure or happiness is something other than seeking it directly, then your calculations will tell you that, and you will do it.
Maybe I’m missing something, but this sounds more like an Aesop about the perils of hedonism, and I’m not sure it would apply to perfect decision-makers.
It’s no contradiction. Perfect means-end calculations merely ensure that you’ll choose the best of the options available given that you’ve made a means-end calculation. But you might have different (and better) options if you never made any such calculation. (For a crude illustration, imagine that God exists and will reward people who never make any attempt at instrumental reasoning.) By the time your calculations tell you that you never should have calculated in the first place, it’s too late.
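To make that crude illustration a bit more concrete, here’s a minimal toy model (all the names, numbers, and the “ever calculates” flag are invented for illustration): the payoff depends on the agent’s disposition, so perfect calculation only gets you the best of the options still available once you’ve calculated.

```python
# Toy model of the illustration above: the payoff depends on whether the agent
# has ever calculated, not just on which option it finally picks.
# All names and numbers here are invented purely for illustration.

REWARD_FOR_NEVER_CALCULATING = 100  # granted only to agents who never reason instrumentally
BEST_PAYOFF_GIVEN_CALCULATION = 60  # best option still reachable once you have calculated

def payoff(ever_calculates: bool) -> int:
    """Payoff as a function of the agent's disposition."""
    if ever_calculates:
        # Perfect means-end reasoning now picks the best *remaining* option,
        # but the larger reward is already off the table.
        return BEST_PAYOFF_GIVEN_CALCULATION
    return REWARD_FOR_NEVER_CALCULATING

print(payoff(ever_calculates=True))   # 60  -- optimal given that you calculated
print(payoff(ever_calculates=False))  # 100 -- better, but not reachable by calculating your way there
```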
Perfect decision-makers, with perfect information, should always be able to attain the optimal outcome in any situation. Likewise, perfect decision-makers with limited information should always be able to choose the option with the best expected payoff under strict Bayesian reasoning.
However, when the actor’s decision-making process becomes part of the situation under consideration, as happens when Katemega scrutinises Joe’s potential for leaving her in the future, then the perfect decision-maker is only able to choose the optimal outcome if he is also capable of perfect self-modification. Without that ability, he’s vulnerable to his own choices and preferences changing in the future, which he can’t control right now.
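Here’s a toy sketch of that point, with made-up utilities and an assumed acceptance probability: once Kate’s decision tracks Joe’s disposition, his expected payoff is fixed by whether he would ever leave, not by what looks best after the wedding.

```python
# Toy sketch: Kate's acceptance probability tracks Joe's disposition (the
# problem's assumption), so his expected payoff is set by that disposition.
# All utilities and probabilities below are made up for illustration.

U_MARRIED_STAYS  = 100  # Joe's utility if he marries and stays
U_MARRIED_LEAVES = 120  # his utility if he marries and later leaves when it suits him
U_REJECTED       = 0    # his utility if Kate turns him down

def p_accept(would_ever_leave: bool) -> float:
    """Kate accurately senses Joe's disposition and decides accordingly."""
    return 0.2 if would_ever_leave else 0.9

def expected_payoff(would_ever_leave: bool) -> float:
    p = p_accept(would_ever_leave)
    u_if_married = U_MARRIED_LEAVES if would_ever_leave else U_MARRIED_STAYS
    return p * u_if_married + (1 - p) * U_REJECTED

print(expected_payoff(False))  # 90.0 -- the genuinely committed disposition
print(expected_payoff(True))   # 24.0 -- even though leaving looks better *after* the wedding
```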
I’d also like to draw a distinction between a practical pre-commitment (of the form “leaving this marriage will cause me -X utilons due to financial penalty or cognitive dissonance for breaking my vows”), and an actual self-modification to a mind state where “I promised I would never leave Kate, but I’m going to do it anyway now” is not actually an option. I don’t think humans are capable of the latter. An AI might be, I don’t know.
Also, what about decisions Joe made in the past (for example, deciding when he was eighteen that there was no way he was ever going to get married, because being single was too much fun)? If you want your present state to influence your future state strongly, you have to accept the influence of your past state on your present state just as strongly, and you can’t just say “Oh, but I’m older and wiser now” in one instance but not the other.
Without the ability to self-modify into a truly sincere state wherein he’ll never leave Kate no matter what, Joe can’t be completely sincere, and (by the assumptions of the problem) Kate will sense this and the chances of his proposal being accepted will diminish. And there’s nothing he can do about that.
I have to note that an agent using one of the new decision theories sometimes discussed around here, like UDT, wouldn’t leave Katemega and wouldn’t need self-modification or precommitment to not leave her.
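Very loosely, and reusing the same invented numbers as the sketch above, the difference is that such an agent scores whole policies rather than the act in front of it, since the prediction tracks the policy; this is only a cartoon of that idea, not UDT itself.

```python
# Cartoon of policy-level evaluation (not UDT itself): score whole policies,
# knowing the prediction tracks the policy, then act on the winning policy.
# Same invented utilities and probabilities as the sketch above.

POLICIES = {
    "stay no matter what": False,   # would this policy ever leave? -> no
    "leave if it suits me": True,   # -> yes
}

def policy_value(would_ever_leave: bool) -> float:
    p_accept = 0.2 if would_ever_leave else 0.9
    u_if_married = 120 if would_ever_leave else 100
    return p_accept * u_if_married  # rejection is worth 0, as before

best_policy = max(POLICIES, key=lambda name: policy_value(POLICIES[name]))
print(best_policy)  # "stay no matter what" -- so when the choice later arrives, the agent simply stays
```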