It’s truly amazing just how much of the posts and discussions on LW you repeatedly ignore, Phil. There is a plurality opinion here that it can be rational to execute a strategy which includes actions that don’t maximize utility when considered as one-shot actions, but such that the overall strategy does better.
I can genuinely understand disagreement on this proposal, but could you at least acknowledge that the rest of us exist and say things like “first-order rationality finds revenge irrational” or “altruistic sacrifices that violate causal decision theory” instead?
I’m not sure what you mean by “first order rationality”. But whatever the definition, it seems that it’s not first-order rationality itself that finds revenge irrational, but your own judgment of value, which depends on preferences. An agent may well like hurting people who previously hurt it (that is, people who have the property of having previously hurt it).
Huh— a Google search returns muddled results. I had understood first-order (instrumental) rationality to mean something like causal decision theory: that given a utility function, you extrapolate out the probable consequences of your immediate options and maximize the expected utility. The problem with this is that it doesn’t take into account the problems with being modeled by others, and thus leaves you open to being exploited (Newcomblike problems, Chicken) or losing out in other ways (known-duration Prisoner’s Dilemma).
I was also taking for granted what I assumed to be the setup with the revenge scenario: that the act of revenge would be a significant net loss to you (by your utility function) as well as to your target. (E.g. you’re the President, and the Russians just nuked New York but promised to stop there if you don’t retaliate; do you launch your nukes at Russia?)
Phil’s right that a known irrational disposition towards revenge (which evolved in us for this reason) could have deterred the Russians from nuking NYC in the first place, whereas they would know they could get away with it if they knew you were a causal decision theorist. But the form of decision process I’m considering (optimizing over strategies, not actions, while taking into account others’ likely decision algorithms given a known strategy for me) also knowably avenges New York, and thus deters the Russians.
EDIT: First paragraph was a reply to Vladimir’s un-edited comment, in which he also asked what definition of first-order rationality I meant.
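To make the actions-versus-strategies distinction above concrete, here is a toy sketch of the deterrence scenario. Everything in it is an assumption made for illustration: the payoff numbers, the function names, and especially the premise that the attacker knows the defender's policy perfectly and that the defender reliably carries it out. It is not anyone's formal decision theory, just the shape of the argument.

```python
# Hypothetical payoffs as (defender, attacker); the numbers are illustrative only.
PAYOFFS = {
    ("no_attack", None): (0, 0),
    ("attack", "retaliate"): (-100, -100),  # revenge is a net loss to both sides
    ("attack", "stand_down"): (-50, 10),    # attacker gets away with it
}
RESPONSES = ["retaliate", "stand_down"]


def action_level_choice():
    """Choose the response with the best immediate payoff once the attack has happened."""
    return max(RESPONSES, key=lambda r: PAYOFFS[("attack", r)][0])


def attacker_move(defender_policy):
    """Attacker is assumed to know the policy and attacks only if that beats not attacking."""
    attack_payoff = PAYOFFS[("attack", defender_policy)][1]
    peace_payoff = PAYOFFS[("no_attack", None)][1]
    return "attack" if attack_payoff > peace_payoff else "no_attack"


def outcome(defender_policy):
    """Defender's payoff when the attacker best-responds to the known policy."""
    if attacker_move(defender_policy) == "attack":
        return PAYOFFS[("attack", defender_policy)][0]
    return PAYOFFS[("no_attack", None)][0]


def strategy_level_choice():
    """Choose the policy whose overall outcome is best, not the best one-shot action."""
    return max(RESPONSES, key=outcome)


print("one-shot choice after an attack:", action_level_choice())
print("policy chosen over strategies:", strategy_level_choice())
```

Under these made-up numbers, the one-shot maximizer stands down after an attack and so invites one, while the strategy-level optimizer commits to retaliation and is never attacked. Whether real agents can make such a policy knowable is a separate question.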
Sorry for the confusion with re-editing. I took out the question after deciding that by first-order rational decisions you most likely meant those that don’t require you to act as if you believe something you don’t (that is, believe to be false), which is often practically impossible. On reflection, this doesn’t fit either.
Okay. First-order rationality finds revenge irrational. I’m not ignoring it. It is simply irrelevant to the point I was making. A person who does your will because it makes them happy to do so, or because they are irrationally biased to do so, is more reliable than one who does your will as long as his calculus tells him to.
A person who does your will because it makes them happy to do so, or because they are irrationally biased to do so, is more reliable than one who does your will as long as his calculus tells him to.
Not if the latter explicitly exhibits the form of that calculus; then you can extrapolate their future decisions yourself, more easily than you can extrapolate the decisions of the former. Higher-order rationality includes finding a decision algorithm which can’t be exploited if known in this manner.
Of course, actually calculating and reliably acting accordingly is a high standard for unmodified humans, and it’s a meaningful question whether incremental progress toward that ideal will lead to a more reliable or less reliable agent. But that’s an empirical question, not a logical one.
Not if the latter explicitly exhibits the form of that calculus; then you can extrapolate their future decisions yourself, more easily than you can extrapolate the decisions of the former.
More easily? It’s easier to predict decisions based on a calculus than decisions based on stimulus-response? That’s simply false.
Note that in the fMRI example, it is impossible to examine the calculus. You can only examine the level of bias. There is no way for somebody to say, “Oh, he’s unbiased, but he has an elaborate Yudkowskian utility function that will lead him to act in ways favorable to me.”
It’s truly amazing just how much of the posts and discussions on LW you repeatedly ignore, Phil. There is a plurality opinion here that it can be rational to execute a strategy which includes actions that don’t maximize utility when considered as one-shot actions, but such that the overall strategy does better.
I can genuinely understand disagreement on this proposal, but could you at least acknowledge that the rest of us exist and say things like “first-order rationality finds revenge irrational” or “altruistic sacrifices that violate causal decision theory” instead?
What he said.