Wikipedia has this right:
“a rational agent is specifically defined as an agent which always chooses the action which maximises its expected performance, given all of the knowledge it currently possesses.”
http://en.wikipedia.org/wiki/Rationality
Expected performance. Not actual performance. Whether its actual performance is good or not depends on other factors—such as how malicious the environment is, whether the agent’s priors are good—and so on.
Problem with that in human practice is that it leads to people defending their ruined plans, saying, “But my expected performance was great!” Vide the failed trading companies saying it wasn’t their fault, the market had just done something that it shouldn’t have done once in the lifetime of the universe. Achieving a win is much harder than achieving an expectation of winning (i.e. something that it seems you could defend as a good try).
Expected performance is what rational agents are actually maximising.
Whether that corresponds to actual performance depends on what their expectations are. What their expectations are typically depends on their history—and the past is not necessarily a good guide to the future.
Highly rational agents can still lose. Rational actions (that follow the laws of induction and deduction applied to their sense data) are not necessarily the actions that win.
Rational agents try to win—and base their efforts on their expectations. Whether they actually win depends on whether their expectations are correct. In my view, attempts to link rationality directly to “winning” miss the distinction between actual and expected utility.
There are reasons for associations between expected performance and actual performance. Indeed, those associations are why agents have the expectations they do. However, the association is statistical in nature.
Dissect the brain of a rational agent, and it is its expected utility that is being maximised. Its actual utility is usually not something that is completely under its control.
It’s important not to define the “rational action” as “the action that wins”. Whether an action is rational or not should be a function of an agent’s sense data up to that point—and should not vary depending on environmental factors which the agent knows nothing about. Otherwise, the rationality of an action is not properly defined from an agent’s point of view.
I don’t think that the excuses humans use for failures is an issue here.
Behaving rationally is not the only virtue needed for success. For example, you also need to enter situations with appropriate priors.
Only if you want rationality to be the sole virtue, should “but I was behaving rationally” be the ultimate defense against an inquisition.
Rationality is good, but to win, you also need effort, persistence, good priors, etc—and it would be very, very bad form to attempt to bundle all those into the notion of being “rational”.
Does that mean that I should mechanically overwrite my beliefs about the chance of a lottery ticket winning, in order to maximize my expectation of the payout? As Nesov says, rationality is about utility; which is why a rational agent in fact maximizes their expectation of utility, while trying to maximize utility (not their expectation of utility!).
It may help to understand this and some of the conversations below if you realize that the word “try” behaves a lot like “quotation marks” and that having an extra “pair” of quotation “marks” can really make “your” sentences seem a bit odd.
I’m not sure I get this at all.
I offer you a bet: I’ll toss a coin and give you £100 if it comes up heads; you give me £50 if it comes up tails. Presumably you take the bet, right? Because your expected return is £25. Surely this is the sense in which rationalists maximise expected utility. We don’t mean “the amount of utility they expect to win”, but expectation in the technical sense: the sum, over the possible events, of the probability of each event multiplied by the utility of the universes in which it happens (or, more properly, an integral...)
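A minimal sketch of that technical sense of expectation, using the bet’s own numbers:

```python
# The bet above: win £100 on heads, lose £50 on tails, with a fair coin.
outcomes = {
    "heads": (0.5, 100),    # (probability, payoff in £)
    "tails": (0.5, -50),
}

expected_return = sum(p * payoff for p, payoff in outcomes.values())
print(expected_return)      # 25.0: positive, so taking the bet maximises expected utility
```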
If you expect to lose £50 and you are wrong, that doesn’t actually say anything about the expectation of your winnings.
It does, however, say something about your expectation of your winnings. Expectation can be very knowledge-dependent. Let’s say someone rolls two six-sided dice, and then offers you a bet where you win $100 if the sum of the dice is less than 5, but lose $10 if the sum is greater than 5. You might perform various calculations to determine your expected value of accepting the bet. But if I happen to peek and see that one of the dice has landed on 6, then I will calculate a different expected value than you will.
So we have different expected values for calculating the bet, because we have different information.
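A quick enumeration makes the knowledge dependence concrete. This is only a sketch; it assumes a sum of exactly 5 is a push, since the bet as stated leaves that case open:

```python
from itertools import product

def expected_value(peeked_die=None):
    """Expected payoff of the dice bet: win $100 if the sum is less than 5,
    lose $10 if it is greater than 5 (a sum of exactly 5 is treated as a
    push, which is an assumption). Optionally condition on having peeked
    at one die."""
    total, count = 0.0, 0
    for d1, d2 in product(range(1, 7), repeat=2):
        if peeked_die is not None and d1 != peeked_die:
            continue                      # keep only rolls consistent with the peek
        s = d1 + d2
        payoff = 100 if s < 5 else (-10 if s > 5 else 0)
        total += payoff
        count += 1
    return total / count

print(expected_value())                   # ~9.44 with no extra information
print(expected_value(peeked_die=6))       # -10.0 after seeing that one die shows 6
```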
So EY’s point is that if a rational agent’s only purpose was to maximize (their) expected utility, they could easily do this by selectively ignoring information, so that their calculations turn out a specific way.
But actually rational agents are not interested in maximizing (their) expected utility. They are interested in maximizing real utility. Except it’s impossible to do this without perfect information, and so what agents end up doing is maximizing expected utility, although they are trying to maximize real utility.
It’s like if I’m taking a history exam in school. I am trying to achieve 100% on the exam, but end up instead achieving only 60% because I have imperfect information. My goal wasn’t 60%, it was 100%. But the actual actions I took (the answers I selected) led me to arrive at 60% instead of my true goal.
Rational agents are trying to maximize real utility, but end up maximizing expected utility (by definition), even though that’s not their true goal.
Re: Does that mean that I should mechanically overwrite my beliefs about the chance of a lottery ticket winning, in order to maximize my expectation of the payout?
No, it doesn’t. It means that the process going on in the brains of intelligent agents can be well modelled as calculating expected utilities—and then selecting the action that corresponds to the largest one.
Intelligent agents are better modelled as Expected Utility Maximisers than Utility Maximisers. Whether they actually maximise utility depends on whether they are in an environment where their expectations pan out.
By definition, intelligent agents want to maximize total utility. In the absence of perfect knowledge, they act on expected utility calculations. Is this not a meaningful distinction?
Re: Does that mean that I should mechanically overwrite my beliefs about the chance of a lottery ticket winning, in order to maximize my expectation of the payout?
No, it doesn’t. It means that the process going on in the brains of intelligent agents can be accurately modelled as calculating expected utilities—and then selecting the action that corresponds to the largest of these.
Agents are better modelled as Expected Utility Maximisers than as Utility Maximisers. Whether an Expected Utility Maximiser actually maximises utility depends on whether it is in an environment where its expectations pan out.
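A toy sketch of that modelling claim, with made-up actions, beliefs and payoffs: the agent can only maximise over its own beliefs, and the utility it actually receives depends on which state obtains.

```python
def choose_action(actions, beliefs, utility):
    """Return the action with the highest expected utility under the
    agent's beliefs (a mapping from state to probability)."""
    def expected_utility(action):
        return sum(p * utility(action, state) for state, p in beliefs.items())
    return max(actions, key=expected_utility)

# Illustrative names and numbers only.
actions = ["take_bet", "decline"]
beliefs = {"favourable": 0.8, "unfavourable": 0.2}      # the agent's expectations
payoffs = {("take_bet", "favourable"): 10, ("take_bet", "unfavourable"): -30,
           ("decline", "favourable"): 0,  ("decline", "unfavourable"): 0}
utility = lambda action, state: payoffs[(action, state)]

chosen = choose_action(actions, beliefs, utility)
print(chosen)                           # "take_bet": the expected-utility maximising choice
print(utility(chosen, "unfavourable"))  # -30: the actual utility if the expectations don't pan out
```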
I am inclined to argue along exactly the same lines as Tim, though I worry there is something I am missing.
It’s true that people make this kind of response, but that doesn’t make it valid, or mean that we have to throw away the notion of rationality as maximizing expected performance, rather than actual performance.
In the case of failed trading companies, can’t we just say that despite their fantasies, their expected performance shouldn’t have been so great as they thought? And the fact that their actual results differed from their expected results should cast suspicion on their expectations.
Perhaps we can say that expectations about performance must be epistemically rational, and only then can an agent who maximizes their expected performance be instrumentally rational.
Some expectations win. Some expectations lose. Yet not all expectations are created equal. Non-accidental winning starts with something that seems good to try (can accidental winning be rational?). At least, there is some link between expectations and rationality, such that we can call some expectations more rational than others, regardless of whether they actually win or lose.
An example SoullessAutomaton made was that we shouldn’t consider lottery winners rational, even though they won, because they should not have expected to. Conversely, all sorts of inductive expectations can be rational, even though sometimes they will fail due to the problem of induction. For instance, it’s rational to expect that the sun will rise tomorrow. If Omega decides to blow up the sun, my expectation will still have been rational, even though I turned out to be wrong.
In the real world, of course, most things are some mixture of controllable and randomized. Depending on your definition of accidental, it can be rational to make low-cost steps to position yourself to take advantage of possible events you have no control over. I wouldn’t call this accidental, however, because the average expected gain should be net positive, even if one expects (id est, with confidence greater than 50%) to lose.
I used the lottery as an example because it’s generally a clear-cut case where the expected gain minus the cost of participating is net negative and the controllable factor (how many tickets you buy) has extremely small impact.
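For concreteness, a back-of-the-envelope version of that clear-cut case, with purely illustrative figures (only the sign of the result matters):

```python
# Illustrative lottery (figures invented for the example): $2 ticket,
# $100,000,000 jackpot, 1-in-292,000,000 chance of winning.
ticket_price = 2.0
jackpot = 100_000_000
p_win = 1 / 292_000_000

expected_gain = p_win * jackpot - ticket_price
print(expected_gain)   # about -1.66 per ticket: clearly negative
```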
Yes, and I liked your example for exactly this reason: the expected value of buying lottery tickets is negative.
I think that this shows that it is irrational to take an action where it’s clear-cut that the expected value is negative, even though due to chance, one iteration of that action might produce a positive result. You are using accidental the same way I am: winning from an action with a negative expected value is what I would call accidental, and winning with a positive expected value is non-accidental.
Things are a bit more complicated when we don’t know the expected value of an action. For example, in Eliezer’s examples of failed trading companies, we don’t know the correct expected value of their trading strategies, or whether they were positive or negative.
In cases where the expected value of an action is unknown, perhaps the instrumental rationality of the action is contingent on the epistemic rationality of our estimation of its expected value.
I like your definition of an accidental win, it matches my intuitive definition and is stated more clearly than I would have been able to.
Yes. Actually, I think the “In cases where the expected value of an action is unknown” clause is likely unnecessary, because the accuracy of an expected value calculation is always at least slightly uncertain.
Furthermore, the second-order calculation of the expected value of expending resources to increase epistemological rationality should be possible; and in the case that acting on a proposition is irrational due to low certainty, and the second-order value of increasing certainty is negative, the rational thing to do is shrug and move on.
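One way to read that second-order calculation is as a rough value-of-information estimate; the numbers below are made up purely for illustration:

```python
def expected_value(beliefs, payoff):
    # Probability-weighted sum of payoffs over possible states.
    return sum(p * payoff[state] for state, p in beliefs.items())

# Made-up decision: act on a proposition now, or pay to learn the true state first.
beliefs = {"good": 0.3, "bad": 0.7}          # low certainty that acting pays off
payoff_if_act = {"good": 100, "bad": -80}
payoff_if_pass = {"good": 0, "bad": 0}

# First-order: with low certainty, acting has negative expected value (-26), so pass.
ev_now = max(expected_value(beliefs, payoff_if_act),
             expected_value(beliefs, payoff_if_pass))

# Second-order: expected value of resolving the uncertainty before deciding
# (with perfect information you would act only in the "good" state).
ev_with_info = beliefs["good"] * payoff_if_act["good"] + beliefs["bad"] * 0
value_of_information = ev_with_info - ev_now   # 30

cost_of_learning = 40
if value_of_information - cost_of_learning < 0:
    print("shrug and move on")                 # not worth acting or investigating further
else:
    print("worth spending resources on more certainty")
```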
It sounds like the objection you’re giving here is that “some people will misinterpret expected performance in the technical sense as expected performance in the colloquial sense (i.e., my guess as to how things will turn out).” That doesn’t seem like much of a criticism though, and it doesn’t sound severe enough to throw out what is a pretty standard definition. People will also misinterpret your alternate definition, as we have seen.
Do you have other objections?
What you say is important: the vast majority of whining “rationalists” weren’t done dirty by a universe that “nobody could have foreseen” (the sub-prime mortgage crisis/piloting jets into buildings). If you sample a random loser claiming such (my reasoning was flawless, my priors incorporated all feasibly available human knowledge), an impartial judge would in nearly all cases correctly call them to task.
But clearly it’s not always the case that my reasoning (and/or priors) is at fault when I lose. My updates shouldn’t overshoot based on empirical noise and false humility. I think what you want to say is that most likely even (especially?) the most proud rationalists probably shield themselves from attributing their loss to their own error (“eat less salt”).
I’d like some quantifiable demonstration of an externalizing bias, some calibration of my own personal tendency to deny evidence of my own irrationality (or of my wrong priors).
I’m not sure how you can implement an admonition to Win and not just to (truly, sincerely) try. What is the empirical difference?
I suppose you could use an expected regret measure (that is, the difference between the ideal result and the result of the decision summed across the distribution of probable futures) instead of an expected utility measure.
Expected regret tends to produce more robust strategies than expected utility. For instance, in Newcomb’s problem, we could say that two-boxing comes from expected utility but one-boxing comes from regret-minimizing (since a “failed” two-box gives $1,000,000 − $1,000 = $999,000 of regret, if you believe Omega would have acted differently if you had been the type of person to one-box, where a “failed” one-box gives $1,000 − $0 = $1,000 of regret).
Using more robust strategies may be a way to more consistently Win, though perhaps the true goal should be to know when to use expected utility and when to use expected regret (and therefore to take advantage both of potential bonanzas and of risk-limiting mechanisms).
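A sketch of the regret arithmetic in that comparison, taking at face value the assumption that Omega’s prediction tracks what kind of chooser you are (the dollar amounts are the standard Newcomb figures):

```python
# Payoffs for (Omega's prediction, your action), in dollars.
payoff = {
    ("predicts_one_box", "one_box"): 1_000_000,
    ("predicts_one_box", "two_box"): 1_001_000,
    ("predicts_two_box", "one_box"): 0,
    ("predicts_two_box", "two_box"): 1_000,
}

# "Failed" two-box: Omega predicted two-boxing, so box B was empty and you got $1,000.
# On the assumption that a one-boxer would have faced a full box B:
regret_failed_two_box = (payoff[("predicts_one_box", "one_box")]
                         - payoff[("predicts_two_box", "two_box")])

# "Failed" one-box: Omega wrongly predicted two-boxing, box B was empty, you got $0,
# while the best available outcome in that situation was the $1,000 in box A.
regret_failed_one_box = (payoff[("predicts_two_box", "two_box")]
                         - payoff[("predicts_two_box", "one_box")])

print(regret_failed_two_box)   # 999000
print(regret_failed_one_box)   # 1000
```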
I’m quite confident there is only a language difference between Eliezer’s description and the point a number of you have just made. Winning versus trying to win are clearly two different things, and it’s also clear that “genuinely trying to win” is the best one can do, based on the definition those in this thread are using. But Eli’s point on OB was that telling oneself “I’m genuinely trying to win” often results in less than genuinely trying. It results in “trying to try”...which means being satisfied by a display of effort rather than utility maximizing. So instead, he argues, why not say to oneself the imperative “Win!”, where he bakes the “try” part into the implicit imperative.

I agree Eli’s language usage here may be slightly non-standard for most of us (me included) and therefore perhaps misleading to the uninitiated, but I’m doubtful we need to stress about it too much if the facts are as I’ve stated. Does anyone disagree? Perhaps one could argue Eli should have to say, “Rational agents should win_eli” and link to an explanation like this thread, if we are genuinely concerned about people getting confused.
Eliezer seems to be talking about actually winning—e.g.: “Achieving a win is much harder than achieving an expectation of winning”.
He’s been doing this pretty consistently for a while now—including on his administrator’s page on the topic:
“Instrumental rationality: achieving your values.”
http://lesswrong.com/lw/31/what_do_we_mean_by_rationality/
That is why this discussion is still happening.
Here’s a functional difference: Omega says that Box B is empty if you try to win what’s inside it.
Yes! This functional difference is very important!
In logic, you begin with a set of non-contradicting assumptions and then build a consistent theory based on those assumptions. The deductions you make are analogous to being rational. If the assumptions are non-contradicting, then it is impossible to deduce something false in the system. (Analogously, it is impossible for rationality not to win.) However, you can get a paradox by having a self-referential statement. You can prove that every sufficiently complex theory is not closed—there are things that are true that you can’t prove from within the system. Along the same lines, you can build a paradox by forcing the system to try to talk about itself.
What Grobstein has presented is a classic paradox and is the closest you can come to rationality not winning.
I understand all that, but I still think it’s impossible to operationalize an admonition to Win. If Omega says that Box B is empty if you try to win what’s inside it, then you simply cannot implement a strategy that will give you the proceeds of Box B (unless you’re using some definition of “try” that is inconsistent with “choose a strategy that has a particular expected result”).
I think that falls under the “ritual of cognition” exception that Eliezer discussed for a while: when Winning depends directly on the ritual of cognition, then of course we can define a situation in which rationality doesn’t Win. But that is perfectly meaningless in every other situation (which is to say, in the world), where the result of the ritual is what matters.
Agents do try to win. They don’t necessarily actually win, for example if they face a superior opponent. Kasparov was behaving in a highly rational manner in his battle with Deep Blue. He didn’t win. He did try to, though. Thus the distinction between trying to win and actually winning.
see http://www.overcomingbias.com/2008/10/trying-to-try.html
It’s really easy to convince yourself that you’ve truly, sincerely tried—trying to try is not nearly as effective as trying to win.
The intended distinction was originally between trying to win and actually winning. You are comparing two kinds of trying.
Based on the above, I believe the distinction was between two different kinds of admonitions. I was pointing out that an admonition to win will cause someone to try to win, and an admonition to try will cause someone to try to try.
Thomblake’s interpretation of my post matches my own.
Right, but again, the topic is the definition of instrumental rationality, and whether it refers to “trying to win” or “actually winning”.
What do “admonitions” have to do with things? Are you arguing that, because telling someone to “win” may have some positive effect that telling someone to “try to win” lacks, we should define “instrumental rationality” to mean “winning”—and not “trying to win”?
Isn’t that an idiosyncrasy of human psychology—which surely ought to have nothing to do with the definition of “instrumental rationality”?
Consider the example of handicap chess. You start with no knight. You try to win. Actually you lose. Were you behaving rationally? I say: you may well have been. Rationality is more about the trying, than it is about the winning.
The question was about admonitions. I commented based on that. I didn’t mean anything further about instrumental rationality.
OK. I don’t think we have a disagreement, then.
I consider it to be a probably-true fact about human psychology that if you tell someone to “try” rather than telling them to “win” then that introduces failure possibilities into their mind. That may have a positive effect, if they are naturally over-confident—or a negative one, if they are naturally wracked with self-doubt.
It’s the latter group who buy self-help books: the former group doesn’t think it needs them. So the self-help books tell you to “win”—and not to “try” ;-)
Right, but again, the topic is the definition of instrumental rationality, and whether it refers to “trying to win” or “actually winning”.
What do “admonitions” have to do with things? Are you arguing that, because telling someone to “win” may have some positive effect that telling someone to “try to win” lacks, we should define “instrumental rationality” to mean “winning” and not “trying to win”?
Isn’t that an idiosyncrasy of human psychology—which surely ought to have nothing to do with the definition of “instrumental rationality”?
Consider the example of handicap chess. You start with no knight. You try to win. Actually you lose. Were you behaving rationally? I say: you may well have been. Rationality is more about the trying, than it is about the winning.
I agree. I’m just noting that an admonition to Win is strictly an admonition to try, phrased more strongly. Winning is not an action—it is a result. All I can suggest are actions that get you to that result.
I can tell you “don’t be satisfied with trying and failing,” but that’s not quite the same.
As for the “Trying-to-try” page—an argument from Yoda and the Force? It reads like something out of a self-help manual!
Sure: if you are trying to inspire confidence in yourself in order to improve your performance, then you might under some circumstances want to think only of winning—and ignore the possibility of trying and failing. But let’s not get our subjects in a muddle, here—the topic is the definition of instrumental rationality, not how some new-age self-help manual might be written.
Of course, this isn’t the first time I have pointed this out—see:
http://lesswrong.com/lw/33/comments_for_rationality/
Nobody seemed to have any coherent criticism the last time around—and yet now we have the same issue all over again.
It would seem we don’t appreciate your genius. Perhaps you should complain about this some more.
I’m not complaining, just observing. I see you are using the “royal we” again.
I wonder whether being surrounded by agents that agree with you is helping.
I agree with you that people shouldn’t drink fatal poison, and that 2+2=4. Should you feel worried because of that?
If it were also the case that your friends all agreed with you, but the “mainstream/dominant position in modern philosophy and decision theory” disagreed with you, then yes, you should probably feel a bit worried.
Good point, my reply didn’t take it into account. It all depends on the depth of understanding, so to answer your remark, consider e.g. the supernatural, or UFOs.
Is there really such a disagreement about Newcomb’s problem?
The issue seems to be whether agents can convincingly signal to a powerful agent that they will act in some way in the future—i.e. whether it is possible to make credible promises to such a powerful agent.
I think that this is possible—at least in principle. Eliezer also seems to think this is possible. I personally am not sure that such a powerful agent could achieve the proposed success rate on unmodified humans—but in the context of artificial agents, I see few problems—especially if Omega can leave the artificial agent with the boxes in a chosen controlled environment, where Omega can be fairly confident that they will not be interfered with by interested third parties.
Do many in “modern philosophy and decision theory” really disagree with that?
More to the point, do they have a coherent counter-argument?
Thanks for mentioning artificial agents. If they can run arbitrary computations, Omega itself isn’t implementable as a program due to the halting problem. Maybe this is relevant to Newcomb’s problem in general, I can’t tell.
Surely not a serious problem: if the agent is going to hang around until the universal heat death before picking a box, then Omega’s prediction of its actions doesn’t matter.