History teaches us, gentlemen, that great generals remain generals by never underestimating their opposition.
Gen. Antonio Lopez de Santa Anna, in The Alamo: Thirteen Days to Glory (1987, TV)
Overestimating can be costly too. That’s why bluffing can work, in poker as in war.
Examples/articles:
Empty Fort Strategy
100 horsemen and the empty city (gated). Here are two articles summarizing the original paper: Miami SBA and ScienceDaily
The most important decisions come before starting a war, and there the mistakes have very different costs. Overestimating your enemy results in peace (or cold war), which basically means that you just lose out on some opportunistic conquests, but underestimating your enemy results in a bloody, unexpectedly long war that can disrupt you for a decade or more—there are many nice examples of that in 20th-century history.
Peace or cold war are not the only possible outcomes. Surrender is another. An example is the conquest of the Aztecs by Cortez, discussed here, here, and here. Surrender can (but need not) have disastrous consequences too.
Generals are not the people who decide whether or not a war gets fought; they are the ones who decide individual battles.
If you’re unbiased then you should be underestimating your opposition about half the time.
If your loss function is severely skewed, you do NOT want to be unbiased.
What you want is to have a distribution. You will expect your opposition to be about as strong as it is. You will prepare for the possibility that it is stronger or weaker.
A distribution is nice but often you have to commit to a choice. In such cases you generally want to minimize your expected loss (or maximize the gain) and if the loss function is lopsided, the forecast implied by the choice can be very biased indeed.
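To make that concrete, here is a minimal sketch (not from the thread; the belief distribution and the 10:1 asymmetric loss are illustrative assumptions) in which minimizing expected loss picks a forecast well above the mean of the beliefs:

```python
# Minimal sketch (illustrative assumptions): beliefs about enemy strength are an
# unbiased sample; the loss is asymmetric, so the expected-loss-minimizing
# point forecast lands well above the mean of the beliefs.
import numpy as np

rng = np.random.default_rng(0)
beliefs = rng.normal(loc=100.0, scale=15.0, size=100_000)  # unbiased belief samples

def expected_loss(forecast, outcomes, under=10.0, over=1.0):
    # Underestimating the enemy costs 10x as much per unit as overestimating.
    err = outcomes - forecast
    return np.mean(np.where(err > 0, under * err, -over * err))

grid = np.linspace(60.0, 160.0, 1001)
best = grid[np.argmin([expected_loss(f, beliefs) for f in grid])]

print(f"mean of beliefs:        {beliefs.mean():6.1f}")  # ~100
print(f"loss-minimizing choice: {best:6.1f}")            # ~120, i.e. biased upward
```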
Even with a very skewed loss function you want to have an accurate estimate of your opposition, which will be an underestimate about half of the time, and then take excessive precautions. Your loss function does not influence your beliefs, only your actions.
Yes, but actions are what you should care about—if these are determined, your beliefs (which in this case do not pay rent) don’t matter much.
So why would I want to bias myself after I’ve decided to take excessive precautions?
I think we’re in agreement, btw: we care about actions, and if you have a very skewed loss function then it is rational to spend a lot of effort on improbable scenarios in which you lose heavily, which from the outside looks similar to a person with a less skewed loss function thinking that those scenarios are actually plausible. I was just trying to point out that DanielLC’s reply was correct and your previous one was not—even with a skewed loss function this should not produce feedback to the actual beliefs, only to your actions. So no, you DO want to be unbiased; it’s just that an unbiased estimate/posterior distribution can still lead to asymmetric behaviour (by which I mean spending an amount of time/effort preparing for a possible future that is disproportionate to the actual probability of this future occurring).
Well, let me unroll what I had in mind.
Imagine that you need to estimate a single value, a real number, and your loss function is highly skewed. For me this would work as follows:
1. Get a rough unbiased estimate.
2. Realize that I don’t care about the unbiased estimate because of my loss function.
3. Construct a known-biased estimate that takes into account my loss function.
4. Take this known-biased estimate as the estimate that I’ll use from now on.
5. Formulate a course of action on the basis of the biased estimate.
The point is that on the road to deciding on the course of action it’s very convenient to have a biased estimate that you will take as your working hypothesis.
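A minimal sketch of those five steps (my construction, not the commenter’s; the posterior, the 90% safety quantile, and the 1.2x planning margin are assumptions chosen only for illustration):

```python
# Illustrative sketch of the five steps above: start from an unbiased estimate,
# replace it with a known-biased working estimate that reflects the skewed loss,
# then plan against that working estimate.
import numpy as np

rng = np.random.default_rng(1)
posterior = rng.normal(loc=100.0, scale=15.0, size=50_000)  # beliefs about enemy strength

unbiased_estimate = posterior.mean()              # step 1: rough unbiased estimate
# step 2: the unbiased estimate ignores that underestimating is far more costly
working_estimate = np.quantile(posterior, 0.90)   # steps 3-4: known-biased working estimate
troops_to_raise = 1.2 * working_estimate          # step 5: plan against the working estimate

print(f"unbiased estimate: {unbiased_estimate:.0f}")   # ~100
print(f"working estimate:  {working_estimate:.0f}")    # ~119
print(f"plan (troops):     {troops_to_raise:.0f}")     # ~143
```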
Yes. My point is that this new biased estimate is not your ‘real estimate’—it is simply not your best guess/posterior distribution given your information. But as I remarked above, your rational actions given a skewed loss function resemble the actions of a rational agent with a less risk-averse loss function and a different estimate, so in order to determine your actions you can compute what [an agent with a less skewed loss function and your (deliberately) biased estimate] would do, and then just copy those actions.
But despite all of this, you still want to be unbiased. It’s fine to use the computational shortcut mentioned above to deal with skewed loss functions, but you need your beliefs to stay as accurate as possible so that you don’t end up with strange behaviour later. A small, simplified example:
Suppose you are in possession of $1001 total (all your assets included), and it costs $1000 to buy a cure for a fatal disease you happen to have / a ticket to heaven / insurance for cryonics. You most definitely don’t want to lose more than one dollar. Then a guy walks up to you and offers a bet—you pay $2, after which you are given a box which contains between $0 and $10, with uniform probability (this strange guy is losing money, yes). Clearly you don’t take the bet—since you don’t actually care much whether you have $1000 or $1001 or $1009, but would be terribly sad if you had only $999.

But instead of doing the utility calculation you can also absorb this into your probability distribution for the box—you only care about scenarios where the box contains less than a dollar, so you focus most of your attention on these, and estimate that the box will contain less than a dollar. The problem now arises if you happen to find a dollar on the street—it is now a good idea to buy a box, although the agents who have started to believe the box contains at most a dollar will not buy it.
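A minimal sketch of the arithmetic in this example (the decision rule and thresholds below are my reading of it, not the commenter’s exact calculation):

```python
# Numbers from the example above: all assets in cash, a $1000 purchase you must
# still be able to make, and a $2 box holding a uniformly distributed $0-$10.
# The right move flips once you find an extra dollar -- but an agent who baked
# the caution into its beliefs ("the box holds under $1") keeps refusing.

COST_OF_MUST_BUY = 1000.0   # cure / ticket / insurance
BOX_PRICE = 2.0

def should_buy_box(assets, believed_box_max=10.0):
    """Buy only if even an empty box leaves the $1000 purchase affordable,
    and the box has positive expected profit under the believed distribution."""
    worst_case_assets = assets - BOX_PRICE       # the box could contain $0
    expected_box_value = believed_box_max / 2    # mean of uniform [0, believed_box_max]
    return worst_case_assets >= COST_OF_MUST_BUY and expected_box_value > BOX_PRICE

print(should_buy_box(1001.0))                        # False: a near-empty box would ruin you
print(should_buy_box(1002.0))                        # True: threshold is safe, EV is +$3
print(should_buy_box(1002.0, believed_box_max=1.0))  # False: the distorted belief blocks a good bet
```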
To summarise: absorbing sharp effects of your utility function into biased estimates can be a decent temporary computational hack, but it is dangerous to call the partial results you work with in the process ‘estimates’, since they in no way represent your beliefs.
P.S.: The example above isn’t all that great, it was the best I could come up with right now. If it is unclear, or unclear how the example is (supposedly) related to the discussion above, I can try to find a better example.
It seems to me that it’s best to use “your beliefs” to refer to the entire underlying distribution. Yes, you should not bias your beliefs—but the point of estimates is to compress the entire underlying distribution into “the useful part,” and what is the useful part will depend primarily on your application’s loss function, not a generalized unbiased loss function.
My point is that this new biased estimate is not your ‘real estimate’ - this is simply not your best guess/posterior distribution given your information.

Sure it is my “real” estimate—because I take real action on its basis.
Let me make a few observations.
First, any “best” estimate narrower than a complete probability distribution implies some loss function which you are minimizing in order to figure out which estimate is “best”. Let’s take the plain-vanilla case of estimating the central point of a distribution which produced some sample of real numbers. The usual estimate for that is the average of the sample numbers (the sample mean) and it is indeed optimal (“the best”) for a particular, quadratic, loss function. But, for example, change the loss function to absolute deviation (L1) and now the median becomes “the best estimate”.
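A quick numerical check of that claim (the skewed sample below is arbitrary; the point is only that the optimal point estimate changes with the loss function):

```python
# Sketch: on the same sample, the mean minimizes average squared error (L2)
# and the median minimizes average absolute error (L1). The sample is skewed
# so the two optima visibly differ.
import numpy as np

rng = np.random.default_rng(2)
sample = rng.lognormal(mean=0.0, sigma=1.0, size=5_000)  # skewed data

grid = np.linspace(sample.min(), sample.max(), 2001)
l2_best = grid[np.argmin([np.mean((sample - c) ** 2) for c in grid])]
l1_best = grid[np.argmin([np.mean(np.abs(sample - c)) for c in grid])]

print(f"sample mean:   {sample.mean():.3f}   L2-optimal point: {l2_best:.3f}")
print(f"sample median: {np.median(sample):.3f}   L1-optimal point: {l1_best:.3f}")
```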
The point is that to prefer any estimate over some other estimate, you must have a loss function already. If you are calling some estimate “best”, this implies a particular loss function.
Second, the usefulness of any estimate is determined by the use you intend for it. “Suitability for a purpose” is an overriding criterion for estimates you produce. Different purposes (“produce an unbiased estimate” and “select a course of action” are different purposes) often require different estimates.
Third, “unbiased” is not an unalloyed blessing. In many situations you face the bias-variance tradeoff and sometimes you do want to have some bias.
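A toy illustration of that tradeoff (all numbers assumed): deliberately shrinking the sample mean toward zero adds bias but reduces variance enough to lower the overall mean squared error.

```python
# Sketch of the bias-variance tradeoff (assumed numbers): shrinking the sample
# mean toward zero introduces bias but cuts variance, and for a small true mean
# and small n the biased estimator has the lower mean squared error.
import numpy as np

rng = np.random.default_rng(3)
true_mean, sigma, n, reps, shrink = 0.5, 1.0, 5, 200_000, 0.7

samples = rng.normal(true_mean, sigma, size=(reps, n))
sample_means = samples.mean(axis=1)          # unbiased estimator
shrunk_means = shrink * sample_means         # biased (shrunk toward 0)

mse_unbiased = np.mean((sample_means - true_mean) ** 2)   # ~ sigma^2 / n = 0.20
mse_biased = np.mean((shrunk_means - true_mean) ** 2)     # ~ 0.12: biased, yet lower MSE

print(f"MSE of unbiased sample mean: {mse_unbiased:.3f}")
print(f"MSE of shrunk (biased) mean: {mse_biased:.3f}")
```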
This is a good point. A helpful discussion of asymmetric loss functions is here.
Only if you have no margin within which you can be considered to be “correctly estimating.”