Knightian Uncertainty and Ambiguity Aversion: Motivation
Recently, I found myself in a conversation with someone advocating the use of Knightian uncertainty. I admitted that I’ve never found the concept compelling. We went back and forth for a little while. His points were crisp and well-supported, my objections were vague. We didn’t have enough time to reach consensus, but it became clear that I needed to research his viewpoint and flesh out my objections before being justified in my rejection.
So I did. This is the first in a short series of posts during which I explore what it means for an agent to reason using Knightian uncertainty.
In this first post, I’ll present a number of arguments claiming that Bayesian reasoning fails to capture certain desirable behavior. I’ll discuss a proposed solution, maximization of minimum expected utility, which is advocated by my friend and others.
In the second post, I’ll discuss some more general arguments against Bayesian reasoning as an idealization of human reasoning. What role should “unknown unknowns” play in a bounded Bayesian reasoner? Is “Knightian uncertainty” a useful concept that is not captured by the Bayesian framework?
In the third post, I’ll discuss the proposed solution: can rational agents display ambiguity aversion? What does it mean to have a rational agent that does not maximize expected utility, maximizing “minimum expected utility” instead?
In the final post, I’ll apply these insights to humans and articulate my objections to ambiguity aversion in general. I’ll conclude that while it is possible for agents to be ambiguity-averse, ambiguity aversion in humans is a bias. The maximization of minimum expected utility may be a useful concept for explaining how humans actually act, but probably isn’t how you should act.
The following is a stylized conversation that I had at the Stanford workshop on Logic, Rationality, and Intelligent Interaction. I’ll anonymize my friend as ‘Sir Percy’, which seems a fitting pseudonym for someone advocating Knightian uncertainty.
“I think that’s repugnant”, Sir Percy said. “I can’t assign a probability to the simulation hypothesis, because I have Knightian uncertainty about it.”
“I’ve never found Knightian uncertainty compelling”, I replied with a shrug. “I don’t see how it helps to claim uncertainty about your credence. I know what it means to feel very uncertain (e.g. place a low probability on many different scenarios), and I even know what it means to expect that I’m wildly incorrect (though I never know the direction of my error). But eventually I have to act, and this involves cashing out my uncertainty into an actual credence and weighing the odds. Even if I’m uncomfortable producing a sufficiently precise credence, even if I feel like I don’t have enough information, even though I’m probably misusing the information that I do have, I have to pick the most accurate credence I can anyway when it comes time to act.”
“Sure”, Sir Percy answered. “If you’re maximizing expected utility, then you should strive to be a perfect Bayesian, and you should always act like you assign a single credence to any given event. But I’m not maximizing expected utility.”
Woah. I blinked. I hadn’t even considered that someone could object to the concept of expected utility maximization. Expected utility maximization seemed fundamental: I understand risk aversion, and I understand caution, but at the end of the day, if I honestly expect more utility in the left branch than the right branch, then I’m taking the left branch. No further questions.
“Uh”, I said, deploying all wits to articulate my grave confusion, “wat?”
“I maximize the minimum expected utility, given my Knightian uncertainty.”
My brain struggled to catch up. Is it even possible for a rational agent to refuse to maximize expected utility? Under the assumption that people are risk-neutral with respect to utils, what does it mean for an agent to rationally refuse an outcome where they expect to get more utils? Doesn’t that merely indicate that they picked the wrong thing to call “utility”?
“Look”, Sir Percy continued. “Consider the following ‘coin toss game’. There was a coin flip, and the coin came up either heads (H) or tails (T). You don’t know whether or not the coin was weighted, and if it was, you don’t know which way it was weighted. In fact, all you know is that your credence of event H is somewhere in the interval [0.4, 0.6].”
“That sounds like a failure of introspection”, I replied. “I agree that you might not be able to generate credences with arbitrary precision, but if you have no reason to believe that your interval is skewed towards one end or the other, then you should just act like your credence of H is in the middle of your interval (or the mean of your distribution), e.g. 50%.”
“Not so fast. Consider the following two bets:”
(1) Pay 50¢ to be paid $1.10 if the coin came up heads
(2) Pay 50¢ to be paid $1.10 if the coin came up tails
“If you’re a Bayesian, then for any assignment of credence to H, you’ll want to take at least one of these bets. For example, if your credence of H is 50% then each bet has an expected payoff of 5¢. But if you pick any arbitrary credence out of your credence interval then at least one of these bets will have positive expected value.
“On the other hand, I’m maximizing the minimum expected utility. Given bet (1), I notice that perhaps the probability of H is only 40%, in which case the expected utility of bet (1) is −6¢, so I reject it. Given bet (2), I notice that perhaps the probability of H is 60%, in which case the expected utility of bet (2) is −6¢, so I reject that too.”
“Uh”, I replied, “you do understand that I’ll be richer than you, right? Why ain’t you rich?”
“Don’t be so sure”, he answered. “I reject each bet individually, but I gladly accept the pair together, and walk away with 10¢. You’re only richer if bets can be retracted, and that’s somewhat unreasonable. Besides, I do better than you in the worst case.”
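Sir Percy's arithmetic can be checked with a minimal sketch. It assumes that utility is linear in cents, that his credence interval can be represented by a grid of candidate credences, and that he accepts an offer only when its worst-case expected value beats the zero he gets by declining. The names `bet_value`, `min_expected_value`, and `mmeu_accepts` are hypothetical helpers introduced for illustration.

```python
def bet_value(p_heads, stake, prize, wins_on_heads):
    """Expected net winnings (in dollars) of one bet at credence p_heads."""
    p_win = p_heads if wins_on_heads else 1.0 - p_heads
    return p_win * prize - stake


def min_expected_value(bets, credences):
    """Worst-case expected value of accepting all `bets` together."""
    return min(sum(bet_value(p, *bet) for bet in bets) for p in credences)


def mmeu_accepts(bets, credences):
    """Assumed acceptance rule: accept iff the worst case beats declining (0)."""
    return min_expected_value(bets, credences) > 0


# Candidate credences for H, covering the interval [0.4, 0.6].
credences = [i / 100 for i in range(40, 61)]

bet_1 = (0.50, 1.10, True)   # pay 50 cents, receive $1.10 on heads
bet_2 = (0.50, 1.10, False)  # pay 50 cents, receive $1.10 on tails

worst_bet_1 = min_expected_value([bet_1], credences)
worst_bet_2 = min_expected_value([bet_2], credences)
worst_both = min_expected_value([bet_1, bet_2], credences)

# A Bayesian with credence 0.5 values each bet on its own at about 5 cents.
bayes_bet_1 = bet_value(0.5, *bet_1)

print(round(worst_bet_1, 2), round(worst_bet_2, 2), round(worst_both, 2))
```

Because expected value is linear in the credence, the worst case always sits at an endpoint of the interval, so the grid is only a convenience.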
Something about this felt fishy to me, and I objected halfheartedly. It’s all well and good to say you don’t maximize utility for one reason or another, but when somebody tells me that they actually maximize “minimum expected utility”, my first inclination is to tell them that they’ve misplaced their “utility” label.
Furthermore, every choice in life can be viewed as a bet about which available action will lead to the best outcome, and on this view, it is quite reasonable to expect that many bets will be “retracted” (e.g., the opportunity will pass).
Still, these complaints are rather weak, and my friend had presented a consistent alternative viewpoint that came from completely outside of my hypothesis space (and which he backed up with a number of references). The least I could do was grant it my honest consideration.
And as it turns out, there are several consistent arguments for maximizing minimum expected utility.
The Ellsberg Paradox
Consider the Ellsberg “Paradox”. There is an urn containing 90 balls. 30 of the balls are red, and the other 60 are either black or yellow. You don’t know how many of the 60 balls are black: it may be zero, it may be 60, it may be anywhere in between.
I am about to draw balls out of the urn and pay you according to their color. You get to choose how I pay out, but you have to pick between two payoff structures:
1a) I pay you $100 if I draw a red ball.
1b) I pay you $100 if I draw a black ball.
How do you choose? (I’ll give you a moment to pick.)
Afterwards, we play again with a second urn (which also has 30 red balls and 60 either-black-or-yellow balls), but this time, you have to choose between the following two payoff structures:
2a) I pay you $100 if I draw a red or yellow ball.
2b) I pay you $100 if I draw a black or yellow ball.
How do you choose? (I’ll give you a moment to pick.)
A perfect Bayesian (with no reason to believe that the 60 balls are more likely to be black than yellow) is indifferent within each pair. However, most people prefer 1a to 1b, but also prefer 2b to 2a.
These preferences seem strange through a Bayesian lens, given that the second pair of bets is just the first pair altered to also pay out on yellow balls. Why do people’s preferences flip when you add a payout on yellow balls to the mix?
One possible answer is that people have ambiguity aversion. People prefer 1a to 1b because 1a guarantees 30:60 odds (while selecting 1b when faced with an urn containing only yellow balls means that you have no chance of being paid at all). People prefer 2b to 2a because 2b guarantees 60:30 odds, while 2a may be as bad as 30:60 odds when facing the urn with no yellow balls.
If you reason in this way (and I, for one, feel the allure) then you are ambiguity averse.
And if you’re ambiguity averse, then you have preferences where a perfect Bayesian reasoner does not, and it looks a little bit like you’re maximizing minimum expected utility.
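The urn reasoning can be made concrete with a small sketch. It assumes a single ball is drawn, and it models the "neutral" Bayesian with a uniform prior over the number of black balls (any prior symmetric between black and yellow gives the same averages). The helper name `win_probability` is hypothetical.

```python
from fractions import Fraction


def win_probability(winning_colors, n_black):
    """Chance that one drawn ball has a winning color, given n_black black balls."""
    counts = {"red": 30, "black": n_black, "yellow": 60 - n_black}
    return Fraction(sum(counts[c] for c in winning_colors), 90)


bets = {
    "1a": ["red"],
    "1b": ["black"],
    "2a": ["red", "yellow"],
    "2b": ["black", "yellow"],
}

# Worst case over every possible urn composition (0 to 60 black balls).
worst_case = {
    name: min(win_probability(colors, n) for n in range(61))
    for name, colors in bets.items()
}

# Average under the assumed uniform prior over the number of black balls.
uniform_prior = {
    name: sum(win_probability(colors, n) for n in range(61)) / 61
    for name, colors in bets.items()
}

for name in bets:
    print(name, "worst case:", worst_case[name], "uniform prior:", uniform_prior[name])
```

Under the uniform prior the two bets in each pair come out equal, while the worst-case column ranks 1a above 1b and 2b above 2a, matching the common preference pattern.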
Three games of tennis
Gärdenfors and Sahlin discuss this problem in their paper Unreliable Probabilities, Risk Taking, and Decision Making:
It seems to us […] that it is possible to find decision situations which are identical in all the respects relevant to the strict Bayesian, but which nevertheless motivate different decisions.
These are the people who coined the decision rule of maximizing minimum expected utility (“the MMEU rule”), and it’s worth understanding the example that motivates their argument.
Consider three tennis games each about to be played: the balanced game, the mysterious game, and the unbalanced game.
The balanced game will be played between two players Loren and Lauren who are very evenly matched. You happen to know that both players are well-rested, that they are in good health, and that they are each at the top of their mental game. Neither you nor anyone else has information that makes one of them seem more likely to win than the other, and your credence on the event “Loren wins” is 50%.
The mysterious game will be played between John and Michael, about whom you know nothing. On priors, it’s likely to be a normal tennis game where the players are matched as evenly as average. One player might be a bit better than the other, but you don’t know which. Your credence on the event “John wins” is 50%.
The unbalanced game will be played between Anabel and Zara. You don’t know who is better at tennis, but you have heard that one of them is far better than the other, and know that everybody considers the game to be a sure thing, with the outcome practically already decided. However, you’re not sure whether Anabel or Zara is the superior player, so your credence on the event “Anabel wins” is 50%.
A perfect Bayesian would be indifferent between a bet with 1:1 odds on Loren, a bet with 1:1 odds on John, and a bet with 1:1 odds on Anabel. Yet people are likely to prefer 1:1 bets on the balanced game. This is not necessarily a bias: people may rationally prefer the bet on the balanced game. This seems to imply that Bayesian expected utility maximization is not an idealization of the human reasoning process.
As these tennis games illustrate, humans treat different types of uncertainty differently. This motivates the distinction between “normal” uncertainty and “Knightian” uncertainty: we treat them differently, specifically by being averse to the latter.
The tennis games show humans displaying preferences where a Bayesian would be indifferent. On the view of Gärdenfors and Sahlin, this means that Bayesian expected utility maximization can’t capture actual human preferences; humans actually want to have preferences where Bayesians cannot. How, then, should we act? If Bayesian expected utility maximization does not capture an idealization of our intended behavior, what decision rule should we be approximating?
Gärdenfors and Sahlin propose acting such that in the worst case you still do pretty well. Specifically, they suggest maximizing the minimum expected utility given our Knightian uncertainty. Their paper, Unreliable Probabilities, Risk Taking, and Decision Making, further motivates this decision rule, the “MMEU rule”.
We have now seen three scenarios (the Ellsberg urn, the tennis games, and Sir Percy’s coin toss) where the Bayesian decision rule of ‘maximize expected utility’ seems insufficient.
In the Ellsberg paradox, most people display an aversion to ambiguity, even though a Bayesian agent (with a neutral prior) is indifferent.
In the three tennis games, people act as if they’re trying to maximize their utility in the least convenient world, and thus they allow different types of uncertainty (whether Anabel is the stronger player vs whether Loren will win the balanced game) to affect their actions in different ways.
Most alarmingly, in the coin toss game, we see Sir Percy rejecting both bets (1) and (2) but accepting their conjunction. Sir Percy knows that his expected utility is lower, but seems to have decided that this is acceptable given his preferences about ambiguity (using reasoning that is not obviously flawed). Sir Percy acts like he has a credence interval, and there is simply no credence that a Bayesian agent can assign to H such that the agent acts as Sir Percy prefers.
All these arguments suggest that there are rational preferences that the strict Bayesian framework cannot capture, and so perhaps expected utility maximization is not always rational.
Reasons for skepticism
Let’s not throw expected utility maximization out the window at the first sign of trouble. While it surely seems like humans have a gut-level aversion to ambiguity, there are a number of factors that explain the phenomenon without sacrificing expected utility maximization.
There are some arguments in favor of using the MMEU rule, but the real arguments are easily obscured by a number of fake arguments. For example, some people might prefer a bet on the balanced tennis game over the unbalanced tennis game for reasons completely unrelated to ambiguity aversion: when considering the arguments in favor of ambiguity aversion, it is important to separate out the preferences that Bayesian reasoning can capture from the preferences it cannot.
Below are four cases where it may look like humans are acting ambiguity averse, but where Bayesian expected utility maximizers can (and do) display the same preferences.
Caution. If you enjoy bets for their own sake, and someone comes up to you offering 1:1 odds on Lauren in the balanced tennis game, then you are encouraged to take the bet.
If, however, a cheerful bookie comes up to you offering 1:1 odds on Zara in the unbalanced game, then the first thing you should do is laugh at them, and the second thing you should do is update your credence that Zara will lose.
Why? Because in the unbalanced game, one of the players is much better than the other, and the bookie might know which. If the bookie, hearing that you have no idea whether Anabel is better or worse than Zara, offers you a bet with 1:1 odds in favor of Zara, then this is pretty good evidence that Zara is the worse player.
In fact, if you’re operating under the assumption that anyone offering you a bet thinks that they are going to make money, then even as a Bayesian expected utility maximizer you should be leery of people offering bets about the mysterious game or the unbalanced game. Actual bets are usually offered to people by other people, and people tend to only offer bets that they expect to win. It’s perfectly natural to assume that the bookie is adversarial, and given this assumption, a strict Bayesian will also refuse bets on the unbalanced game.
Similarly, in the Ellsberg game, if a Bayesian agent believes that the person offering the bet is adversarial and gets to choose how many black balls there are, then the Bayesian will pick bets 1a and 2b.
Humans are naturally inclined to be suspicious of bets. Bayesian reasoners with those same suspicions are averse to many bets in a way that looks a lot like ambiguity aversion. It’s easy to look at a bet on the unbalanced game and feel a lot of suspicion and then, upon hearing that a Bayesian has no preferences in the matter, decide that you don’t want to be a Bayesian. But a Bayesian with your suspicions will also avoid bets on the unbalanced game, and it’s important to separate suspicion from ambiguity aversion.
Risk aversion. Most people would prefer a certainty of $1 billion to a 50% chance of $10 billion. This is not usually due to ambiguity aversion, though: dollars are not utils, and preferences are not generally linear in dollars. You can prefer $1 billion with certainty to a chance of $10 billion on grounds of risk aversion, without ever bringing ambiguity aversion into the picture.
The Ellsberg urn and the tennis games are examples that target ambiguity aversion explicitly, but be careful not to take these examples to heart and run around claiming that you prefer a certainty of $1 billion to a chance of $10 billion because you’re ambiguity averse. Humans are naturally very risk-averse, so we should expect that most cases of apparent ambiguity aversion are actually risk aversion. Remember that a failure to maximize expected dollars does not imply a failure to maximize expected utility.
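As a rough illustration of how an expected utility maximizer can turn down the gamble with more expected dollars, here is a sketch under two loud assumptions that are not from the argument above: utility is the natural logarithm of total wealth, and the agent starts with $100,000. Both are stand-ins for "some concave utility function"; other concave choices may give different numbers.

```python
import math


def expected_dollars(lottery):
    """Expected prize of a lottery given as (probability, prize) pairs."""
    return sum(p * prize for p, prize in lottery)


def expected_utility(lottery, wealth, utility=math.log):
    """Expected utility of final wealth; log utility is an assumption."""
    return sum(p * utility(wealth + prize) for p, prize in lottery)


wealth = 100_000  # assumed starting wealth in dollars

sure_thing = [(1.0, 1_000_000_000)]
gamble = [(0.5, 10_000_000_000), (0.5, 0)]

print("expected dollars:", expected_dollars(sure_thing), expected_dollars(gamble))
print(
    "expected utility:",
    round(expected_utility(sure_thing, wealth), 2),
    round(expected_utility(gamble, wealth), 2),
)
```

The gamble wins on expected dollars while the sure thing wins on expected utility, and no probability in the sketch is ambiguous.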
Loss aversion. When you consider a bet on the balanced game, you might visualize a tight and thrilling match where you won’t know whether you won the bet until the bitter end. When you consider a bet on the unbalanced game, you might visualize a match where you immediately figure out whether you won or lost, and then you have to sit through a whole boring tennis game either bored and waiting to collect your money (if you chose correctly) or with that slow sinking feeling of loss as you realize that you don’t have a chance (if you chose incorrectly).
Because humans are strongly loss averse, sitting through a game where you know you’ve lost is more bad than sitting through a game where you know you’ve won is good. In other words, ambiguity may be treated as disutility. The expected utility of a bet for money in the unbalanced game may be less than a similar bet on the balanced game: the former bet has more expected negative feelings associated with it, and thus less expected utility.
This is a form of ambiguity aversion, but this portion of ambiguity aversion is a known bias that should be dealt with, not a sufficient reason to abandon expected utility maximization.
Possibility compression. The three tennis games actually are different, and the ‘strict Bayesian’ does treat them differently. Three Bayesians sitting in the stands before each of the three tennis games all expect different experiences. The Bayesian at the balanced game expects to see a close match. The Bayesian at the mysterious game expects the game to be fairly average. The Bayesian at the unbalanced game expects to see a wash.
When we think about these games, it doesn’t feel like they all yield the same probability distributions over futures, and that’s because they don’t, even in a Bayesian.
When you’re forced to make a bet only about whether the 1st player will win, you’ve got to project your distribution over all futures (which includes information about how exciting the game will be and so on) onto a much smaller binary space (player 1 either wins or loses). This feels lossy because it is lossy. It should come as no surprise that many highly different distributions over futures project onto the same distribution over the much smaller binary space of whether player 1 wins or loses.
There is some temptation to accept the MMEU rule because, well, the games feel different, and Bayesians treat the bets identically, so maybe we should switch to a decision rule that treats the bets differently. Be wary of this temptation: Bayesians do treat the games differently. You don’t need “Knightian uncertainty” to capture this.
I am not trying to argue that we don’t have ambiguity aversion. Humans do in fact seem averse to ambiguity. However, much of the apparent aversion is probably a combination of suspicion, risk aversion, and loss aversion. The first is available to Bayesian reasoners, and the latter two are known biases. Insofar as your ambiguity aversion is caused by a bias, you should be trying to reduce it, not endorse it.
Ambiguity Aversion
But for all those disclaimers, humans still exhibit ambiguity aversion.
Now, you could say that whatever aversion remains (after controlling for risk aversion, loss aversion, and suspicion) is irrational. We know that humans suffer from confirmation bias, hindsight bias, and many other biases, but we don’t try to throw expected utility maximization out the window to account for those strange preferences.
Perhaps ambiguity aversion is merely a good heuristic. In a world where people only offer you bets when the odds are stacked against you but you don’t know it yet, ambiguity aversion is a fine heuristic. Or perhaps ambiguity aversion is a useful countermeasure against the planning fallacy: if we tend to be overconfident in our predictions, then attempting to maximize utility in the least convenient world may counterbalance our overconfidence. Maybe. (Be leery of evolutionary just-so stories.)
But this doesn’t have to be the case. Even if my own ambiguity aversion is a bias, isn’t it still possible that there could exist an ambiguity-averse rational agent?
An ideal rational agent had better not have confirmation bias or hindsight bias, but it seems like you should be able to build a rational agent that disprefers ambiguity. Ambiguity aversion is about preferences, not epistemics. Even if human ambiguity aversion is a bias, shouldn’t it be possible to design a rational agent with preferences about ambiguity? This seems like a preference that a rational agent should be able to have, at least in principle.
But if a rational agent disprefers ambiguity, then it rejects bets (1) and (2) in the coin toss game, but accepts their agglomeration. And if this is so, then there is no credence it can assign to H that makes its actions consistent, so how could it possibly be a Bayesian?
What gives? Is the Bayesian framework unable to express agents with preferences about ambiguity?
And if so, do we need a different framework that can capture a broader class of “rational” agents, including maximizers of minimum expected utility?
MMEU isn’t stable upon reflection. Suppose that in addition to the mysterious [0.4, 0.6] coin, you had a fair coin, and I tell you that I’ll offer bet 1 (“pay 50¢ to be paid $1.10 if the coin came up heads”) if the fair coin comes up heads and bet 2 if the fair coin comes up tails, but you have to choose whether to accept or reject before flipping the fair coin to decide which bet will be chosen. In this case, the Knightian uncertainty cancels out, and your expected winnings are +5¢ no matter which value in [0.4, 0.6] is taken to be the true probability of the mysterious coin, so you would take this bet on MMEU.
Upon seeing how the fair coin turns out, however, MMEU would tell you to reject whichever of bets 1 and 2 is offered. Thus, if I offer to let you see the result of the fair coin before deciding whether to accept the bet, you will actually prefer not to see the coin, for an expected outcome of +5¢, rather than see the coin, reject the bet, and win nothing with certainty. Alternatively, if given the chance, you would prefer to self-modify so as to not exhibit ambiguity aversion in this scenario.
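The fair-coin example can be checked numerically. This is a minimal sketch assuming utility linear in cents and a grid of candidate credences covering [0.4, 0.6]; the function names are hypothetical.

```python
STAKE, PRIZE = 0.50, 1.10


def bet_1(p_heads):
    """Expected winnings of bet 1 (pays on heads of the mysterious coin)."""
    return p_heads * PRIZE - STAKE


def bet_2(p_heads):
    """Expected winnings of bet 2 (pays on tails of the mysterious coin)."""
    return (1.0 - p_heads) * PRIZE - STAKE


def committed_before_fair_flip(p_heads):
    """Expected winnings when a fair coin later picks which bet is offered."""
    return 0.5 * bet_1(p_heads) + 0.5 * bet_2(p_heads)


credences = [i / 100 for i in range(40, 61)]

# Deciding before the fair coin is flipped: the worst case is still positive.
worst_before = min(committed_before_fair_flip(p) for p in credences)

# Deciding after seeing the fair coin: each remaining bet has a negative worst case.
worst_after_heads = min(bet_1(p) for p in credences)
worst_after_tails = min(bet_2(p) for p in credences)

print(round(worst_before, 2), round(worst_after_heads, 2), round(worst_after_tails, 2))
```

The committed strategy is worth about 5¢ at every credence in the grid, whereas either bet taken after the fair coin is revealed has a worst case of about −6¢, which is the reversal the example relies on.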
In general, any agent using a decision rule that is not generalized Bayesian performs strictly worse than some generalized Bayes decision rule. Note, though, that this does not mean that such an agent is forced to accept at least one of bets 1 and 2, since rejecting whichever of them is offered is a Bayes rule; for example, a Bayesian agent who believes that the bookie knows something that they don’t will behave in this way. It does mean, though, that there are many situations where MMEU cannot work, such as in my example above, since in such scenarios it is not equivalent to any Bayes rule.
Does it? You still know that you will only be able to take one of the two bets; you just don’t know which one. The Knightian uncertainty only cancels out if you know you can take both bets.
This looks more like a problem with updating than with MMEU though. It seems possible to design a variant of UDT that uses MMEU, without it wanting to self-modify into something else (at least not for this reason).
I can’t see how this would work. Wouldn’t the UDT-ish approach be to ask an MMEU agent to pick a strategy once, before making any updates? The MMEU agent would choose a strategy that makes it equivalent to a Bayesian agent, as I describe. The characteristic ambiguity-averse behaviour only appears if the agent is allowed to update.
Given a Cartesian boundary between agent and environment, you could make an agent that prefers to have its future actions be those that are prescribed by MMEU, and you’d then get MMEU-like behaviour persisting upon reflection, but I assume this isn’t what you mean since it isn’t UDT-ish at all.
Suppose you program a UDT-MMEU agent to care about just one particular world defined by some world program. The world program takes a single bit as input, representing the mysterious coin, and the agent represents uncertainty about this bit using a probability interval. You think that in this world the agent will either be offered only bet 1, or only bet 2, or the world will split into two copies with the agent being offered a different bet in each copy (analogous to your example). You have logical uncertainty as to which is the case, but the UDT-MMEU agent can compute and find out for sure which is the case. (I’m assuming this agent isn’t updateless with regard to logical facts but just computes as many of them as it can before making decisions.) Then UDT-MMEU would reject the bet unless it turns out that the world does split in two.
Unless I made a mistake somewhere, it seems like UDT-MMEU does retain “ambiguity-averse behaviour” and isn’t equivalent to any standard UDT agent, except in the sense that if you did know which version of the bet would be offered in this world, you could design a UDT agent that does the same thing as the UDT-MMEU agent.
Does anyone know how much people are typically willing to pay to switch options in the Ellsberg paradox? Among those that would pay to switch, my expectation is around 10%, not the ~50% predicted by max-min. This sort of mild ambiguity aversion is probably better captured by prospect theory.
This is a very general point. Most of the uncertainty people face is of the sort that they would naively classify as Knightian, so if people actually behaved according to MMEU, then they would essentially be playing minimax against the world.
Yeah that could lead to some pretty dumb behavior. For a silly example, “I don’t know the skill of other drivers, so I’ll just never use a road, because never using a road has higher minimum utility than dying in a car crash.”
MMEU fails as a decision theory that we actually want for the same reason that laypeople’s intuitions about AI fail: it’s rare to have a proper understanding of how powerful the phrases “maximum” and “minimum” are. As a quick example, actually following MMEU means that a vacuum metastability event is the best thing that could possibly happen to the universe, because it removes the possibility of humanity being tortured for eternity. Add in the fact that it doesn’t allow you to deal with infinitesimals correctly (e.g. Pascal’s Wager should never fail to convince an MMEU agent), and I’m seriously confused as to the usefulness of this.
How does one decide which uncertainty is Knightian? My hunch is that we tend to label something Knightian uncertainty iff it’s something the bookie might already know; if this is the case, it’s a sign Knightian uncertainty is really about suspicion.
(On a side note, I propose we label the rule of being suspicious of bets the Cider-Ear Principle or perhaps Masterson’s Law)
Knightian uncertainty is the one which you have no idea what it looks like. Speaking a bit more technically, you don’t know anything about the distribution—neither its class, nor the shape, nor the parameters.
Is it Knightian if you have some idea what it looks like, but not exactly? If you know the mean but not the shape, or that it’s a skewed normal shape but not the direction of skew, is that Knightian and therefore not usable for utility calculation?
Generally speaking, it’s Knightian if you have no idea.
Example: what is the probability that an alien civilization has been surreptitiously observing Earth for a while?
If you, say, know that the distribution is skewed normal but don’t know the skew sign, that’s not Knightian at all.
Maybe we need a new term then, because the examples above (weighted coin between .4 and .6, unknown number of black balls) don’t seem to meet your definition of Knightian.
Now I’m really confused. It seems like my knowledge (confidence in my probability assessment) of the shape of a distribution is continuous in the same way as my knowledge (the probability assessment itself) about a discrete future experience. I never know absolutely nothing about it (alien spies: I at least know that I can’t assign 0 to it). I also never know absolutely everything (there are very few actually perfect fair coins).
Are you saying that your belief in probability distributions is binary (or at least quantized to a small number of states)? You know it perfectly or you know nothing about it?
I don’t get it well enough to be certain that I don’t buy it, but that’s where I’m currently leaning. Especially if you bite the bullet that uncertainty is about knowledge rather than about reality (probability is a limitation of a decision agent, not present in the base reality), this just makes no sense.
You are right. Knightian uncertainty isn’t a separate discrete category, it’s an endpoint of a particular interval on the other end of which sits uncertainty that you know everything about, e.g. the probability of drawing a red ball from an urn into which you have just placed 10 red and 10 black balls.
Knight himself called known uncertainty “risk” and unknown uncertainty “uncertainty”. He wrote: “Uncertainty must be taken in a sense radically distinct from the familiar notion of Risk, from which it has never been properly separated.… The essential fact is that ‘risk’ means in some cases a quantity susceptible of measurement, while at other times it is something distinctly not of this character; and there are far-reaching and crucial differences in the bearings of the phenomena depending on which of the two is really present and operating.… It will appear that a measurable uncertainty, or ‘risk’ proper, as we shall use the term, is so far different from an unmeasurable one that it is not in effect an uncertainty at all.”
You don’t mention what seems like the most important problem with a bet on the unbalanced game: you’ll look like a fool if you guess wrong. This could have far-reaching monetary consequences in real finance, and status consequences in general.
This is very important. Feeling like there are status consequences has an effect on decision making in humans regardless of whether there are actual status consequences.
Some putatively Knightian uncertainty and ambiguity aversion can be explained as maximising expected utility when playing against an adversary.
For the Ellsberg paradox, the person offering the first bet can minimise his payout by putting no black balls in the urn. If I expect him to do that (and he can do so completely honestly, since he volunteers no information about the method used to fill the urn) then I should bet on red, for a 1⁄3 chance of winning, and not black, for a zero chance.
The person offering the second bet can minimise his payout by putting no yellow balls in the urn. Then black-or-yellow has a 2⁄3 chance and red-or-yellow a 1⁄3 chance and I should bet on black-or-yellow.
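To make the adversarial reading concrete, here is a minimal sketch. It assumes the standard Ellsberg urn of 30 red balls and 60 black-or-yellow balls, and an urn-filler who picks whichever black/yellow split hurts a given bet most; the names `win_prob`, `worst_case` and `BETS` are made up for illustration.

```python
from fractions import Fraction

RED, OTHER = 30, 60          # assumed urn: 30 red, 60 black-or-yellow
TOTAL = RED + OTHER

def win_prob(bet, black):
    """Chance that the drawn ball is one of the colours in `bet`,
    given `black` black balls among the 60 non-red ones."""
    counts = {"red": RED, "black": black, "yellow": OTHER - black}
    return Fraction(sum(counts[colour] for colour in bet), TOTAL)

def worst_case(bet):
    """Win probability when the urn-filler picks the split that hurts `bet` most."""
    return min(win_prob(bet, black) for black in range(OTHER + 1))

BETS = {
    "red": ("red",),
    "black": ("black",),
    "red-or-yellow": ("red", "yellow"),
    "black-or-yellow": ("black", "yellow"),
}
worst = {name: worst_case(bet) for name, bet in BETS.items()}

for name, p in worst.items():
    print(f"{name}: {p}")
```

Against such an opponent, red beats black in the first bet and black-or-yellow beats red-or-yellow in the second, which is the usual "ambiguity-averse" pattern recovered by plain expected-utility reasoning.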
The lesson here is, don’t take strange bets from strangers. I’d quote again the lines from Guys And Dolls about this, but the Google box isn’t helping me find when it was last in a quotes thread. (Is there some way the side bar could be excluded from Google’s search spiders? It’s highly volatile content and shouldn’t be indexed.)
In the tennis example, someone betting on the mysterious game or the unbalanced game is in the position of someone betting on horse races who knows nothing about horses. He should decline to bet, because while it is possible to beat the bookies, it’s a full-time job to maintain the necessary knowledge of horse-racing.
Sadly, can’t be done with tagging. Yandex and Yahoo! supported such a thing; Googlebot does not and likely won’t. The sidebar could be built on a separate noindexed URL and included as an iframe; I suspect this would be obnoxious to implement.
This is exactly what I was thinking the whole time. Is there any example of supposed “ambiguity aversion” that isn’t explained by this effect?
To speak of Knightian uncertainty we need a way of separating our uncertainty into “Knightian” and “Bayesian” components. There is actually a natural candidate for that. Namely, Bayesian uncertainty comes from a Solomonoff ensemble and Knightian uncertainty comes from our limited ability to compute the Solomonoff expectation value. For example, if the agent is reasoning within a certain formal system, this system will generally allow proving “a ≤ E(U) ≤ b” for some a and b, but the bounds won’t coincide in non-trivial cases because of the halting problem. We might have hope that a theory of logical uncertainty would allow producing a crisp expectation value but I’m not so sure. My feeling is that “logical uncertainty” (i.e. Bayesian reasoning with limited computing resources) only works well for computable sentences / quantities (see e.g. my attempt to define it) whereas for uncomputable quantities we need to use a sequence of computable approximations. In the case of the Solomonoff ensemble the natural computable approximations seem to be imposing cutoffs on the running time of the programs within the ensemble. This, however, means we need to bundle together programs that produce definite predictions for all times with programs that produce definite predictions for a given physical time span and fail to halt later. In particular, we need a way to assign utilities to partial possible histories. It seems to me that the natural way to do it would be combining time discount with worst-case assumptions regarding the undefined future, so that problems like the procrastination paradox are avoided. These worst-case assumptions look quite like the MMEU rule.
This said, I’m not sure this “pure” Knightian uncertainty explains most ambiguity aversion in real life scenarios: the latter might still be a bias.
I mostly agree with this, though I’m not yet sold on assigning utilities to partial histories using worst-case assumptions. What exactly do you mean by combining partial histories with worst-case assumptions? How is ‘worst case’ defined?
For every partial history, there is a Turing machine which turns it into a hellscape in the “undefined future.” This comes with a huge complexity penalty, of course, but these are precisely the sorts of things you need to watch out for when you start maximizing worst case scenarios (which neglect complexity penalties) rather than likely scenarios.
In order to avoid the procrastination paradox, we want our utility function to be upper semicontinuous in the natural topology on histories. Given such a utility function U on the space of infinite histories, there is a natural way to extend it to the space of finite & infinite histories preserving semicontinuity. Namely, we define U(x) = inf U(xy) where x is a finite history and y is an infinite continuation.
However, this prescription is not necessary: we can have an upper semicontinuous function in the space of finite & infinite histories which doesn’t arise in this way. Coming to think about it, it isn’t very attractive since intuitively the universe coming to end is a better outcome than the universe turning into hell.
Possibly relevant previous discussion:
http://lesswrong.com/lw/9e4/the_savage_theorem_and_the_ellsberg_paradox/
http://lesswrong.com/lw/9m3/the_ellsberg_paradox_and_money_pumps/
Thanks for including reasons for skepticism. I find them quite likely: there is no actual argument that the MMEU rule gives better outcomes than classical EV maximization in these thought experiments, just that many humans seem to prefer them in the real world. And in the real world, where this kind of perfect knowledge of uncertainty and idealized payoffs are not possible, all of those biases are pretty reasonable, so it’s easy to understand why they’re baked in for most of us.
The lack of rationality is most clear in your first example. If you reject bet 1 or bet 2 in the case where you can’t take both, you lose utility. This seems pretty clear. That’s not “alternate rationality”, it’s just an error.
Some of the other cases, where there’s no average utility reason to prefer one choice over the other, I have less-strong objections. I don’t think it adds any rationality to have those preferences, but it doesn’t cost anything.
You ask:
and then later on say:
But you don’t seem to have actually answered your own question: how are you defining ‘rationality’ in this post? If Sir Percy knows that his expected utility is lower, then his actions clearly can’t be VNM-rational, but you haven’t offered an alternative definition that would allow us to verify that Sir Percy’s decisions are, indeed, rational.
This really isn’t how I understand credences to work. Firstly, they don’t take ranges, and secondly, they aren’t dictated to me by the background information, they’re calculated from it. This isn’t immediately fatal, because you can say something like:
This is something you could actually tell me, and would have the effect that I think is intended. Under this background information X, my credence P(H | X) is just 0.5, but I have that P(H | X, A=a) = a for any a in [0.4, 0.6].
This is more than just a nitpick. We’ve demoted the range [0.4, 0.6] from being a priori privileged as the credence, to just another unknown value in the background information. When you then say “I’m maximising minimum expected utility”, the obvious objection is then—why have you chosen to minimise only over A, rather than any of the other unknown values in the background information? In particular, why aren’t you minimising over the value C, which represents the side the coin lands on?
But of course, if you minimise over all the unknowns, it’s a lot less interesting as a decision framework, because as far as I can tell it reduces to “never accept any risk of a loss, no matter how small the risk or the loss”.
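A small sketch of the difference between the two minimisations. The bets are illustrative assumptions (a stake paid up front and a payout on heads), and the interval [0.4, 0.6] is sampled on a grid; `mmeu_over_a` and `min_over_everything` are hypothetical helper names.

```python
from fractions import Fraction

# Grid over the unknown bias A in [0.4, 0.6].
A_RANGE = [Fraction(k, 100) for k in range(40, 61)]

def mmeu_over_a(cost, payout):
    """Minimum expected gain of a heads bet, minimising over the bias A only."""
    return min(a * payout - cost for a in A_RANGE)

def min_over_everything(cost, payout):
    """Minimum gain when we also minimise over the coin outcome C itself."""
    return min(payout - cost, -cost)   # heads branch, tails branch

# A very generous (hypothetical) bet: pay 0.01, receive 100 on heads.
generous = (Fraction(1, 100), Fraction(100))

print(mmeu_over_a(*generous))          # positive: MMEU over A accepts
print(min_over_everything(*generous))  # negative: refusing (gain 0) looks better
```

Minimising over A alone still accepts a sufficiently generous bet, whereas minimising over every unknown rejects any bet with a nonzero stake, since declining always guarantees 0.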
Is there a principled reason to use the MMEU strategy in the face of Knightian uncertainty? Why not maximize maximum expected utility, or minimize expected regret (i.e. the difference between the expected utility obtained by your action and the best expected utility you could have achieved if you knew the results of the Knightian uncertainty ahead of time)?
Also, if we have two different types of uncertainty, is there a good reason that there shouldn’t be more than that? Maybe:
1) Here’s a thing that I can confidently assign a probability to (e.g. the outcome of a coin flip).
2) Here’s a thing that I cannot usually assign a precise probability to, but for which it should be possible in principle to make such an assignment (e.g. the number of yellow balls in the bin).
3) Here’s a thing that I would have trouble even in principle assigning a meaningful probability to (e.g. the simulation hypothesis).
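For what it’s worth, the three candidate rules (maximize the minimum, maximize the maximum, minimize the maximum regret) can be compared on the first Ellsberg bet. This is only a sketch: it assumes 30 red balls and 60 black-or-yellow balls, treats each possible black count as one "Knightian" state, and measures utility as the probability of winning the prize.

```python
from fractions import Fraction

STATES = range(61)            # possible numbers of black balls (assumed urn)
ACTIONS = ("red", "black")    # which colour to bet on

def eu(action, black):
    """Expected utility (win probability) of a bet, given the black count."""
    return Fraction(30, 90) if action == "red" else Fraction(black, 90)

def maximin():
    return max(ACTIONS, key=lambda a: min(eu(a, s) for s in STATES))

def maximax():
    return max(ACTIONS, key=lambda a: max(eu(a, s) for s in STATES))

def max_regret(action):
    """Largest shortfall versus the best action chosen with the state known."""
    return max(max(eu(b, s) for b in ACTIONS) - eu(action, s) for s in STATES)

print(maximin(), maximax(), {a: max_regret(a) for a in ACTIONS})
```

In this particular set-up maximin picks red, maximax picks black, and minimax regret is indifferent (both bets have a maximum regret of 1/3), so the three rules really do come apart.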
This may be slightly off-topic, but I don’t think it makes sense to use dollars as a proxy for the measure of utility, because dollars have diminishing returns.
For example, if you offered me a choice between winning a billion dollars with 100% certainty, and winning 5 billion dollars with 20% certainty, I’ll pick the certain choice every time. Once I have a billion dollars, four more won’t make much more of a difference, so my real choice is between two expected values: X * 1 vs. (X + epsilon) * 0.2. The first choice looks a lot more attractive, obviously.
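As a rough illustration of the point about diminishing returns, here is a sketch that assumes logarithmic utility of total wealth and a hypothetical baseline wealth of $100,000. Both are assumptions chosen for the example, not claims about anyone’s actual utility function.

```python
import math

BASELINE = 100_000   # hypothetical current wealth in dollars (an assumption)

def utility(wealth):
    """Log utility: one common way to model diminishing returns."""
    return math.log(wealth)

def expected_utility(prize, prob):
    """Expected utility of winning `prize` with probability `prob`."""
    return prob * utility(BASELINE + prize) + (1 - prob) * utility(BASELINE)

def expected_dollars(prize, prob):
    return prize * prob

certain = expected_utility(1_000_000_000, 1.0)
gamble = expected_utility(5_000_000_000, 0.2)

print(expected_dollars(1_000_000_000, 1.0), expected_dollars(5_000_000_000, 0.2))
print(certain, gamble)
```

The two options have the same expected dollar value, but under this utility function the certain billion comes out well ahead.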
Note that ambiguity aversion becomes risk aversion with repetition. Suppose the tennis players are going to play a set of five matches and you must place all five bets before seeing any match. The expected values are still the same, but the payoff pdf curves are quite different.
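A sketch of how the payoff distributions differ over five matches. It assumes you back the same player in all five, that the balanced game is a fair coin flip per match, and that in the unbalanced game the stronger player (equally likely to be either one) wins each match with a hypothetical probability of 0.9.

```python
from math import comb

N = 5          # matches bet on in advance
STRONG = 0.9   # assumed per-match win chance of the stronger player

def binomial_pmf(p):
    """Distribution of the number of winning bets out of N at win chance p."""
    return [comb(N, k) * p**k * (1 - p)**(N - k) for k in range(N + 1)]

def mean_var(pmf):
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, var

balanced = binomial_pmf(0.5)
# 50/50 mixture: either you backed the strong player or the weak one.
unbalanced = [0.5 * hi + 0.5 * lo
              for hi, lo in zip(binomial_pmf(STRONG), binomial_pmf(1 - STRONG))]

print(mean_var(balanced))
print(mean_var(unbalanced))
```

Both cases average 2.5 winning bets, but the unbalanced game has a much larger variance because the outcomes pile up near "win almost everything" and "lose almost everything".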
Do other people display ambiguity aversion in all cases, or only when there are personal resources at stake?
Example. You’ve just found a discarded ticket to play two draws of the charity lottery! Here’s how it works.
There are 90 balls: 30 red and 60 either black or yellow in some distribution. You may choose either:
1a) I pay Givewell’s top charity $100 if you draw a red ball.
1b) I pay Givewell’s top charity $100 if you draw a black ball.
And then on the subsequent draw we go to an entirely different urn which may have an entirely different distribution of yellow and black balls (although still 30 red, 60 either black or yellow), and then either:
2a) I pay Givewell’s top charity $100 if you draw a red or yellow ball.
2b) I pay Givewell’s top charity $100 if you draw a black or yellow ball.
For some reason, the buttons are set to 1b and 2a. You can at no monetary cost switch options to minimize ambiguity by pressing the buttons to toggle to the other option. It’s perhaps a barely noticeable expenditure of calories, so the cost seems trivial: not even a penny.
1: Do you switch to the less ambiguous option at a trivial cost?
2: Instead of pressing the buttons, would you pay 1 penny to the person running the charity lottery, to do so in either case?
3: Would you be willing to pay 1 penny to switch in either case if the person running the charity lottery also gave you two pennies beforehand? You get to keep them if you don’t use them.
When considering the previous situation, I don’t feel particularly ambiguity averse at all, and I don’t really feel the need to make any changes to the settings. But maybe other people do, so I thought I should check. And maybe it is weird of me to not feel ambiguity aversion about this, and I should check that as well.
Edit: Formatting and Grammar.
1°
If we are going to build an artificial mind that reasons with Bayesian probability, we should be able to ask it the probability of any sentence, independently from the fact that it must act on that sentence or not. Think, for example, of an oracular AI.
For this reason, I think that denying the concept of Knightian uncertainty on the basis of a decision theoretic criterion is misguided: we, as an ideal, should be able to assign to any sentence some kind of number illustrating our degree of belief. As an ideal meaning that, building a concrete finite robot, we might improve its efficiency by cutting unnecessary calculation. What we are doing here though is talking about Knightian uncertainty in principle.
I think the problem has been already solved quite nicely by the notion of Ap distribution (Jaynes, chapter 18). Knightian uncertainty about A is simply the uncertainty we get when we have a smooth Ap distribution.
2°
In the bets proposed about the coin toss, you not only have 2 bets, you have a third bet surreptitiously used by Sir Percy:
1 – pay 0.5 and receive 1.1 on head
2 – pay 0.5 and receive 1.1 on tail
3 – one AND two, that is: pay 1 and receive 1.1
Now, if on the event H we have a uniform Ap distribution between .4 and .6, it is possible to show that the probability of H is .5.
Thus the expected return:
1 and 2 – (0.5*1.1 + 0.5*0) − 0.5 = 0.05
3 – 1.1*1 − 1 = 0.1
It is clear that even from a simple expected utility maximization perspective, taking both bets is better. Knightian uncertainty is not involved at all.
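The arithmetic can be checked with a few lines. This sketch takes the uniform Ap density on [0.4, 0.6] at face value, so the probability of heads is its mean, and uses the stakes and prizes from the three bets listed above; the variable names are made up for illustration.

```python
from fractions import Fraction

LOW, HIGH = Fraction(2, 5), Fraction(3, 5)   # support of the uniform Ap density
p_heads = (LOW + HIGH) / 2                   # mean of a uniform density

STAKE, PRIZE = Fraction(1, 2), Fraction(11, 10)

# Bet 1 (or, symmetrically, bet 2): pay the stake, win the prize with p_heads.
single_bet = p_heads * PRIZE - STAKE
# Bet 3: take both sides, pay two stakes, collect one prize for sure.
both_bets = PRIZE - 2 * STAKE

print(p_heads, single_bet, both_bets)
```

The combined bet has twice the expected return of either single bet, and its return does not depend on the bias at all.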
3°
The three games of tennis are very clearly distinct from a Bayesian point of view, using Ap distributions:
the balanced game: a sharp Ap distribution centered at 0.5;
the mysterious game: a uniform Ap distribution;
the unbalanced game: a Jeffreys Ap distribution.
If your bets only involve A, then surely all these Ap have first moment 0.5, so a Bayesian reasoner has no preference. But if the bets involve the Ap’s, then surely a Bayesian has very good reason to distinguish between them. Indeed, people tend to bet on what they have much more information on, that is where the Ap distribution is sharper, because it’s stabler under further evidence.
A very rational behavior indeed.
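One way to see the stability claim is to model each Ap distribution as a Beta distribution and watch how far its mean moves after a single observed win. This is a sketch under assumptions: the Beta(50, 50) standing in for the "sharp" case is an arbitrary choice, while Beta(1, 1) is the uniform density and Beta(1/2, 1/2) is the Jeffreys prior for a binary outcome.

```python
from fractions import Fraction

# (alpha, beta) parameters of a Beta density standing in for each Ap distribution.
AP = {
    "balanced": (Fraction(50), Fraction(50)),        # sharp; parameters hypothetical
    "mysterious": (Fraction(1), Fraction(1)),        # uniform
    "unbalanced": (Fraction(1, 2), Fraction(1, 2)),  # Jeffreys
}

def mean(alpha, beta):
    return alpha / (alpha + beta)

def mean_after_win(alpha, beta):
    """Posterior mean after observing one win (standard Beta-Bernoulli update)."""
    return mean(alpha + 1, beta)

prior_means = {name: mean(a, b) for name, (a, b) in AP.items()}
shift = {name: mean_after_win(a, b) - mean(a, b) for name, (a, b) in AP.items()}

for name in AP:
    print(name, prior_means[name], shift[name])
```

All three start at 0.5, but one observation barely moves the sharp distribution and moves the Jeffreys one by a quarter, which is the sense in which the sharper Ap is more stable under further evidence.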
In most cases, suspicion seems like the best defense of acting ambiguity-averse. However, suspicion does not support ambiguity-aversion in all cases. For example, as in the set-up to the Ellsberg Paradox, suppose you have an urn with 30 red balls and 60 black or yellow balls, with the balance between black and yellow balls unknown to you. You can flip a fair coin, and then draw a ball from the urn. You win $100 if either the coin lands heads and then you draw a black ball, or if the coin lands tails and then you draw a yellow ball. Otherwise you win nothing.
You flip the coin and it lands tails. You are about to reach into the urn when a casino employee, after glancing at the coin, interrupts you and says “Wait! Of course you still have the right to play the game by the rules agreed to previously if you want, but I just remembered that I heard you are ambiguity averse, so out of the kindness of my heart, I’m willing to let you change the rules so that you win $100 if you draw a red ball, giving you a known 1⁄3 chance of winning, instead of if you draw a yellow ball, which gives you an unknown chance of winning in [0, 2⁄3].” To an agent following MMEU, this looks like a pretty good deal. However, you might be suspicious that the employee made that deal because ze knows that there are more yellow balls than black balls, and is trying to trick you into decreasing your chances of winning.
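The structure of this game can be sketched numerically, using the 30 red / 60 black-or-yellow urn described above. The function names are hypothetical; the point is only to compare the chance of winning before the coin flip with the worst case after it lands tails.

```python
from fractions import Fraction

BLACK_COUNTS = range(61)   # every possible number of black balls

def ex_ante_win(black):
    """Win chance before the flip: heads-and-black or tails-and-yellow."""
    yellow = 60 - black
    return Fraction(1, 2) * Fraction(black, 90) + Fraction(1, 2) * Fraction(yellow, 90)

def after_tails_win(black):
    """Win chance once the coin has landed tails: you now need a yellow ball."""
    return Fraction(60 - black, 90)

ex_ante = {ex_ante_win(b) for b in BLACK_COUNTS}
worst_after_tails = min(after_tails_win(b) for b in BLACK_COUNTS)
red_offer = Fraction(30, 90)

print(ex_ante, worst_after_tails, red_offer)
```

Before the flip the game is unambiguous (exactly 1/3 whatever the urn contains), but after the flip the worst case for yellow is 0, so an MMEU agent prefers the red offer even though nothing it knows about the urn has changed.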
My first inclination when somebody says they don’t maximize utility is that they’ve misplaced their “utility” label… can you give an example of a (reasonable?) agent which really couldn’t be (reasonably?) reframed to some sort of utility maximizer?
MMEU makes some sense in a world with death. When there’s a lower bound where negative utility doesn’t mean you’re just having a bad time but that you’re dead and can never recover from that negative utility then it makes sense to raise the minimum expected utility at least above the threshold of death, and preferably as far above death as possible.
If you take a MMEU approach to Utilitarianism (not MMEU over a single VNM utility function, but maximizing the minimum expected VNM utility function of every individual) it answers the torture vs specks question with specks, will only accept Pascal’s Muggings that threaten negative utility, won’t reduce most people’s utility to achieve the repugnant conclusion or to feed utility monsters, won’t take the garden path in the lifespan dilemma (this also applies to individual VNM utility functions), etc. In short it sounds like most people’s intuitive reaction to those dilemmas.
Well of course. Finite ideal rational agents don’t exist. If you were designing decision-theory-optimal AI, that optimality is a property of its environment, not any ideal abstract computing space. I can think of at least one reason why ambiguity aversion could be the optimal algorithm in environments with limited computing resources:
Consider a self-modification algorithm that adapts to new problem domains. Restructuring (learning) is considered the hardest of tasks, and so the AI modifies scarcely. Thus, as it encounters new decision-theoretic problems, it often does not choose self-modification, instead kludging together old circuitry and/or answers to conserve compute cycles. And so when choosing answers to your 3 problems, it would fear solutions which, when repeating the answer multiple times, maximize expected value in its environment, which includes its own source code.
Ambiguity aversion then would be commitment-risk aversion, where future compounded failures change the value of dollars per utility. Upon each iteration of the problem, the value of a dollar can change, and if you don’t maximize minimum expected value, you may end up with betting all of your $100, which is worth infinite value to you, versus gaining $100, which is worth far less, even if you started with $1000.
We see this in ourselves all the time. If you make a decision, expect to be more likely to make the decision in the future, and if you change your lifestyle, expect it to be hard to change back, even if you later know that changing back is the deletion of a bias.
Rational agents have source code whose optimality is a function of their environments. There is no finite cross-domain Bayesian in compute-space; only in the design-space that includes environments.
Sure there is. To someone with ambiguity aversion, “heads in situation A” and “heads in situation B” are different things. They wouldn’t be indistinguishable H’s and assigning a credence to one doesn’t imply assigning the same credence to the other just because they both have heads in them.
(I’m not sure that “credence” is really the right word for what you are describing, either.)
Given identical money payoffs between two options (even when adjusting for non-linear utility of money), choosing the non-ambiguous option has the added advantage of giving a limited-rationality agent fewer possible futures to spend computing resources on while the process of generating utility runs.
Consider two options: a) You wait one year and get 1 million dollars. b) You wait one year and get 3 million dollars with 0.5 probability (decided after this year).
If you take option b), depending on the size of your “utils”, all planning for after the year must essentially be done twice, once for the case with 3 million dollars available and once for the case without.
In the Ellsberg paradox, mightn’t people just be preferring not to do all the math involved in figuring out whether the uncertain option is worth it, and therefore just be sticking to the “safe bet” of 2a? Although I suppose that that would be a sort of ambiguity aversion itself.
Typing this before reading because I want to “predict ahead of time”: have you considered the arguments for shifting from classical Bayesianism to intuitionistic/constructive Bayesianism for reasons such as these? The long and short of it is that you can have probabilities which don’t add normally (may add up to more or less than one) because you’re also uncertain as to the space of possible events. There are Dutch Book arguments showing one should use such a probability model if one’s bets “resolve” to a definite payout at some indeterminate time after they are made, which may be never.
Yeah, this would sound like the kind of situation where you use nonclassical Bayesianisms: when you’re not actually sure about to what set of mutually-exclusive propositions you’re assigning measure 1. When you have uncertainty over what can happen as well as what will happen, minimum expected utility and ambiguity aversion make sense (pick the worst possible event you’re very confident can actually happen, and maximize utility for it, assuming that less-possible events will be better than that).
Typo:
You understate, So8res, for our good Sirs at the College of Psychiatry have decided to honor it with the classification of a disorder
(Just a silly thought.) If I really wanted the money, and I really suspected the bet-maker to want the money, then I would bet on the black ball. Because I would be sceptical about there being a SINGLE red ball in that wretched urn.