That would be missing the point. The vNM theorem says that if you have preferences over “lotteries” (probability distributions over outcomes; like, 20% chance of winning $5 and 80% chance of winning $10) that satisfy the axioms, then your decision-making can be represented as maximizing expected utility for some utility function over outcomes. The concept of “risk aversion” is about how you react to uncertainty (how you decide between lotteries) and is embodied in the utility function; it doesn’t apply to outcomes known with certainty. (How risk-averse are you about winning $5?)
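(A small sketch of that point, not from the thread: with a concave utility function, expected-utility maximization already produces risk-averse choices between lotteries. The square-root utility and the dollar amounts are just illustrative choices.)

```python
import math

def expected_utility(lottery, u):
    """Expected utility of a lottery given as [(probability, outcome), ...]."""
    return sum(p * u(x) for p, x in lottery)

u = math.sqrt  # a concave utility function => risk aversion over lotteries

lottery = [(0.5, 5.0), (0.5, 10.0)]   # 50% $5, 50% $10; expected value $7.50
certain = [(1.0, 7.5)]                # $7.50 for sure

eu_lottery = expected_utility(lottery, u)   # 0.5*sqrt(5) + 0.5*sqrt(10) ≈ 2.699
eu_certain = expected_utility(certain, u)   # sqrt(7.5) ≈ 2.739

# The sure thing is preferred despite the equal expected dollar value:
assert eu_certain > eu_lottery
```

So "risk aversion" shows up as curvature of the utility function, with no extra machinery needed on top of the probabilities.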
In my hypothetical the two 50% probabilities are different. I want to express the difference between them. There are no sequences involved.
Obviously you’re allowed to have different beliefs about Coin 1 and Coin 2, which could be expressed in many ways. But your different beliefs about the coins don’t need to show up in your probability for a single coinflip. The reason for mentioning sequences of flips is that only then do your beliefs about Coin 1 vs. Coin 2 start making different predictions.
Would it? My interest is in constructing a framework which provides useful, insightful, and reasonably accurate models of actual human decision-making. The vNM theorem is quite useless in this respect—I don’t know what my (or other people’s) utility function is, I cannot calculate or even estimate it, a great deal of important choices can be expressed as a set of lotteries only in very awkward ways, etc. And all this is quite apart from the fact that empirical human preferences tend not to be coherent and change over time.
Risk aversion is an easily observable fact. Every day in financial markets people pay very large amounts of money in order to reduce their risk (for the same expected return). If you think they are all wrong, by all means, go and become rich off these misguided fools.
But your different beliefs about the coins don’t need to show up in your probability for a single coinflip.
Why not? As I said, I want a richer way to talk about probabilities, more complex than taking them as simple scalars. Do you think it’s a bad idea? Does St. Bayes frown upon it?
As I said, I want a richer way to talk about probabilities, more complex than taking them as simple scalars. Do you think it’s a bad idea?
That’s right, I think it’s a bad idea: it sounds like what you actually want is a richer way to talk about your beliefs about Coin 2, but you can do that using standard probability theory, without needing to invent a new field of math from scratch.
Suppose you think Coin 2 is biased and lands heads some unknown fraction _r_ of the time. Your uncertainty about the parameter _r_ will be represented by a probability distribution: say it’s normally distributed with a mean of 0.5 and a standard deviation of 0.1. The point is, the probability of _r_ having a particular value is a different question from the probability of getting heads on your first toss of Coin 2, which is still 0.5. You’d have to ask a different question than “What is the probability of heads on the first flip?” if you want the answer to distinguish the two coins. For example, the probability of getting exactly _k_ heads in _n_ flips is C(_n_, _k_)(0.5)^_k_(0.5)^(_n_−_k_) for Coin 1, but (I think?) ∫₀¹ (1/√(0.02π))_e_^−((_p_−0.5)^2/0.02) C(_n_, _k_)(_p_)^_k_(1−_p_)^(_n_−_k_) _dp_ for Coin 2.
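(A rough numerical check of those two formulas, not from the thread — the choice of n = 20 and the trapezoid integration are mine:)

```python
import math

def binom_pmf(n, k, p):
    """P(exactly k heads in n flips) for a coin with heads-rate p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(p, mu=0.5, sigma=0.1):
    return math.exp(-((p - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def coin2_pmf(n, k, steps=4000):
    # Trapezoid rule on [0, 1]; the N(0.5, 0.1) prior has negligible mass outside it.
    h = 1.0 / steps
    total = 0.0
    for i in range(steps + 1):
        p = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * normal_pdf(p) * binom_pmf(n, k, p)
    return total * h

n = 20
pmf1 = [binom_pmf(n, k, 0.5) for k in range(n + 1)]
pmf2 = [coin2_pmf(n, k) for k in range(n + 1)]

# Both coins predict 10 heads on average...
mean1 = sum(k * q for k, q in enumerate(pmf1))
mean2 = sum(k * q for k, q in enumerate(pmf2))
# ...but the number of heads is more spread out for Coin 2:
var1 = sum((k - 10) ** 2 * q for k, q in enumerate(pmf1))
var2 = sum((k - 10) ** 2 * q for k, q in enumerate(pmf2))
```

The single-flip probabilities agree at 0.5, but over 20 flips the variance of the heads count differs (5 vs. roughly 8.8), which is exactly the information a single scalar probability can't carry.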
Suppose you think Coin 2 is biased and lands heads some unknown fraction r of the time. Your uncertainty about the parameter r will be represented by a probability distribution: say it’s normally distributed with a mean of 0.5 and a standard deviation of 0.1. The point is, the probability of r having a particular value is a different question from the probability of getting heads on your first toss of Coin 2, which is still 0.5.
A standard approach is to use the beta distribution to represent your uncertainty over the value of r.
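(A sketch of what the beta approach buys you — the particular Beta(500, 500) and Beta(5, 5) parameters below are made up for illustration:)

```python
def predictive_heads(a, b):
    # P(heads on the next flip) under a Beta(a, b) belief about r is its mean.
    return a / (a + b)

def update(a, b, heads, tails):
    # Conjugate update: observing flips just increments the pseudo-counts.
    return a + heads, b + tails

coin1 = (500, 500)   # near-certain that r = 1/2 (sharply peaked)
coin2 = (5, 5)       # centred on 1/2, but much more spread out

# A single flip can't distinguish them: both predict heads with probability 0.5...
assert predictive_heads(*coin1) == predictive_heads(*coin2) == 0.5

# ...but the same evidence moves the two beliefs by very different amounts:
coin1 = update(*coin1, 8, 2)   # Beta(508, 502): still about 0.503
coin2 = update(*coin2, 8, 2)   # Beta(13, 7): now 0.65
```

Unlike the normal distribution, the beta is supported exactly on [0, 1], so no probability mass is wasted on impossible values of r.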
but you can do that using standard probability theory
Of course I can. I can represent my beliefs about the probability as a distribution, a meta- (or a hyper-) distribution. But I’m being told that this is “meta-uncertainty” which right-thinking Bayesians are not supposed to have.
No one is talking about inventing new fields of math
say it’s normally distributed
Clearly not, since the normal distribution goes from negative infinity to positive infinity and the probability goes merely from 0 to 1.
the probability of r having a particular value is a different question from the probability of getting heads on your first toss of Coin 2, which is still 0.5
That 0.5 is conditional on the distribution of r, isn’t it? That makes it not a different question at all.
Notably, if I’m risk-averse, the risk of betting on Coin 1 looks different to me from the risk of betting on Coin 2.
(Attempted humorous allusion to how Cox’s theorem derives probability theory from simple axioms about how reasoning under uncertainty should work, less relevant if no one is talking about inventing new fields of math.)
It seems like you’ve come to an agreement, so let me ruin things by adding my own interpretation.
The coin has some propensity to come up heads. Say it will in the long run come up heads r of the time. The number r is like a probability in that it satisfies the mathematical rules of probability (in particular, the rate at which the coin comes up heads plus the rate at which it comes up tails must sum to one). But it’s a physical property of the coin, not anything to do with our opinion of it. The number r is just some particular number based on the shape of the coin (and the way it’s being tossed); it doesn’t change with our knowledge of the coin. So r isn’t a “probability” in the Bayesian sense—a description of our knowledge—it’s just something out there in the world.
Now if we have some Bayesian agent who doesn’t know r, then it must have some probability distribution over it. It could also be uncertain about the weight, w, and have a probability distribution over w. The distribution over r isn’t “meta-uncertainty” because it’s a distribution over a real physical thing in the world, not over our own internal probability assignments. The probability distribution over r is conceptually the same as the one over w.
Now suppose someone is about to flip the coin again. If we knew for certain what the value of r was, we would then assign that same value as the probability of the coin coming up heads. If we don’t know for certain what r is, then we must average over all values of r according to our distribution. The probability of the coin landing heads is its expected value, E(r).
Now E(r) actually is a Bayesian probability—it is our degree of belief that the coin will come up heads. This transformation from r being a physical property to E(r) being a probability is produced by the particular question that we are asking. If we had instead asked about the probability of the coin denting the floor, then this would depend on the weight and would be expressed as E(f(w)) for some function f representing how probable it was that the floor got dented at each weight. We don’t need a similar f in the case of r because we were free to choose the units of r so that this was unnecessary. If we had instead let r be the long-run average number of heads per 1000 flips, then we would have had to calculate the probability as E(f(r)) using f(r) = r/1000.
But the distribution over r does give you the extra information you wanted to describe. Coin 1 would have an r distribution tightly clustered around 1⁄2, whereas our distribution for Coin 2 would be more spread out. But we would have E(r) = 1⁄2 in both cases. Then, when we see more flips of the coins, our distributions change (although our distribution for Coin 1 probably doesn’t change very much; we are already quite certain) and we might no longer have that E(r_1) = E(r_2).
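(The picture above can be sketched numerically with a grid over candidate values of r. The prior widths and the 8-heads-in-10-flips data below are invented for illustration:)

```python
import math

grid = [i / 100 for i in range(1, 100)]   # candidate values for r

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def mean(dist):
    # E(r): this expectation is the Bayesian probability of heads on the next flip.
    return sum(q * r for q, r in zip(dist, grid))

# Coin 1: r tightly clustered around 1/2; Coin 2: much more spread out.
prior1 = normalize([math.exp(-((r - 0.5) ** 2) / (2 * 0.02**2)) for r in grid])
prior2 = normalize([math.exp(-((r - 0.5) ** 2) / (2 * 0.15**2)) for r in grid])
# Both priors give P(heads) = E(r) = 1/2 on a single flip, despite different beliefs.

def posterior(prior, heads, tails):
    # Bayes: reweight each candidate r by the likelihood of the observed flips.
    return normalize([q * r**heads * (1 - r)**tails for q, r in zip(prior, grid)])

# After the same 8 heads in 10 flips, the two values of E(r) come apart:
post1 = posterior(prior1, 8, 2)   # barely moves off 1/2
post2 = posterior(prior2, 8, 2)   # shifts well above 1/2
```

The spread of the r-distribution is exactly what controls how far the same evidence moves E(r).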
But it’s a physical property of the coin; not anything to do with our opinion of it.
Well, coin + environment, but sure, you’re making the point that r is not a random variable in the underlying reality. That’s fine; if we climb the turtles all the way down we’d find a philosophical debate about whether the universe is deterministic, and that’s not quite what we are interested in right now.
The distribution over r isn’t “meta-uncertainty” because it’s a distribution over a real physical thing in the world
I don’t think describing r as a “real physical thing” is useful in this context.
For example, we treat the outcome of each coin flip as stochastic, but you can easily make an argument that it is not, being a “real physical thing” instead, driven by deterministic physics.
For another example, it’s easy to add more meta-levels. Consider Alice forming a probability distribution of what Bob believes the probability distribution of r is...
This transformation from r being a physical property to E(r) being a probability is produced by the particular question that we are asking.
Isn’t r itself “produced by the particular question that we are asking”?
But the distribution over r does give you the extra information you wanted to describe.
I’m mostly interested in prescriptive rationality, and vNM is the right starting point for that (with game theory being the right next step, and more beyond, leading to MIRI’s research among other things). If you want a good descriptive alternative to vNM, check out prospect theory.
See “The Allais Paradox” for how this was covered in the vaunted Sequences.
Does St. Bayes frown upon it?

St. Cox probably does.
Notably, if I’m risk-averse, the risk of betting on Coin 1 looks different to me from the risk of betting on Coin 2.

Can you elaborate? It’s not clear to me.
Every day in financial markets people pay very large amounts of money in order to reduce their risk (for the same expected return).

Hm. Maybe those people are wrong??
Clearly not, since the normal distribution goes from negative infinity to positive infinity and the probability goes merely from 0 to 1.

That’s right; I should have either said “approximately”, or chosen a different distribution.
That 0.5 is conditional on the distribution of r, isn’t it?

Yes, it is averaging over your distribution for _r_. Does it help if you think of probability as relative to subjective states of knowledge?
Nope.
Yes, it is averaging over your distribution for r.

That’s what I thought, too, and that disagreement led to this subthread.
But if we both say that we can easily talk about distributions of probabilities, we’re probably in agreement :-)
Isn’t r itself “produced by the particular question that we are asking”?

Yes.