b1shop comments on Probability and Politics

b1shop 24 Nov 2010 19:18 UTC
3 points
I don’t like the coin model because it implicitly assumes sampling with replacement.

Assume there are ten other people in a room. Six like red and four like blue. Four of them will go to the polls, and you’re trying to decide if you should, too. What’s the probability your vote will be the deciding factor?

It’s tempting to use the binomial distribution. p=0.6, n=4. Your vote matters if x=2.

So it’ll be tied without you about 35% of the time.
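
As a quick check of that figure, here is a minimal sketch of the binomial calculation. It assumes each of the four voters independently picks red with probability 0.6 (the room's six-in-ten share); the function name is made up for illustration.

```python
from math import comb

def binomial_tie_probability(n_voters: int, p_red: float) -> float:
    """P(exactly half of n_voters independent voters pick red)."""
    if n_voters % 2:
        return 0.0  # an odd number of voters cannot split evenly
    half = n_voters // 2
    return comb(n_voters, half) * p_red**half * (1 - p_red)**(n_voters - half)

# Four voters, each red with probability 0.6 (assumed from the 6-of-10 split).
print(f"{binomial_tie_probability(4, 0.6):.4f}")  # 0.3456
```

With a fair coin (p=0.5) the same function gives 0.375, so the roughly 35% figure corresponds to p=0.6.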

But this is incorrect. If the first person who votes casts a red ballot, then the probability the next vote is red falls to ⁵⁄₉, and the probability the next vote is blue increases to ⁴⁄₉. The correct model is the Hypergeometric model because it doesn’t assume replacement.

It computes a higher 43%.
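
For comparison, a small sketch of the without-replacement calculation for the same room: six red, four blue, four of them voting. It assumes every group of four is equally likely to turn out; the function name is hypothetical.

```python
from math import comb

def hypergeometric_tie_probability(red: int, blue: int, voters: int) -> float:
    """P(a random group of `voters` people splits evenly between red and blue)."""
    if voters % 2:
        return 0.0  # an odd number of voters cannot split evenly
    half = voters // 2
    # Ways to pick half from each camp, over all ways to pick the voters.
    return comb(red, half) * comb(blue, half) / comb(red + blue, voters)

print(f"{hypergeometric_tie_probability(6, 4, 4):.4f}")  # 0.4286
```

That is 15 × 6 / 210 = 90/210, or about 43%.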

As n increases from 10 to 300000000, I imagine the effect is more dramatic.
- CarlShulman 24 Nov 2010 20:40 UTC
  4 points
  Parent
  Either way, with large electorates, the sampling error will be swamped (by orders of magnitude) by correlated changes across voters. For instance, the swings in voting behavior from economic conditions regularly move results by a number of percentage points.
  - b1shop 24 Nov 2010 20:47 UTC
    0 points
    Parent
    Move relative to what? Last year’s results?
    
    I was imagining getting the probabilities a single voter would vote for candidate X from Gallup.
    - CarlShulman 24 Nov 2010 20:55 UTC
      0 points
      Parent
      I meant that local stochastic factors affecting individual voters are not important in the year-to-year variation in election outcomes, compared to systematic effects like the economy.
      
      If you had an exact fraction of voters who would break for which candidate (which polling isn’t accurate enough to give), you still would face uncertainty about turnout.
      - b1shop 24 Nov 2010 21:26 UTC
        0 points
        Parent
        The standard error of polling is usually pretty small.
- AnnaSalamon 24 Nov 2010 19:35 UTC
  2 points
  Parent
  Cool example. I’m still confused, though; why model our uncertainty about the electoral outcome as stemming from which folks will go to the polls (while assuming for simplicity that each person has fixed preferences), rather than as stemming from our uncertainty as to how a fixed set of voters will vote (while assuming for simplicity that the set of voters is fixed)?
  
  ETA: Sorry, I edited this after it was replied to, without noticing the reply.
  - b1shop 24 Nov 2010 19:44 UTC
    0 points
    Parent
    I assume the randomness comes from sampling error, not from uncertainty about who people will vote for. My parents will always vote for Republicans, but they don’t always participate.
- b1shop 25 Nov 2010 0:02 UTC
  0 points
  Parent
  Let me refocus on my point. I want to estimate the probability my vote will matter.
  
  With population n, participation rate v, and pre-election polling showing r support for the policy, the probability your vote will matter is equal to:
  
  [C(nr, nv/2) × C(n(1−r), nv/2)] / C(n, nv)

  where C(a, b) is the number of ways to choose b people out of a.
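
The closing formula can be sketched as a small function. This is only an illustration under stated assumptions: nv and nr are rounded to whole people, an odd number of voters is treated as having no possible tie, and log-gamma is used so that large n does not overflow (which makes the result approximate). The names `log_comb` and `decisive_vote_probability` are made up for the example.

```python
from math import exp, lgamma

def log_comb(a: int, b: int) -> float:
    """Natural log of C(a, b), the number of ways to choose b out of a."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)

def decisive_vote_probability(n: int, v: float, r: float) -> float:
    """Tie probability among the other voters: population n, turnout v, support r."""
    voters = round(n * v)      # assumption: round to whole people
    supporters = round(n * r)
    opponents = n - supporters
    half = voters // 2
    if voters % 2 or half > min(supporters, opponents):
        return 0.0  # no even split is possible
    log_p = (log_comb(supporters, half) + log_comb(opponents, half)
             - log_comb(n, voters))
    return exp(log_p)

# The ten-person room: n=10, v=0.4, r=0.6.
print(f"{decisive_vote_probability(10, 0.4, 0.6):.4f}")  # 0.4286
```

For the ten-person room this reproduces the 43% figure. For national-scale n the floating-point result should be read as a rough estimate only.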