Probability and Politics

Follow-up to: Politics as Charity

Can we think well about courses of action with low probabilities of high payoffs?

Giving What We Can (GWWC), whose members pledge to donate a portion of their income to most efficiently help the global poor, says that evaluating spending on political advocacy is very hard:

Such changes could have enormous effects, but the cost-effectiveness of supporting them is very difficult to quantify as one needs to determine both the value of the effects and the degree to which your donation increases the probability of the change occurring. Each of these is very difficult to estimate and since the first is potentially very large and the second very small [1], it is very challenging to work out which scale will dominate.

This sequence attempts to actually work out a first approximation of an answer to this question, piece by piece. Last time, I discussed the evidence, especially from randomized experiments, that money spent on campaigning can elicit marginal votes quite cheaply. Today, I’ll present the state of the art in estimating the chance that those votes will directly swing an election outcome.

Disclaimer

Politics is a mind-killer: tribal feelings readily degrade the analytical skill and impartiality of otherwise very sophisticated thinkers, and so discussion of politics (even in a descriptive empirical way, or in meta-level fashion) signals an increased probability of poor analysis. I am not a political partisan and am raising the subject primarily for its illustrative value in thinking about small probabilities of large payoffs.


Two routes from vote to policy: electing and affecting

In thinking about the effects of an additional vote on policy, we can distinguish between two ways to affect public policy: electing politicians disposed to implement certain policies, or affecting [2] the policies of existing and future officeholders who base their decisions on electoral statistics (including that marginal vote and its effects). Models of the probability of a marginal vote swaying an election are most obviously relevant to the electing approach, but the affecting route will also depend on such models, as they are used by politicians.

The surprising virtues of naive Fermi calculation

In my previous post I linked to Eric Schwitzgebel’s discussion of politics as charity, in which he guesstimated that the probability of a U.S. Presidential election being tied was 1/n, where n is the number of voters. So with an estimate of 100 million U.S. voters in presidential elections he gave a 1/100,000,000 probability of a marginal vote swaying the election. This is a suspiciously available number. It seems to be derived from a simple model in which we imagine drawing randomly from all the possible divisions of the electorate between two candidates, with only one division making the marginal vote decisive. But of course we know that voting won’t involve a uniform distribution.

One objection comes from modeling each vote as a flip of a biased coin. If the coin is exactly fair, then the chance of a tie scales as 1/sqrt(n). But if the coin is even slightly removed from exact fairness, then the chance of a tie rapidly falls to negligible levels. This was actually one of the first models in the literature, and it was recapitulated by LessWrongers in comments last time.
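To see how sharp this falloff is, here is a minimal sketch of the coin-flip model in Python (the electorate size and biases are illustrative numbers, not estimates):

```python
# Coin-flip model: each of n voters independently votes for candidate A
# with probability p. We compute the exact chance of a dead-even split.
from scipy.stats import binom

n = 100_000_000  # illustrative electorate size

print(binom.pmf(n // 2, n, 0.5))    # ~8e-5: scales as 1/sqrt(n) for a fair coin
print(binom.pmf(n // 2, n, 0.501))  # ~1e-91: a 0.1% bias makes ties negligible
```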

However, if we instead think of the bias of the coin itself as sampled from a uniform distribution, then we get the same result as Schwitzgebel. In the electoral context, we can think of the coin’s bias as reflecting factors with correlated effects on many voters, e.g. the state of the economy, with good economic results favoring incumbents and their parties.

Of course, it’s clear that electoral outcomes are not uniformly sampled: we see few 90%-10% outcomes in national American elections. Electoral competition and Median Voter Theorem effects, along with the stability of partisan identifications, will tend to keep candidates roughly balanced and limit the quantity of true swing voters. Within that range, unpredictable large “wild card” influences like the economy will shift the result from year to year, forcing us to spread our probability mass fairly evenly over a large region. Depending on our estimates of that range, we would need to multiply Schwitzgebel’s estimate by a fudge factor c to get a probability of a tie of c/n for a random election, with 1 < c < 100 if we bound from above using the idea that elections are very unlikely to be fought within a band narrower than 1% of the electorate.
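Numerically, putting a prior on the coin’s bias p recovers both results. With p uniform on all of [0,1], a standard Beta-integral identity gives a tie probability of exactly 1/(n+1), Schwitzgebel’s answer; squeezing p into a band of width w multiplies this by c = 1/w. The 1%-wide band below is my illustrative assumption:

```python
# Average the tie probability over a uniform prior on the electorate's lean p.
import numpy as np
from scipy.stats import binom

n = 100_000_000  # illustrative electorate size

def tie_given_band(lo: float, hi: float, steps: int = 100_001) -> float:
    """P(exact tie) when the electorate's lean p is uniform on [lo, hi]."""
    ps = np.linspace(lo, hi, steps)
    return binom.pmf(n // 2, n, ps).mean()

print(1 / (n + 1))                   # exact result for p ~ Uniform(0, 1): ~1/n
print(tie_given_band(0.495, 0.505))  # ~100/n: a 1%-wide band gives c = 100
```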

Fermi, meet data

How well does this hold up against empirical data? In two papers from 1998 and 2009, Andrew Gelman and coauthors attempt to estimate the probability that a voter going into past U.S. Presidential elections should have assigned to casting a decisive vote. They use standard models that take inputs like party self-identification, economic growth, and incumbent approval ratings to predict electoral outcomes. These models have proven quite reliable in predicting candidate vote share, and no more accurate methods are known. So we can take their output as a first approximation of the individual voter’s rational estimates [3].

Their first paper considers:
… the 1952-1988 elections. For six of the elections, the probability is fairly independent of state size (slightly higher for the smallest states) and is near 1 in 10 million. For the other three elections (1964, 1972, and 1984, corresponding to the landslide victories of Johnson, Nixon, and Reagan [incumbents with favorable economic conditions]), the probability is much smaller, on the order of 1 in hundreds of millions for all of the states.
The result for 1992 was near 1 in 10 million. In 2008, which had economic and other conditions strongly favoring Obama, they found the following:

… probabilities a week before the 2008 presidential election, using state-by-state election forecasts based on the latest polls. The states where a single vote was most likely to matter are New Mexico, Virginia, New Hampshire, and Colorado, where your vote had an approximate 1 in 10 million chance of determining the national election outcome. On average, a[n actual] voter in America had a 1 in 60 million chance of being decisive in the presidential election.
All told, these place the average value of c a little under the middle of the range given by the Fermi calculation above, and are very far from Pascal’s Mugging territory.
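As a rough sketch of how such estimates work (a simplified popular-vote version, not Gelman et al.’s actual code, which goes state by state through the Electoral College): if a forecasting model says a candidate’s vote share is approximately normal, the chance of a decisive vote is about the forecast density at 50% divided by the number of voters. The means and standard deviation below are hypothetical:

```python
# Simplified forecast-density sketch: if candidate A's forecast vote share
# is ~Normal(mu, sigma), then P(decisive vote) is roughly the forecast
# density at 50%, divided by the number of voters n.
from scipy.stats import norm

def p_decisive(mu: float, sigma: float, n: int) -> float:
    return norm.pdf(0.5, loc=mu, scale=sigma) / n

n = 100_000_000  # illustrative electorate size
print(p_decisive(0.52, 0.03, n))  # ~1 in 9 million: a fairly close race
print(p_decisive(0.58, 0.03, n))  # ~1 in 260 million: a landslide year
```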

Voting vs campaign contributions

What are the implications for a causal decision theorist who wants to dedicate a modest effort to efficient do-gooding? The exact value of voting depends on many other factors, e.g. the value of policies, but we can at least compare ways to deliver votes.

Which has more bang per buck: voting in your jurisdiction, or taking the hour or so to earn money and make campaign contributions? Last time I estimated a cost of $50 to $500 per vote from contributions, with higher costs in more competitive races (diminishing returns). So unless you have a high opportunity cost, you’d do better to vote yourself than to contribute to a campaign in your own jurisdiction. The standard heuristic that everyone should vote seems to have been vindicated.
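As a back-of-envelope check (the hourly figure is a hypothetical opportunity cost, not a recommendation):

```python
# Voting costs you roughly an hour; buying a marginal vote via contributions
# costs $50-$500 (last post's estimate). Contributing beats voting in your
# own jurisdiction only if an hour's donatable earnings buys over one vote.
cost_per_vote_low, cost_per_vote_high = 50, 500  # dollars, from the last post
hourly_earnings = 30                             # hypothetical opportunity cost

print(hourly_earnings / cost_per_vote_low)   # 0.6 votes per hour of earning
print(hourly_earnings / cost_per_vote_high)  # 0.06 votes: voting wins easily
```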

But let’s avoid motivated stopping. The above data indicate frequent differences of 1-2 orders of magnitude across jurisdictions. So someone in an uncompetitive New York district would often do better to donate less than $50 (to a competitive race) than to vote. (On the other hand, if you live in a competitive district [4], replacing your vote with donations might cost a sizable portion of your charitable budget.)

When we take into account differences between election cycles, usually worth another 1-2 orders of magnitude, the value of voting in a “safe” jurisdiction in an election that is not close winds up negligible (at least if your reaction to this fact is not shared by many others). For those spending on political advocacy, this provides a route to increased cost-effectiveness: by switching from an even distribution of spending to a focus on the (forecast) closest third of elections, you can nearly double your expected effectiveness. Even more extreme “wait-in-reserve” strategies could pay off, but are limited by the imperfection of forecasting methods.
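A toy calculation illustrates the targeting gain. Here I assume, purely for illustration, that per-dollar impact across races is spread log-uniformly over two orders of magnitude, and I ignore diminishing returns:

```python
# Compare spending evenly across all races vs. only on the closest third,
# under a toy log-uniform spread of per-dollar impact across races.
import numpy as np

rng = np.random.default_rng(0)
impact = 10 ** rng.uniform(-2, 0, size=30_000)  # relative impact per dollar

even_spread = impact.mean()                                   # all races
closest_third = np.sort(impact)[-(impact.size // 3):].mean()  # closest third

print(closest_third / even_spread)  # ~2.4x under these toy assumptions
```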

Ties, recounts, and lawyers

Does the possibility of recounts disrupt the above analysis? It turns out that it doesn’t. In countries with reasonably clean elections, a candidate with a large enough margin of victory is almost certain to be declared the winner. Say that a “large enough” margin is 5,000 votes, and that a candidate is 99% likely to be declared the winner given that margin. Then Candace the Candidate must go from a 1% probability of victory to a 99% probability of victory as we move from a 5,000-vote shortfall to a 5,000-vote lead. So, on average within that range, each marginal vote must increase her probability of victory by 0.0098%. Since there are 10,000 possible margins within the range, so long as they have roughly similar prospective probabilities, the expected value of the marginal vote will be almost the same as under the single “deciding vote” model.
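In numbers, using the figures from the paragraph above:

```python
# Recounts smear the "deciding vote" over a band of close margins rather
# than a single exact tie, but the expected value per vote barely changes.
band_width = 10_000          # margins from a 5,000-vote deficit to a 5,000 lead
p_win_low, p_win_high = 0.01, 0.99

avg_gain_per_vote = (p_win_high - p_win_low) / band_width
print(avg_gain_per_vote)     # 9.8e-05, i.e. 0.0098% per marginal vote

# Under the exact-tie model, one margin out of the band is decisive,
# giving 1/band_width = 0.01% -- essentially the same value.
print(1 / band_width)
```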

Summary

It is possible to make sensible estimates of the probability of at least some events that have never happened before, like tied presidential elections, and use them in attempting efficient philanthropy.


[1] At least for two-boxers. More on one-boxing decision theorists at a later date.

[2] There are a number of arguments that voters’ role in affecting policies is more important, e.g. in this Less Wrong post by Eliezer. More on this later.

[3] Although for very low values, the possibility that our models are fundamentally mistaken looms progressively larger. See Ord et al.

[4] Including other relevant sorts of competitiveness, e.g. California is typically a safe state in Presidential elections, but there are usually competitive ballot initiatives.