Only if the propositions “your vote is decisive for California” and “the outcome in California is decisive for the election” are independent. They aren’t.
Consider a possible world in which a Californian’s vote is decisive. That’s one in which California splits almost exactly 50:50, which means that either something very unusual has happened in California or else something very unusual has happened in the country as a whole.
Then the outcome in California decides the whole election if the overall results outside California are close enough. In the “something very unusual in the country as a whole” case, they probably won’t be. In the “something very unusual in California” case, though, they might well be. At present 538 is predicting an Obama win by about 80 EVs, or about 25 without California (which would take its 55 EVs out of Obama’s column). It doesn’t have to be far wrong for a hypothetically-upset California to become decisive.
(Actually, of course, those “something very unusual in …” options are just two ends of a continuum. The point is that there’s a non-negligible region of that continuum in which, conditional on your vote being decisive in California, California is quite likely to be decisive overall.)
If my understanding of how 538 defines the decisive state is correct, then “probability of a single vote anywhere deciding the election” is the same as “probability of a single vote in California deciding the election conditional on California being the decisive state”. It’s possible that he does not [Edit: he doesn’t] conditionalize on close Electoral College outcomes when giving the probability that each state is decisive, and that this would make those two probabilities differ a little, but probably not by much. If we assume they are within a factor of 3 of each other, which seems reasonable, then the probability of a vote in California being decisive is still less than 1 in 300 million.
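To spell the arithmetic out (the ~1% tipping-point probability for California below is purely an illustrative assumption of mine, not a number 538 published):

Pr(a vote in CA decides the election)
= Pr(CA is the decisive state) × Pr(a single CA vote decides the election | CA is the decisive state)
≈ Pr(CA is the decisive state) × 10^-7
≈ 1% × 10^-7 = 10^-9,

and even multiplying by the factor-of-3 allowance leaves that below 1 in 300 million.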
If my understanding of how 538 defines the decisive state is correct, then “probability of a single vote anywhere deciding the election” is the same as “probability of a single vote in California deciding the election conditional on California being the decisive state”.
Why? (In any case, that might be true with 538’s model but not with that of Gelman et al. It wouldn’t be that surprising to find that combining bits of one model with bits of another leads to wrong conclusions.)
Gelman et al got a range of Pr(your vote is decisive), with a figure of about 1 in 100M for the “worst” state. Perhaps indeed the least-evenly-poised states now are more unbalanced than then.
By the way: It’s not clear to me what point you’re making. “Things have changed since the paper by Gelman et al, and voting is less reasonable now for Californians than it was then for anyone”? “Gelman et al must be wrong because their figures are inconsistent with 538’s”? “538 must be wrong because their figures are inconsistent with those of Gelman et al”? Or what? The first of those might be right, though I haven’t yet fully grasped your reasoning. The second and third don’t seem like reasonable conclusions; at most you could say “The two models can’t both be right in every detail” which is surely true and not very surprising.
Gelman et al didn’t have a definition of “the decisive state”. The only thing I got from the Gelman et al model is that the a priori probability of a randomly selected vote being decisive is 1 in 10 million. I don’t see any opportunity for error due to differences between the models there.
Gelman et al may have been right about the 1992 election (although I am a little suspicious about the fact that the spread between states is so narrow), but I am suggesting that Academian was wrong to use their results in the context of the 2012 election. So yes, the first of those 3 points that you suggested is what I mean.
Intuitively, Gelman et al say that a vote in California is more likely to swing the whole election than a randomly selected vote. This may have been true in 1992, but it can’t possibly be true now, as shown by the fact that neither campaign has made a serious effort to increase their vote totals in California, and no one considers it a swing state.
Gelman et al do have a definition of decisive state (though not exactly of “the decisive state”):
given that it is tied, neither party must have an electoral vote majority.
This isn’t quite the same as the 538 definition, which applies even when a state is not tied.
Gelman et al only got to the conclusion that the probability of a random vote being decisive is about 10^-7 by having a model of how different states’ votes relate to one another. They give a not-terribly-complete description of their model: each state’s vote is a linear function of a bunch of predictors, plus a per-state error, a per-region error, and a national error. This isn’t a million miles away from the 538 model, but it certainly isn’t identical. So the relationship between (e.g.) the probability of California being decisive, and the probability of California being evenly split, might be quite different in the Gelman et al and 538 models.
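Schematically, and only as precisely as their description allows (the notation here is mine):

(vote share in state s) ≈ X_s·β + δ_s + γ_region(s) + ε_national,

where δ_s is the per-state error, γ the per-region error, and ε_national the national error.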
In Gelman et al’s model (see Figure 3 in their paper), states less likely to be tied are more likely to be decisive if tied. (Because the states that are less likely to be tied are the larger ones, with more electoral votes.) Roughly, these factors cancel out, which is why they don’t see huge variations in Pr(your vote matters) according to state. Accordingly, they have California as very unlikely to be tied, and really quite likely to be decisive if tied. I repeat: this is not at all the same as saying that it’s at all likely actually to be decisive.
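In symbols, the relevant product is

Pr(your vote matters in state s) ≈ Pr(s is tied) × Pr(s is decisive | s is tied),

where the first factor is small for the big states (and for lopsided ones) while the second grows with a state’s electoral votes; that is the rough cancellation described above.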
The following toy example may help. There are exactly three states. One is solidly Red and has 2EV, one is solidly Blue and has 2EV, one is a slightly bluish Purple and has 1EV. The three states’ political fluctuations are completely independent of one another. Then: (1) Almost always, Purple is the decisive / swingiest / tipping-point state. 538 would give Red and Blue only a tiny chance of playing that role. But (2) conditional on Blue being tied, Blue will almost certainly be decisive.
So maybe the probabilities of being tied are 0.1% each for R and B, and 0.2% for Purple (yes, these are terribly small states), and Pr(Blue decisive) is also 0.1%, but Pr(Blue decisive | Blue tied) might be 90%. Then Gelman et al would say that Pr(your vote matters) is negligible in R, 0.09% in B, and 0.2% in P, and summarize that by saying “in most states, your vote has on the order of 0.1% chance of mattering”. And then you’d come along and look at the FiveThirtyEight numbers and say: “Aha, so since I’m in Blue which has 2⁄5 of the population and a 0.1% chance of being decisive, clearly the chance that my vote matters is about 0.1% × 5⁄2 × 0.1% = 0.00025%, which is far far less than Gelman et al said. Things must have changed.” But things haven’t changed; these hypothetical numbers are all for a single election; it’s just that you simply can’t legitimately combine the Gelman et al and FiveThirtyEight numbers in the way you’re trying to. The numbers don’t work that way.
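If it helps, here is the kind of quick Monte Carlo one could run to estimate these quantities in the toy setup. Every number in it (the mean vote shares, the noise, the “nearly tied” threshold) is an illustrative assumption of mine, not anything taken from 538 or from Gelman et al; it is only a sketch of how Pr(Blue is the tipping-point state) and Pr(tipping point | nearly tied) could be checked.

```python
import random

# Monte Carlo sketch of the three-state toy example above.
# All distributional choices are illustrative assumptions.

EV = {"Red": 2, "Blue": 2, "Purple": 1}               # 5 EV total, 3 for a majority
MEAN_BLUE_SHARE = {"Red": 0.40, "Blue": 0.60, "Purple": 0.55}
NOISE_SD = 0.03                                       # independent per-state noise
TRIALS = 200_000

def tipping_point(shares):
    """Return the state delivering the winner's 3rd electoral vote
    (538-style: order states from the winner's strongest to weakest)."""
    blue_ev = sum(ev for state, ev in EV.items() if shares[state] > 0.5)
    blue_wins = blue_ev >= 3
    order = sorted(EV, key=lambda s: shares[s], reverse=blue_wins)
    running = 0
    for state in order:
        running += EV[state]
        if running >= 3:
            return state

random.seed(0)
tip_blue = nearly_tied = tied_and_tip = 0
for _ in range(TRIALS):
    shares = {s: random.gauss(m, NOISE_SD) for s, m in MEAN_BLUE_SHARE.items()}
    tp = tipping_point(shares)
    close = abs(shares["Blue"] - 0.5) < 0.005          # proxy for "Blue is (nearly) tied"
    tip_blue += (tp == "Blue")
    nearly_tied += close
    tied_and_tip += (close and tp == "Blue")

print("Pr(Blue is tipping point)       ~", tip_blue / TRIALS)
print("Pr(Blue nearly tied)            ~", nearly_tied / TRIALS)
if nearly_tied:
    print("Pr(tipping point | nearly tied) ~", tied_and_tip / nearly_tied)
```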
Gelman et al only got to the conclusion that the probability of a random vote being decisive is about 10^-7 by having a model of how different states’ votes relate to one another.
True, that number may have changed somewhat. It may have decreased somewhat due to the voting population being larger, or increased somewhat due to the election being projected to be somewhat closer than 1992 turned out to be. But I’d expect the probability to mainly just shift between states. So 10^-7 made a pretty good baseline.
Accordingly, they have California as very unlikely to be tied, and really quite likely to be decisive if tied. I repeat: this is not at all the same as saying that it’s at all likely actually to be decisive.
Since you press the issue, I looked up how exactly Nate Silver defines the probabilities of being decisive that he uses on 538. He says:
The most rigorous way to define this is to sort the states in order of the most Democratic to the least Democratic, or most Republican to least Republican. Then count up the number of votes the candidate accumulates as he wins successively more difficult states. The state that provides him with the 270th electoral vote, clinching an Electoral College majority, is the swingiest state — the specific term I use for it is the “tipping point state.”
So you’re right; the probability of a state being decisive is not quite the same as the probability of it being decisive conditional on it being tied. [Edit: actually they are not even close to the same. And they wouldn’t have been even if 538 defined tipping-point state differently.] But (probability of single vote in California being decisive) = (probability of CA being the decisive state) * (probability that CA is tied given CA is the decisive state) = (probability of CA being the decisive state) * (probability of a randomly selected vote being a vote that ties CA given CA is the decisive state) / (probability of a randomly selected vote being cast in CA).
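Writing that chain out a bit more explicitly:

Pr(a given CA vote is decisive)
= Pr(CA is the decisive state) × Pr(CA is tied | CA is the decisive state)
= Pr(CA is the decisive state) × Pr(a random vote ties CA | CA is the decisive state) / Pr(a random vote is cast in CA).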
The assumption that I made was that (probability of a random vote tying CA given that CA is the decisive state) is close to (probability of a random vote tying whatever the decisive state happens to be), which seems fairly reasonable. I did NOT assume that (probability that CA is tied given that CA is the decisive state) is close to (probability that CA is tied). ( [Edit: ignore this parenthetical comment] At least I think I didn’t, but I’m tired right now, so it is conceivable that I could have made an error on that front. If I did mess that up, then the actual probability of a vote in CA swinging the election should be greater than 1 in 1 billion, but still probably less than 1 in 100 million.)
The three states’ political fluctuations are completely independent of one another.
That assumption is not even close to correct in US presidential elections. Not that that makes much of a difference (I think).
(2) conditional on Blue being tied, Blue will almost certainly be decisive.
Almost certainly? I thought you said Red was solid and Purple was only slightly bluish, so there should be a significant chance that Red and Purple both vote red, and Blue’s surprise red vote is redundant.
Pr(Blue decisive) is also 0.1%
Sounds like you are using the Gelman et al meaning of a state being decisive. Not what 538 calls being the tipping-point state, which is what I was using. That’s why you get garbage when you plug that into my formula.
The assumption that I made was that (probability of a random vote tying CA given that CA is the decisive state) is close to (probability of a random vote tying whatever the decisive state happens to be), which seems fairly reasonable.
So, first of all, there’s a factor of 2 error in there: your last equality says, in effect, Pr(CA tied | …) = Pr(a random vote ties CA | …, and that vote is in CA) but when CA is tied only half the votes there tie it.
I’m already late for work, so will look harder at the rest of what you’re saying later. (I find myself somewhat convinced both by my toy example and by the more-detailed argument you’re now making, but the two don’t seem consistent with one another. I expect I’m missing something.)
That assumption is not even close to correct in US presidential elections.
I know. That’s one reason why the example is a toy. But no part of your argument appeals to the correlations between states’ results, and the point of the toy example is to show that the calculation you’re doing produces completely wrong results there. So, in the absence of anything in your argument that makes it apply to the real election but not to the toy example, something’s got to be wrong with the argument.
Almost certainly?
Sorry, “slightly bluish” was meant to describe vote share rather than win probability. I’m assuming that P is a win for the Blue candidate about 90% of the time, which with simple-but-unrealistic models of voters will happen if it’s reasonably big and, say, 55% Blue.
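(For a concrete version of that: if Purple’s Blue vote share fluctuates around 55% with a standard deviation of about 4 percentage points, an illustrative number of mine, then Pr(share > 50%) = Pr(Z > −1.25) ≈ 0.89, which is roughly the 90% win probability assumed above.)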
Sounds like you are using the Gelman et al meaning of a state being decisive. Not what 538 calls being the tipping-point state, which is what I was using.
I was intending to use the 538 meaning. Pr(Blue decisive) is small because in almost all elections the state that gets the winner that crucial third EV—the one whose EV is in the middle when you line them up in order—is Purple. What do you find wrong with this reasoning?
So, first of all, there’s a factor of 2 error in there: your last equality says, in effect, Pr(CA tied | …) = Pr(a random vote ties CA | …, and that vote is in CA) but when CA is tied only half the votes there tie it.
Nope. Half the votes prevent a Romney victory, and the other half prevent an Obama victory.
I’m already late for work, so will look harder at the rest of what you’re saying later.
Your confusion is understandable, especially since I confused myself and started bullshitting you for a while before rederiving what I did in the first place. Sorry about that.
But no part of your argument appeals to the correlations between states’ results
That’s right. Sorry, I shouldn’t have been stressing the high correlations between voting fluctuations in different states.
Sorry, “slightly bluish” was meant to describe vote share rather than win probability. I’m assuming that P is a win for the Blue candidate about 90% of the time
Ah, ok.
I was intending to use the 538 meaning. Pr(Blue decisive) is small because in almost all elections the state that gets the winner that crucial third EV—the one whose EV is in the middle when you line them up in order—is Purple. What do you find wrong with this reasoning?
Numbers from your toy example:
Pr(Blue tied) = 0.1%
Pr(Blue decisive | Blue tied) = 90%
Pr(Blue decisive) = 0.1%
implications:
Pr(Blue decisive and tied) = 0.09%
Pr(Blue decisive and not tied) = 0.01%
This is not plausible. Presumably Pr(Blue decisive and votes blue by 1 vote) is also roughly 0.09%, in which case Pr(Blue decisive and not tied) cannot possibly be less than that. Assuming Red never enters the picture, Blue is decisive whenever it ends up voting more reddish than Purple does. Given how often Blue ties, I would expect this to actually happen fairly frequently.
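In inequality form: Pr(Blue decisive and not tied) ≥ Pr(Blue decisive and blue by exactly 1 vote) ≈ Pr(Blue decisive and tied) = 0.09%, which is inconsistent with the 0.01% implied above.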
Apologies for the slow response; I’ve been unreasonably busy. Executive summary of what follows: Yup, you were right.
So I tried generating more realistic numbers with the general structure of my toy example, and my conclusion is: Oops, you’re right and my example is no good. Sorry. And I think I agree with your simple probability-pushing argument that 538’s probability for California being decisive isn’t consistent with the numbers from Gelman et al being applicable in the 2012 election.
So, it seems to me that there are (at least) the following possibilities. (1) Gelman et al had a good model, and it remains reasonably applicable now, and 538 had too low a probability of California being decisive. (2) Gelman et al had a good model, but the political landscape has changed, and now California is less likely to be decisive than their model said it was in 1992. (3) Gelman et al had a screwed-up model, and their probabilities weren’t right even in 1992.
I agree with you that #2 is the least likely of these, and I offer the following statistic which, if cited at the outset, might have saved us a good deal of argument :-). In 1988, California went Republican by about 51:48. In 2012, California went Democratic by about 59:39.
I accordingly agree with you: Academian’s numbers for his own case, which used the Gelman et al figures for California, likely gave much too high an expected value for his vote in California.
I agree with you that #2 is the least likely of these, and I offer the following statistic which, if cited at the outset, might have saved us a good deal of argument :-). In 1988, California went Republican by about 51:48. In 2012, California went Democratic by about 59:39.
I assume you meant #2 is most likely? And you’re right; I should have pointed that out initially (even though it was before the election, I could have used 2008 figures).
Yes, of course I meant most likely. Duh. I’ve edited my comment for the benefit of our thousands of future readers.