Recall that in the last post I said that a voting system is a function that takes in a distribution on utility functions on a set of candidates, and produces a distribution on that set of candidates, and that voting theorists tend to make four assumptions:
The set of candidates is finite.
The function only uses the preorders on candidates implied by the utility functions.
The output distribution assigns probability 1 to a single candidate.
The input distribution is the uniform distribution on some finite set.
The fourth assumption doesn’t really matter, and isn’t really used, so we can just remove it. The third assumption, on the other hand, is important for many of the voting theory conclusions, but we are now going to see what happens if we remove it. We will keep the first two assumptions.
So now, we will be considering lotteries (i.e. distributions) over candidates, since that will be the output of our voting system.
Given a set C of candidates, lotteries A,B∈Δ(C), and an electorate V∈Δ(C→[0,1]), we say that AdominatesB if Pa∼A,b∼B,v∼V(v(a)>v(b))≥Pa∼A,b∼B,v∼V(v(b)>v(a)).
Lets unpack this definition. We are given three distributions V, a distribution on voters, and A and B, distributions on candidates. We are imagining independently sampling from these three distributions to get v, a, and b respectively. When we do this, one of three things will happen:
v will prefer a to b. (v(a)>v(b))
v will prefer b to a. (v(b)>v(a))
v will be indifferent. (v(a)=v(b))
We say that A dominates B if the first outcome is at least as likely as the second outcome. Note that we don’t need to look at the actual utility functions to determine whether A dominates B, we only need to look at the partial ordering of preferences over candidates, so we are not violating assumption 2. Also note that domination need not be transitive.
Maximal Lotteries
Given a set C of candidates, and an electorate V∈Δ(C→[0,1]), a maximal lottery is an M∈Δ(C) such that for all other L∈Δ(C), M dominates L.
Sounds great, but surely that is too strong a condition, domination isn’t even transitive.
Doesn’t matter. Maximal lotteries always exist!
What? Is this some silly Brouwer’s fixed point thing, where you can’t find them, and they aren’t unique?
Nope. Maximal lotteries are basically unique. The only reason they aren’t unique is because of ties. For example, if your electorate is the uniform distribution on an odd number of voters that are never indifferent, there will be a unique maximal lottery. And you can find them quickly.
Wow, so this is like a proper voting system then: given an electorate, just output the maximal lottery.
Yep.
Is it any good?
Well, it’s Condorcet.
Wow! This is has got to be the most elegant Condorcet voting system I have ever seen! Most of the Condorcet systems I have seen have basically looked like they took the Condorcet criterion, and then wrote an algorithm around it. Is it Clone invariant?
Yep.....And it satisfies consistency and participation.
What? Those are incompatible with Condorcet!
Only for deterministic voting systems!
Wow! I don’t really have an intuition for probabilistic voting systems, There must be all sorts of cool stuff you can do once you allow randomness.
No, this is basically it. You could choose a random voter to be dictator, which is nice for being strategy-proof, but that isn’t Condorcet. In fact, Maximal Lotteries are uniquely characterized as the only probabilistic voting system that is Condorcet, consistent, and clone invariant. (See this paper by Florian Brandl, Felix Brandt, and Hans Georg Seedig.)
I’m sold. But are people ready to start flipping coins in elections?
They usually don’t have to! Remember, maximal lotteries satisfy the Condorcet Criterion. So whenever there is a Condorcet winner, that candidate will just win deterministically, and you won’t need to randomize at all!
Sketches of the Most Important Arguments
I will sketch arguments for the existence of maximal lotteries, and the fact that they are Condorcet and consistent. I won’t bother proving that they are (usually) unique, satisfy clone invariance and participation, or any unique characterization, as these are harder and also the main thrust of the miracle of maximal lotteries is the fact that they bridge the gap between Condorcet and consistency. For that, I defer here and here.
Existence of Maximal Lotteries
Imagine a two player zero-sum game between Alice and Bob. Alice and Bob each choose a candidate. Alice gets utility equal to the probability that a randomly chosen voter prefers her candidate to Bob’s candidate minus the probability that a randomly chosen voter prefers Bob’s candidate to hers. Bob, symmetrically, gets the negative of this utility.
This game has a Nash equilibrium. (Since it is a two player zero sum game, we also get nice properties, like the ability to compute the Nash equilibrium, and the fact that the set of Nash equilibria is convex.) In that Nash equilibrium, Alice and Bob must both used mixed strategies that are Maximal Lotteries.
Condorcet
Imagine there is a Condorcet winner, c. Let B be the distribution that assigns probability 1 to c, and let A be a maximal lottery. Since A dominates B, if a is sampled from A, and v is sampled from V, the probability that v(a)>v(c) must be at least the probability that v(c)>v(a). Since c is a Condorcet winner, for all d≠c, the probability that v(c)>v(d) is greater than 12. Thus, the probability that v(c)>v(a) is greater that 12 times the probability that a≠c. Thus, the probability that v(a)>v(c) is also greater than 12 times the probability that a≠c. Thus, given that a≠c, the probability that v(c)>v(a) and that v(a)>v(c) are each greater that 12. This is not possible, so the condition that a≠c must happen with probability 0. Thus, A assigns probability one to c, so the only maximal lottery assigns probability one to c.
Consistency
Imagine A is a maximal lottery for electorate V0 and for electorate V1. For any other lottery B, A dominates B for both electorates. Let V2=pV0+(1−p)V1. Imagine sampling a candidate a from A, b from B, and a voter v from V2. Imagine that v was sampled by first flipping a coin that comes up heads with probability p, and then sampling from V0 in the case of heads and V1 in the case of tails. Conditioned on heads, the probability that v(a)>v(b) is at least the probability that v(b)>v(a). Conditioned on tails this is also true. Thus, the unconditioned probability that v(a)>v(b) is at least the unconditioned probability that v(b)>v(a), so A dominates B for the electorate V2. Since B was arbitrary, A is a maximal lottery for electorate V2.
Maximal Lotteries vs Random Dictatorship
Let us now compare maximal lotteries to random dictatorship. In random dictatorship a voter is sampled at random, and their favorite candidate wins. Random dictatorship is nice because it is strategy proof. Ignoring coalitions, you should always rank you favorite candidate most highly.
I think that maximal lotteries and random dictatorship are actually incomparable. Which one you should use depends on context. Here is a quick test:
Imagine there are only two candidates c and d. Imagine that 51% of voters prefer c to d. If you think this means c should win with probability 100%, then perhaps you should use maximal lotteries. If you think you should flip a coin and c should win with probability 51%, then perhaps you should use random dictatorship.
If you think of democracy as replacing the need for war, and voting power is proportional to how many resources you would have in a war, then if the winner of the war would be determined randomly proportionally to resources, then random dictatorship makes more sense, If the side with the most resources would win with certainty, than maximal lotteries makes more sense.
The fact that c does not win with certainty in random dictatorship is what makes random dictatorship not Condorcet.
This test is not perfect. There are other factors. Often there is a trade off between strategy-proof-ness and Pareto optimality, and while neither is perfect in either criteria, Random Dictatorship is more strategy proof and maximal lotteries are more optimal. (I believe that if you have a bunch of random utility functions, maximal lotteries will lead to higher expected average utility than random dictatorship, but I haven’t checked this.) Maximal lotteries is also more robust to a few bad actors, If a minority (or even one person) wants to choose a really bad candidate, random dictatorship will choose that candidate with a small probability. Maximal lotteries will not.
Fixing U.S. Presidential Elections
So now that we have identified the best voting system as maximal lotteries, we just need to use it in practice.
You might think that given all the math and the randomness and weirdness, it is politically infeasible to implement maximal lotteries. But I think it is actually very feasible. In fact, we are actually pretty close.
One way to compute (and sample from) the maximal lottery is to simulate a two player game. If the game is set up correctly the two players will play Nash equilibria, and will thus choose candidates according to the maximal lottery. In the United States, these two players are the Democrats and the Republicans.
To more accurately compute the maximal lottery, we must:
Eliminate all third-parties. We need a two player game.
The game needs to be zero-sum. Both parties need to care about nothing other than winning. Anyone who votes in primary election should care only about electability. All that matters is that the candidate chosen by their party wins. The views of the candidates are irrelevant except in so far as the change electability. I believe we can achieve this by increasing political polarization and tribal feelings.
The two parties must choose their candidates independently. We already have some of this by having the primaries take place at the same time, but it will help if members of different parties do not talk to each other at all.
The above will only set up the game where the two parties are trying to select the candidate that would win in a one-on-one election. We actually need the two parties to try to select the candidate that would be selected by a randomly chosen voter. To achieve this, we need to decrease voter turnout in the general election. If every voter flips a coin and stays home with probability 1−ε, then with probability almost one nobody will vote, and the president will be selected from the two candidates by other means, but since it will be from the nominees, it will be according to maximal lotteries. With probability on the order of ε, one person will vote chosen at random. The probability that more than one person will vote is negligible, so both parties will select the candidate that is most likely to be preferred by one randomly selected voter, and thus select candidates according to maximal lotteries!
As a sanity check that we have done this correctly, we can just look at the policies of the nominees. Any ability to predict correlations between views of candidates and political parties means we have failed (probably by not having enough polarization). Since the game is symmetric, every election cycle the two candidates should be indistinguishable in expectation.
(As a stretch goal, we can approximate the proposal in my next post by having obnoxiously large numbers of virtually indistinguishable candidates in the primary elections.)
Maximal Lotteries
Probabilistic Voting Theory
Recall that in the last post I said that a voting system is a function that takes in a distribution on utility functions on a set of candidates, and produces a distribution on that set of candidates, and that voting theorists tend to make four assumptions:
The set of candidates is finite.
The function only uses the preorders on candidates implied by the utility functions.
The output distribution assigns probability 1 to a single candidate.
The input distribution is the uniform distribution on some finite set.
The fourth assumption doesn’t really matter, and isn’t really used, so we can just remove it. The third assumption, on the other hand, is important for many of the voting theory conclusions, but we are now going to see what happens if we remove it. We will keep the first two assumptions.
So now, we will be considering lotteries (i.e. distributions) over candidates, since that will be the output of our voting system.
Given a set C of candidates, lotteries A,B∈Δ(C), and an electorate V∈Δ(C→[0,1]), we say that A dominates B if Pa∼A,b∼B,v∼V(v(a)>v(b))≥Pa∼A,b∼B,v∼V(v(b)>v(a)).
Lets unpack this definition. We are given three distributions V, a distribution on voters, and A and B, distributions on candidates. We are imagining independently sampling from these three distributions to get v, a, and b respectively. When we do this, one of three things will happen:
v will prefer a to b. (v(a)>v(b))
v will prefer b to a. (v(b)>v(a))
v will be indifferent. (v(a)=v(b))
We say that A dominates B if the first outcome is at least as likely as the second outcome. Note that we don’t need to look at the actual utility functions to determine whether A dominates B, we only need to look at the partial ordering of preferences over candidates, so we are not violating assumption 2. Also note that domination need not be transitive.
Maximal Lotteries
Given a set C of candidates, and an electorate V∈Δ(C→[0,1]), a maximal lottery is an M∈Δ(C) such that for all other L∈Δ(C), M dominates L.
Sounds great, but surely that is too strong a condition, domination isn’t even transitive.
Doesn’t matter. Maximal lotteries always exist!
What? Is this some silly Brouwer’s fixed point thing, where you can’t find them, and they aren’t unique?
Nope. Maximal lotteries are basically unique. The only reason they aren’t unique is because of ties. For example, if your electorate is the uniform distribution on an odd number of voters that are never indifferent, there will be a unique maximal lottery. And you can find them quickly.
Wow, so this is like a proper voting system then: given an electorate, just output the maximal lottery.
Yep.
Is it any good?
Well, it’s Condorcet.
Wow! This is has got to be the most elegant Condorcet voting system I have ever seen! Most of the Condorcet systems I have seen have basically looked like they took the Condorcet criterion, and then wrote an algorithm around it. Is it Clone invariant?
Yep.....And it satisfies consistency and participation.
What? Those are incompatible with Condorcet!
Only for deterministic voting systems!
Wow! I don’t really have an intuition for probabilistic voting systems, There must be all sorts of cool stuff you can do once you allow randomness.
No, this is basically it. You could choose a random voter to be dictator, which is nice for being strategy-proof, but that isn’t Condorcet. In fact, Maximal Lotteries are uniquely characterized as the only probabilistic voting system that is Condorcet, consistent, and clone invariant. (See this paper by Florian Brandl, Felix Brandt, and Hans Georg Seedig.)
I’m sold. But are people ready to start flipping coins in elections?
They usually don’t have to! Remember, maximal lotteries satisfy the Condorcet Criterion. So whenever there is a Condorcet winner, that candidate will just win deterministically, and you won’t need to randomize at all!
Sketches of the Most Important Arguments
I will sketch arguments for the existence of maximal lotteries, and the fact that they are Condorcet and consistent. I won’t bother proving that they are (usually) unique, satisfy clone invariance and participation, or any unique characterization, as these are harder and also the main thrust of the miracle of maximal lotteries is the fact that they bridge the gap between Condorcet and consistency. For that, I defer here and here.
Existence of Maximal Lotteries
Imagine a two player zero-sum game between Alice and Bob. Alice and Bob each choose a candidate. Alice gets utility equal to the probability that a randomly chosen voter prefers her candidate to Bob’s candidate minus the probability that a randomly chosen voter prefers Bob’s candidate to hers. Bob, symmetrically, gets the negative of this utility.
This game has a Nash equilibrium. (Since it is a two player zero sum game, we also get nice properties, like the ability to compute the Nash equilibrium, and the fact that the set of Nash equilibria is convex.) In that Nash equilibrium, Alice and Bob must both used mixed strategies that are Maximal Lotteries.
Condorcet
Imagine there is a Condorcet winner, c. Let B be the distribution that assigns probability 1 to c, and let A be a maximal lottery. Since A dominates B, if a is sampled from A, and v is sampled from V, the probability that v(a)>v(c) must be at least the probability that v(c)>v(a). Since c is a Condorcet winner, for all d≠c, the probability that v(c)>v(d) is greater than 12. Thus, the probability that v(c)>v(a) is greater that 12 times the probability that a≠c. Thus, the probability that v(a)>v(c) is also greater than 12 times the probability that a≠c. Thus, given that a≠c, the probability that v(c)>v(a) and that v(a)>v(c) are each greater that 12. This is not possible, so the condition that a≠c must happen with probability 0. Thus, A assigns probability one to c, so the only maximal lottery assigns probability one to c.
Consistency
Imagine A is a maximal lottery for electorate V0 and for electorate V1. For any other lottery B, A dominates B for both electorates. Let V2=pV0+(1−p)V1. Imagine sampling a candidate a from A, b from B, and a voter v from V2. Imagine that v was sampled by first flipping a coin that comes up heads with probability p, and then sampling from V0 in the case of heads and V1 in the case of tails. Conditioned on heads, the probability that v(a)>v(b) is at least the probability that v(b)>v(a). Conditioned on tails this is also true. Thus, the unconditioned probability that v(a)>v(b) is at least the unconditioned probability that v(b)>v(a), so A dominates B for the electorate V2. Since B was arbitrary, A is a maximal lottery for electorate V2.
Maximal Lotteries vs Random Dictatorship
Let us now compare maximal lotteries to random dictatorship. In random dictatorship a voter is sampled at random, and their favorite candidate wins. Random dictatorship is nice because it is strategy proof. Ignoring coalitions, you should always rank you favorite candidate most highly.
I think that maximal lotteries and random dictatorship are actually incomparable. Which one you should use depends on context. Here is a quick test:
Imagine there are only two candidates c and d. Imagine that 51% of voters prefer c to d. If you think this means c should win with probability 100%, then perhaps you should use maximal lotteries. If you think you should flip a coin and c should win with probability 51%, then perhaps you should use random dictatorship.
If you think of democracy as replacing the need for war, and voting power is proportional to how many resources you would have in a war, then if the winner of the war would be determined randomly proportionally to resources, then random dictatorship makes more sense, If the side with the most resources would win with certainty, than maximal lotteries makes more sense.
The fact that c does not win with certainty in random dictatorship is what makes random dictatorship not Condorcet.
This test is not perfect. There are other factors. Often there is a trade off between strategy-proof-ness and Pareto optimality, and while neither is perfect in either criteria, Random Dictatorship is more strategy proof and maximal lotteries are more optimal. (I believe that if you have a bunch of random utility functions, maximal lotteries will lead to higher expected average utility than random dictatorship, but I haven’t checked this.) Maximal lotteries is also more robust to a few bad actors, If a minority (or even one person) wants to choose a really bad candidate, random dictatorship will choose that candidate with a small probability. Maximal lotteries will not.
Fixing U.S. Presidential Elections
So now that we have identified the best voting system as maximal lotteries, we just need to use it in practice.
You might think that given all the math and the randomness and weirdness, it is politically infeasible to implement maximal lotteries. But I think it is actually very feasible. In fact, we are actually pretty close.
One way to compute (and sample from) the maximal lottery is to simulate a two player game. If the game is set up correctly the two players will play Nash equilibria, and will thus choose candidates according to the maximal lottery. In the United States, these two players are the Democrats and the Republicans.
To more accurately compute the maximal lottery, we must:
Eliminate all third-parties. We need a two player game.
The game needs to be zero-sum. Both parties need to care about nothing other than winning. Anyone who votes in primary election should care only about electability. All that matters is that the candidate chosen by their party wins. The views of the candidates are irrelevant except in so far as the change electability. I believe we can achieve this by increasing political polarization and tribal feelings.
The two parties must choose their candidates independently. We already have some of this by having the primaries take place at the same time, but it will help if members of different parties do not talk to each other at all.
The above will only set up the game where the two parties are trying to select the candidate that would win in a one-on-one election. We actually need the two parties to try to select the candidate that would be selected by a randomly chosen voter. To achieve this, we need to decrease voter turnout in the general election. If every voter flips a coin and stays home with probability 1−ε, then with probability almost one nobody will vote, and the president will be selected from the two candidates by other means, but since it will be from the nominees, it will be according to maximal lotteries. With probability on the order of ε, one person will vote chosen at random. The probability that more than one person will vote is negligible, so both parties will select the candidate that is most likely to be preferred by one randomly selected voter, and thus select candidates according to maximal lotteries!
As a sanity check that we have done this correctly, we can just look at the policies of the nominees. Any ability to predict correlations between views of candidates and political parties means we have failed (probably by not having enough polarization). Since the game is symmetric, every election cycle the two candidates should be indistinguishable in expectation.
(As a stretch goal, we can approximate the proposal in my next post by having obnoxiously large numbers of virtually indistinguishable candidates in the primary elections.)