Ambiguity in Prediction Market Resolution is Still Harmful
A brief followup to this post in light of recent events.
Free and Fair Elections
Polymarket has an open market ‘Venezuela Presidential Election Winner’. Its description is as follows:
The 2024 Venezuela presidential election is scheduled to take place on July 28, 2024.
This market will resolve to “Yes” if Nicolás Maduro wins. Otherwise, this market will resolve to “No.”
This market includes any potential second round. If the result of this election isn’t known by December 31, 2024, 11:59 PM ET, the market will resolve to “No.”
In the case of a two-round election, if this candidate is eliminated before the second round this market may immediately resolve to “No”.
The primary resolution source for this market will be official information from Venezuela, however a consensus of credible reporting will also suffice.
Can you see any ambiguity in this specification? Any way in which, in a nation whose 2018 elections “[did] not in any way fulfill minimal conditions for free and credible elections” according to the UN, there could end up being ambiguity in how this market should resolve?
If so, I have bad news and worse news.
The bad news is that Polymarket could not, and so this market is currently in a disputed-outcome state after Maduro’s government announced a more-official-but-almost-certainly-faked election win, while the opposition announced somewhat-more-credible figures in which they won.
The worse news is that $3,546,397 has been bet on that market as of this writing.
How should that market resolve? I am not certain! Commenters on the market have...ah...strong views in both directions. And the description of the market does not make it entirely clear. If I were in charge of resolving this market I would probably resolve it to Maduro, just off the phrase about the ‘primary resolution source’. However, I don’t think that’s unambiguous, and I would feel much happier if the market had begun with a wording that made it clear how a scenario like this would be treated.
(Update 8/2: market is still unresolved, Maduro trading at 75%).
(Update 8/5: market still unresolved, Maduro trading at 57%).
(Update 8/6: the market, with $6.15M bet, has resolved to opposition candidate Edmundo González).
How did other markets do?
I’ve given Manifold a hard time on similar issues in the past, but they actually did a lot better here. There is a ‘Who will win Venezuela’s 2024 presidential election’ market, but it’s clear that it “Resolves to the person the CNE declares as winner of the 2024 presidential elections in Venezuela” (the CNE is Venezuela’s National Electoral Council, so that would be Maduro). There are a variety of “Who will be the president of Venezuela on [DATE]” markets, which have the potential to be ambiguous but at least should be better.
Metaculus did (in my opinion) a bit better than Polymarket but worse than Manifold on the wording, with a market that resolves “based on the official results released by the National Electoral Council of Venezuela or other credible sources,” a description which, ah, seems to assume something about the credibility of the CNE. Nevertheless, they’ve resolved it to Maduro (I think correctly given that wording).
On the other hand, neither of these markets had $3.5M bet on them. So.
What does this mean for prediction markets?
This is really nowhere near as bad as this can get:
Venezuelan elections are not all that important to the world (sorry, Venezuelans), and I don’t think they get all that much interest compared to other elections, or other events in general. (Polymarket has $3.5M on the Venezuelan election. It has $459M on the US election, $68M on the US Democratic VP nominee, and both $2.4M on ‘most medals in the Paris Olympics’ and $2.2M on ‘most gold medals in the Paris Olympics’).
Venezuela’s corruption is well-known. I don’t think anyone seriously believes Maduro legitimately won the election. I don’t think it was hard to realize in advance that something like this was a credible outcome. There is really very little ambiguity about the actual nature of reality here!
Venezuela is sufficiently dictatorial that all ‘official sources’ are likely to announce the same thing. There isn’t likely to be e.g. disagreement between two different parts of the Venezuelan government on who won the election.
How would current prediction markets do in the 2000 Bush-Gore US election? Or, more to the point, how will they do the next time something even slightly unexpected happens, when it turns out that their wording did not quite predict it?
And when that inevitably happens, will there be tens of millions of dollars invested in the question?
I don’t support e.g. the CFTC decision to try to ban prediction markets entirely. I think prediction markets are a potentially interesting tool. But seeing things like this happen (over and over) makes me less and less optimistic about prediction markets as a way to resolve questions that are even slightly complicated or controversial. And if you want prediction markets used broadly as a way of getting trustworthy information on complicated issues, I think you need to recognize this as a major problem.
I find the datapoint interesting, but I don’t really see why it’s much evidence of this being “a huge issue”. Most markets resolve neatly. It seems fine for 5% of markets or so to end up in dispute. People can price in a bunch of the resolution uncertainty. Do we have any evidence this is seriously hurting prediction market adoption or performance?
I agree that most markets resolve successfully, but think we might not be on the same page on how big a deal it is for 5% of markets to end up ambiguous.
If someone offered you a security with 95% odds to track Google stock performance and 5% odds to instead track how many hairs were on Sundar Pichai’s head, this would not be a great security! A stock market that worked like that would not be a great stock market!
In particular:
1. I think this ambiguity is a massive blow to arbitrage strategies (which are a big part of the financial infrastructure we’re hoping will make prediction markets accurate). There are already a lot of barriers in the way of profiting from a situation where e.g. one market says 70% and one market says 80%: if there’s a chance that the two will resolve some ambiguity differently, that adds a very big risk to anyone attempting to arbitrage that difference (see the toy calculation after this list).[1]
2. I think this ambiguity is very dangerous to the hopes of prediction markets as a trustworthy canonical source on controversial events. If Maduro says “See, everyone, I won the election fair and square, the prediction markets agree!”[2], I think it’s very important that the prediction markets be perfectly clear on what they are tracking and what they are not tracking, and I don’t think that’s currently the case.
3. Related to #2, I think that this ambiguity is vastly more likely to occur in the case of the controversial events where it’s most valuable to have a trustworthy and canonical source. The cases where markets resolve cleanly are exactly the cases where non-prediction-market mechanisms already reach a canonical consensus on their own.
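As a toy illustration of point 1 (hypothetical prices and a made-up divergence probability, not figures from any real market), here is what resolution-divergence risk does to a simple cross-venue arbitrage:

```python
# Toy cross-venue arbitrage with resolution-divergence risk.
# All numbers are hypothetical, not taken from any real market.

price_yes_a = 0.70   # venue A prices YES at 70%
price_no_b = 0.20    # venue B prices YES at 80%, so a NO share costs 20 cents
cost = price_yes_a + price_no_b          # 0.90 to put on both legs

profit_if_agree = 1.00 - cost            # exactly one leg pays out: +0.10
profit_if_diverge_up = 2.00 - cost       # A resolves YES, B resolves NO: both legs pay: +1.10
profit_if_diverge_down = 0.00 - cost     # A resolves NO, B resolves YES: neither pays: -0.90

q = 0.05  # assumed chance the two venues resolve an ambiguity differently
expected = (1 - q) * profit_if_agree + q * 0.5 * (profit_if_diverge_up + profit_if_diverge_down)

print(f"profit if both venues agree: {profit_if_agree:+.2f}")
print(f"worst case if they diverge:  {profit_if_diverge_down:+.2f}")
print(f"expected value with q={q}:   {expected:+.2f}")
```

Under this symmetric model the expected value is unchanged, but the trade is no longer risk-free: one divergent resolution wipes out roughly nine successful trades, and in practice the divergence may well be biased against the arbitrageur, since the price gap can exist precisely because the two wordings might come apart.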
[1] This also makes Manifold’s preferred strategy of dealing with ambiguity by N/A-ing a market less valuable: that’s an acceptable resolution for someone who just did buy-and-hold on that one market, but can be very bad for someone who was trading actively across multiple markets some of which N/A-ed and some of which did not.
[2] This imagines a world where prediction markets are major enough and mainstream enough for people to be looking at them and talking about them: but that’s exactly what prediction market advocates want!
I disagree with the ‘5% chance of switching to tracking Sundar Pichai’s hairs’ simile:
Prediction market prices are bounded between 0 and 1
Polymarket has > 1k markets, and maybe 3 to 10 ambiguous resolutions a year. It’s more like 0.3% to 1%.
My sense is this security would be fine? Is there a big issue with this being a security?
In most domains except the most hardened part of the stock market counterparty risk is generally >5%. The key issues come when failure is correlated, but it seems to me indeed that in prediction markets it’s pretty random which way ambiguity resolves, and so you get pretty uncorrelated failures (like, if you are invested in 10,000 markets, while it might be the case that 500 of them resolve in a surprising and ambiguous way, you will pretty randomly be on the losing or winning side of it, so it mostly just cancels out).
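A rough Monte Carlo sketch of that cancellation intuition (all numbers hypothetical: a trader with a small edge buying into 10,000 independent markets, some fraction of which resolve by an effective coin flip):

```python
import random

random.seed(0)

def total_profit(n_markets: int, misresolve_rate: float, edge: float = 0.02) -> float:
    """Profit from buying one YES share in each of n_markets independent markets."""
    total = 0.0
    for _ in range(n_markets):
        price = random.uniform(0.05, 0.95)      # price paid per YES share
        true_prob = min(price + edge, 0.99)     # trader only buys slightly underpriced YES
        if random.random() < misresolve_rate:
            wins = random.random() < 0.5        # ambiguous market: resolution is a coin flip
        else:
            wins = random.random() < true_prob  # clean market: resolution tracks reality
        total += (1.0 - price) if wins else -price
    return total

print("no mis-resolutions:        ", round(total_profit(10_000, 0.00)))
print("5% random mis-resolutions: ", round(total_profit(10_000, 0.05)))
```

Under these assumptions the mis-resolutions mostly add noise rather than a systematic loss, which is the ‘cancels out’ claim; whether real-world failures are that cheap and that uncorrelated is what the rest of the thread disputes.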
This seems quite wrong to me:
High-yield corporate bond OAS (option-adjusted spreads) are <5% according to Bloomberg, and most of that is economic risk, not “you will get screwed by a change of rules” risk.
Trades on US stock exchanges almost always succeed, many more 9s than just one.
If I buy a product in a box in a supermarket the contents of the box match the label >>95% of the time.
Banks make errors with depositor balances <<5% of the time.
Most employers manage to pay fortnightly wages on time without missing one or more paycheques per year.
Once you’re seated in an Uber or taxi they take you to your destination almost all the time.
Your utility company fulfills its obligations to supply your house >>95% of the time under all but the most extreme circumstances.
Most employees turn up >95% of non-holiday days, and most students maintain >95% attendance.
Sorry, I wanted to say “except the most hardened parts of the world (like the stock market)”. I agree with you that basically anything in the stock market has much less counterparty risk than that. I disagree with basically all non-trading examples you give.
My sense is around 1⁄20 Ubers don’t show up, or if they show up, fail to do their job in some pretty obvious and clear way.
True for the most commoditized products. For anything else, error rates seem to me to be around 5%. My guess is my overall Amazon error rate has been around 2%, which is lower, but not much lower (usually Amazon sending me something broken that had previously been returned and whose defect they couldn’t spot).
I think that’s false; at least, the statistics on wage theft seemed quite substantial to me. I am kind of confused about how to interpret these, but various studies on Wikipedia suggest wage theft is on average around 5%-15% (higher among lower-income workers).
I agree this is true for gas and water (and mostly true for electricity, though PG&E is terrible and Berkeley really has a lot of outages).
Overall, I think 5% counterparty risk seems about right for most contracts I sign or business relationships I have. I agree that trading infrastructure is quite robust and that in highly commoditized environments you get below that, but that’s not the majority of my economic transactions.
It’s not just the stock market, it’s true for the bond market, the derivatives market, the commodities market… financial markets, a category which includes prediction markets, cannot function effectively with counterparty risk anything like 5%.
If the Uber doesn’t show up I’m not sure that’s counterparty risk: you haven’t paid anything, so it seems more like them declining the contract. The equivalent for a prediction market would be if you hit ‘buy’ and the button didn’t work, not you paying the money and then having the payout taken from you. That’s much less bad than if the trade went through and then was settled incorrectly.
I think those studies have significant methodological flaws, though unfortunately I can’t remember the specific issues off the top of my head, so this may not be very convincing to you.
According to the first Google hit, PG&E said the average customer suffered 255.9 minutes of outage in 2013, which is a lot higher than I expected, but is still only 100*255.9/(60*24*365) ≈ 0.05%.
Hmm, maybe I am just failing to model something here. Isn’t the only thing that really happens when you have 5% randomly-distributed counterparty risk that you end up with something like 5% spreads? That seems fine to me.
To be clear, I don’t feel very confident here, I just don’t really understand why you can’t just price in counterparty risk and then maybe end up with some bigger spreads (which I do agree is sad for prediction markets, but for most markets I don’t mind the spread that much).
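One minimal way to see what ‘pricing it in’ would look like (a toy model, assuming that when a market is ambiguous it resolves by an effective coin flip, independent of the underlying event): the fair price gets pulled toward 50%.

```python
def fair_price(true_prob: float, q_ambiguous: float = 0.05) -> float:
    """Risk-neutral price of a YES share when, with probability q_ambiguous,
    the market resolves by a coin flip instead of tracking the real outcome."""
    return (1 - q_ambiguous) * true_prob + q_ambiguous * 0.5

for p in (0.99, 0.90, 0.50, 0.10, 0.01):
    print(f"true probability {p:.2f} -> fair quoted price {fair_price(p):.3f}")
```

Under this toy model a 5% ambiguity rate compresses quoted prices toward 50% (they can never go above about 97.5% or below about 2.5%, however certain the underlying event is), on top of whatever extra spread market makers charge.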
In the sense that it would find a market-clearing price, it’s fine. But in the sense of its price movements being informative...well. Say the price of that security has just dropped by 10%.
Is the market saying that bad news about Google’s new AI model signals poor long-term prospects? Is it indicating that increased regulatory scrutiny is likely to be bad for Google’s profitability?
Or is Sundar Pichai going bald?
I mean, I feel like random things affect the price of securities all the time. During early COVID random fiscal policy decisions had a much bigger effect on the stock price of companies than their actual competence. Similarly, COVID itself of course had huge effects.
I feel like it’s normal that when the stock price of a company moves, this often has little to do with the company, but can be traced back to kind of “random” other things. In this case, the stock price would go down, and it would be pretty easy to check whether that was because something related to the resolution criteria changed, or whether something “core” to the company changed.
Most Polymarket markets resolve neatly, I’d also estimate <5% contentious.
For myself, and I’d guess many LW users, the AI-related questions on Manifold and Metaculus are of particular interest though, and these are a lot worse. My guesses as to the state of affairs there:
33% of AI-related questions on Metaculus having significant ambiguity (shifting my credence by >10%).
66% of AI-related questions on Manifold having significant ambiguity.
For example, most AI benchmarking questions do not specify whether or not they allow things like N-trajectory majority vote or web search. And, most of the ambiguities I’m thinking of are worse than this.
On AI, I expect bringing down the ambiguity rate by a factor of 2 would be quite easy, but getting to 5% sounds hard. I wrote up my suggestions for Manifold here a few days ago. For Metaculus, I think they’d benefit from having a dedicated AI-benchmarking mod who is familiar with common ambiguities in that area (they might already have one, but they should be assigned by default).
We can expect analogous things to happen in the future, so I guess the answer is to wait until this is somehow resolved… and then write an official note saying “situations like X, unless explicitly specified otherwise, will be interpreted as Y”.
For example, “if the election is fraudulent, but the guy is inaugurated as president, we count it as ‘won the election’, unless the bet specifically mentions ‘fair election’ or something like that”.
Over years, notes like this will accumulate, and future cases can be solved using precedents.
I’m confused about how this is ambiguous? It’s sort of awkward that “official information from Venezuela” and “a consensus of credible reporting” give different answers, but it’s clear that the official info is primary.
Update: the market has resolved to Edmundo González (not to Maduro). If you think this is not the right resolution given the wording, I agree with you. But if you think the wording was clear and unambiguous to begin with, I think this should suggest otherwise.
I don’t think ‘official information from Venezuela’ is fully unambiguous. What should happen if the CNE declares Maduro the winner, but Venezuela’s National Assembly refuses to acknowledge Maduro’s win and appoints someone else to the presidency? This is not a pure hypothetical, this literally happened in 2018! Do we need to wait on resolving the market until we see whether that happens again?
I agree that resolving that market to Maduro is probably the right approach right now. But I don’t actually think the market description is entirely clear even now, and I think it would be very easy for things to have turned out much much worse.
Edited to add: it also seems that many people on Polymarket don’t find it unambiguous in that way. The market is currently trading at 84% for Maduro with noticeable recent activity, which does not seem like what should happen if everyone is clear on the resolution.
But the market isn’t “who will eventually become president”, it’s “who will win the election (according to official sources)”. Like how “who will win the US election (according to AP, Fox and NBC)” and “who will be president on inauguration day” are different questions.
The standard of “what if the result changes” would make almost any market impossible to resolve. Like what if AP/Fox/NBC call the election for Harris, but then Trump does a coup and threatens them until they announce that actually he won? What if Trump wins but the person who actually gets sworn in is an actor who looks like Trump? Do we need to wait to see if that happens before we resolve the question?
Making most questions not resolve at all is worse than weird edge cases where they resolve in ways people don’t like, so I think in the absence of clear rules that the question won’t resolve until some standard is met, resolving as soon as possible seems like the best default.
I agree that that would probably be the reasonable thing to do at this point. However, that’s not actually what the Polymarket market has done—it’s still Disputed, and in fact Maduro has traded down to 81%.
And I think a large portion of the reason why this has happened is poor decision-making in how the question was initially worded.
Edited to add: Maduro is now down to 63% in the market, probably because the US government announced that it thinks his opponent won? No, that’s not an official Venezuelan source. But it seems to have moved the market anyway.