This seems like a really hard problem: if a market like this “wins,” so that having a lot of points makes you high-status, people will try to game it, and if gaming it is easy, this will kill respect for the market.
Specific gaming strategies I can think of:
Sybil attacks: I create one “real” account and 100 sock puppets; my sock puppets make dumb bets against my real account; my real account gains points, and I discard my sock puppets. Defenses I’ve heard of against Sybil attacks: make it costly to participate (e.g. proof-of-work); make the cost of losing at least as great as the benefit of winning (e.g. make “points” equal money); or do Distributed Trust Stuff (e.g. Rangzen, TrustDavis). (A toy cost model of this attack is sketched just after this list.)
Calibration-fluffing: if the market grades me on calibration, then I can make dumb predictions but still look perfectly calibrated by counterbalancing them with enough sure-thing predictions (e.g. predict “We’ll have AGI by Tuesday, 90%”, then balance that out with nine “The sun will rise tomorrow, 90%” predictions). To protect against this, it seems like you’d need some way to distinguish “predictions that matter” from “calibration fluff.” (The second sketch after this list works through the arithmetic.)
Buying status: pay people to make dumb bets against you. The Metaculus equivalent of buying Likes or Amazon reviews. On priors, if Amazon can’t squash this problem, it probably can’t be squashed.
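To make the Sybil worry concrete, here’s a toy cost model in Python (the numbers, and the assumption that points transfer zero-sum between bettors, are mine for illustration and not taken from any real platform). With free accounts and reputation-only points the attack is essentially free; “points equal money” makes the puppets’ losses real, and costly accounts put a price on every fake identity.

```python
# Toy model of the Sybil attack and two of the defenses mentioned above.
# All numbers and the zero-sum-points assumption are illustrative.

def sybil_attack(num_puppets: int, bet_size: float,
                 account_cost: float, points_are_money: bool) -> dict:
    """Attacker's outcome after each puppet loses `bet_size` points to the
    attacker's real account and is then discarded."""
    visible_points = num_puppets * bet_size            # zero-sum transfer to the real account
    creation_cost = (1 + num_puppets) * account_cost   # e.g. proof-of-work or signup fee
    if points_are_money:
        # Puppet losses are real money, so the transfer nets out to zero.
        net_money = visible_points - num_puppets * bet_size - creation_cost
    else:
        # Reputation-only points: the discarded puppets' losses cost nothing.
        net_money = -creation_cost
    return {"visible_points": visible_points, "net_money": net_money}

# Free accounts, reputation-only points: the attack is essentially free.
print(sybil_attack(100, 10.0, account_cost=0.0, points_are_money=False))
# "Points equal money": the attacker gains nothing in aggregate.
print(sybil_attack(100, 10.0, account_cost=0.0, points_are_money=True))
# Costly accounts: every fake identity now has a price.
print(sybil_attack(100, 10.0, account_cost=5.0, points_are_money=False))
```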
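And here’s the calibration-fluffing arithmetic spelled out, using the made-up predictions from the example above: one dumb 90% prediction padded with nine sure-thing 90% predictions produces a perfect-looking calibration bucket.

```python
# Toy illustration of calibration-fluffing: one dumb 90% prediction hidden
# behind nine sure-thing 90% predictions still yields "perfect" calibration.
from collections import defaultdict

predictions = [(0.9, False)]          # "We'll have AGI by Tuesday, 90%" -- resolves No
predictions += [(0.9, True)] * 9      # nine "The sun will rise tomorrow, 90%" -- resolve Yes

buckets = defaultdict(list)
for p, outcome in predictions:
    buckets[round(p, 1)].append(outcome)

for p, outcomes in sorted(buckets.items()):
    observed = sum(outcomes) / len(outcomes)
    print(f"stated {p:.0%} -> observed {observed:.0%} over {len(outcomes)} predictions")
# Output: stated 90% -> observed 90% over 10 predictions -- "perfectly calibrated",
# even though one of the ten predictions was worthless.
```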
Note that this could be mitigated by other people being able to profit off of obvious epistemic inefficiencies in the prediction markets: if your bots drive the community credence down super far, and other people notice, they might come in and correct part of the issue. This would reduce your advantage relative to other Metaculites.
Gaming the rankings is not really a problem at Metaculus. Metaculus tracks metrics for the player’s prediction, the community prediction, and the Metaculus prediction on the questions someone answers.
The community prediction could be moved with sockpuppets, but the Metaculus prediction can’t. You can judge people by how close their predictive accuracy comes to the Metaculus prediction, or by whether they outperform it.
Metaculus, however, chooses to keep that metric mostly private. It has the problem of not wanting to embarrass users who make a lot of predictions when those predictions are, on average, bad.
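For what it’s worth, one hypothetical way to operationalize “how near someone is to the Metaculus prediction in predictive accuracy” is to compare Brier scores on the same resolved questions. The sketch below uses made-up numbers and is only an illustration of the idea, not Metaculus’s actual (and mostly private) metric.

```python
# Hypothetical sketch: compare a user to the Metaculus prediction on the same
# resolved questions via Brier score. Not Metaculus's actual metric.

def brier(preds, outcomes):
    """Mean squared error between probabilities and 0/1 resolutions (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(preds, outcomes)) / len(preds)

resolutions = [1, 0, 1, 1, 0]                  # how five questions resolved
user        = [0.80, 0.30, 0.90, 0.60, 0.20]   # the user's final predictions
metaculus   = [0.70, 0.20, 0.80, 0.70, 0.10]   # the Metaculus prediction on the same questions

print("user     :", round(brier(user, resolutions), 3))
print("metaculus:", round(brier(metaculus, resolutions), 3))
# The user "outperforms" on this sketch only if their Brier score is lower.
```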
On the “if Amazon can’t squash this problem, it probably can’t be squashed” point: I don’t think Amazon makes a serious effort to battle review fraud, just as YouTube doesn’t make a serious effort at comment quality even though it easily could do something about it.
Amazon also has a harder problem, because its ground truth is less clear.
For scoring systems, rather than betting markets, none of these particular attacks work. This is trivially true for the first and third attacks, since you don’t bet against individuals. And for any proper scoring rule, calibration-fluffing is worse than predicting your true odds on the dumb predictions. (Aligning incentives is still very tricky, but the set of attacks is very different.)
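Here’s a small worked example of that last claim, using the Brier score and an assumed 10% “true” probability for the dumb prediction: each prediction is scored on its own, reporting your true odds minimizes the expected penalty, and the padded sure-thing predictions can’t buy back what the dumb prediction loses.

```python
# Sketch of why calibration-fluffing loses points under a proper scoring rule.

def expected_brier(p_report, p_true):
    """Expected Brier penalty (lower is better) when the event's true probability
    is p_true and you report p_report."""
    return p_true * (p_report - 1) ** 2 + (1 - p_true) * p_report ** 2

p_true = 0.10                           # assume "AGI by Tuesday" is really ~10% likely
print(expected_brier(0.90, p_true))     # report the fluffed 90%: 0.73 expected penalty
print(expected_brier(p_true, p_true))   # report your true odds:  0.09 expected penalty
# Each prediction is scored separately, so padding with sure-thing predictions
# can't recover the penalty above; a proper scoring rule is minimized in
# expectation by reporting your true belief.
```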
I would love to live in this world.