Forecasting is a responsibility
Real-world examples of the Parable of Predict-O-Matic show that trust in predictive accuracy has the power to shape world events. Accuracy brings trust, and trust brings power.
It’s therefore a bit surprising that more people don’t publish verifiable forecasts about major world events, along the lines of Philip Tetlock’s Superforecasting. Even if they were guessing randomly, a lucky streak might bring fame and fortune. Publishing your forecasts would be like buying a lottery ticket.
The explanation is that society defends itself against this threat by demanding that forecasters earn the right to predict. Hence society’s obsession with credentials, the struggle over who gets to grant them, and the constant attempts to grab the right to predict by unlawful or deceptive means. By requiring that wannabe prognosticators earn a graduate degree before they’re taken seriously, and by limiting their forecasting privileges to their field of expertise, we hope to cut down on the false positive rate.
No free advertising for lucky charlatans. Forecasting is a privilege, not a right.
However.
Our credentialing system is bloated. Pundits get platforms without a credible track record of success. Feedback loops are inadequate. Ultimately, our system is supposed to rest on democratic accountability.
If citizens aren’t qualified to make forecasts, how are they qualified to choose between the experts making forecasts? If I’m not qualified to forecast which economic policy will be most beneficial, or how we should tackle Coronavirus, why am I qualified to vote at all? The point of debate is to understand the world, and we do this by doing our best to make sound mechanistic arguments that generate meaningful predictions.
If voting is a right, then forecasting is also a right.
However.
There is no clear line between evaluating the opinions of experts, and asserting one’s own authority. If people don’t like the system by which experts are certified, they can invent their own informal process to ratify charlatans.
Experts and charlatans alike advance their careers by learning how to act like Very Serious People. If Very Serious People don’t do something as crass as making predictions, neither will experts nor charlatans, and it will be hard to tell them apart. We end up with a group of Very Serious People, all of whom have different audiences, none of whom will do what it takes to show us that those credentials mean something.
But when election season comes, we vote. Because voting is a responsibility.
I say to the certified experts of the world: voting is our responsibility. Forecasting is yours.
Show us that you understand your field.
Show us that we should listen to you.
Give us ways to compare the wisdom of the experts in your field. If high school teachers, baseball players, and chili cooks at the county fair can stand up to competition and testing, then so can you. Make your forecasting competitions fun, meaningful, and rewarding. It doesn’t have to hurt. It just has to help.
Forecasting is a responsibility.
Thanks for this post!
I’m by no means an expert in my old field (distributed computing) or my new one (technical AI Safety), but you might push me to try forecasting more by default, so that if I’m ever an expert people listen to, I’ll already be used to doing what is my responsibility.
That makes me think: maybe a culture where everyone makes verifiable forecasts would be one that judges experts on how good their forecasts are? Not that I know how to push for such a culture.
I’ve been thinking a lot about this. One thing I came up with was Reddit + prediction market. Instead of winning/losing money based on predictions, you win/lose the ability to influence content on the site. Front-page content would be based on what correct predictors want to see, rather than all votes being equal.
We do have something very loosely like this here. Unlike on Reddit, getting more karma on LW gives your own posts more starting karma, and also ups the amount of karma you’re able to add to or remove from posts. Being able to write well-received posts gives you more of an ability to highlight or bury other people’s posts.
It would be neat if there were a forecasting equivalent to Facebook: one unified identity associated with various prediction markets, perhaps with lots of tools to create specialized forecasts. You could imagine an official “City of New York” forecasting page, on which the government could post questions and citizens could make forecasts about the effects of bills, conditional on their passing or not passing.
Other websites could tie into… let’s call it “Castbook”… and gauge the amount of influence that accounts receive on their site based on their track record on Castbook. Castbook could be a professional-esque website like LinkedIn, so that employers would consider whom to hire based on their track record.
At its highest growth potential, I imagine that this would be sort of like a “light side” version of China’s reputation tracking system. It would be voluntary, and be about making forecasts rather than judging people. But it would allow this tremendous resource of people considering outcomes to be distilled, preserved, and used not only to make decisions, but also as a complement or alternative to our present credentialing system.
I actually didn’t realize that is how karma works here. Thanks. I think the critique would be that, as I understand it, we would be rewarding popularity rather than accuracy with this model.
Yeah, and that’s not necessarily a bad thing. In theory, it rewards people who are strong content producers, rather than critics.
Undoubtedly true.
However. I suspect most here would claim they want accurate content rather than “strong” content.
We should be candid about what we are optimizing for either way.
Well, the key challenge with forecasting is that you have to choose questions that can be empirically verified.
Most of the posts here aren’t forecasts, though some are, and those often include explicit predictions.
When they don’t contain forecasts, sometimes they imply them.
I think a “strong” post implies many plausible forecasts, or gives a reasonable suggestion as to how to make them.
For example, Eliezer has a post, “Making Beliefs Pay Rent (in Anticipated Experiences),” which advises that beliefs should pay rent in terms of concrete predictions. That post doesn’t obviously imply an easy-to-empirically-evaluate prediction itself, except perhaps that readers will be sympathetic to the idea and find it useful.
Do you mean for pushing this culture on LW?
I’m confused about the difference between your proposal and karma. Is it basically that predictions give/remove karma when they’re respectively correct/incorrect, and thus that a high-karma post would probably represent what the best predictors want to see?
I think you’re interpreting me correctly.
Most internet forums are pretty pure democracies. If you can get an account, you can upvote/downvote as much as anyone else and each post you make is graded in a vacuum. I’m proposing an alternative that does not treat all participants equally.
Accurate predictions give you more karma, and that karma means your posts/comments/upvotes/downvotes rank higher and have more influence than those of people who have not built up “accuracy karma.”
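To make this concrete, here is a minimal sketch of how “accuracy karma” could be computed and used to weight votes. The scoring rule, the weighting function, and all names are hypothetical illustrations, not an existing system:

```python
import math

def accuracy_karma(predictions):
    """Sum of log scores relative to a 50/50 guesser.

    `predictions` is a list of (probability_assigned, outcome_occurred)
    pairs. Confident correct forecasts add karma; confident wrong ones
    subtract it. A coin-flipper hovers near zero.
    """
    score = 0.0
    for p, occurred in predictions:
        p = min(max(p, 0.01), 0.99)  # clamp to avoid infinite penalties
        score += math.log(p / 0.5) if occurred else math.log((1 - p) / 0.5)
    return score

def vote_weight(karma):
    """Votes count for more as accuracy karma grows, but only
    logarithmically, so no single forecaster dominates the front page."""
    return 1.0 + math.log1p(max(karma, 0.0))

def rank_posts(posts, karma_by_user):
    """Order posts by the accuracy-weighted sum of their votes.

    `posts` maps post_id -> list of (user_id, +1 or -1) votes;
    `karma_by_user` maps user_id -> accuracy karma, as produced by
    running accuracy_karma over each user's resolved predictions.
    """
    def score(votes):
        return sum(d * vote_weight(karma_by_user.get(u, 0.0)) for u, d in votes)
    return sorted(posts, key=lambda pid: score(posts[pid]), reverse=True)
```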
Amen.
Given how trivial it would be to do this in many fields… one has to wonder why it isn’t being done.
Yep! I’ve been thinking about this quite a bit lately. May write another post later.
It seems to me that we make social progress by imposing a competitive or game-like structure on meandering human activity. This gives it direction and allows judgment of expertise by a uniform set of standards.
This might be a strategy for the “move fast and break things” approach to political activism.
For example, imagine the town of Libertyville is considering whether to take on debt to pay for connecting itself to the sewage system of the nearby metropolis, so that a Walmart can be built.
Some people think this is bad for the local economy; they don’t want to take on debt, and they think the Walmart will destroy local businesses. Others think it’s good for the local economy because of the jobs it will bring.
Currently, everybody will debate this, and then they’ll vote and see what happens. What’s lost, and what would be valuable for the whole community, is a record of who thought what, who changed their minds and how, and who had the greatest insight.
Repeat this over and over again for many other decisions. When we don’t run forecasting tournaments, we waste a tremendous amount of information about who in our communities is most politically engaged and who has good judgment. Forecasting tournaments are an opportunity to preserve this information in a distilled form that we can use to make communities stronger.
If Libertyville wanted to do better, it could organize a tournament in which each side in the debate specifies the most significant verifiable impacts the bill would have, year by year, after it was passed. Forecasters register predictions under both outcomes; each forecaster is then graded only on the branch that actually occurs (a minimal scoring sketch follows the list below). For example, all forecasters might vote on:
If the bill is passed:
a) Construction on a Walmart will begin
b) Construction on a Walmart will be complete
c) Population will increase by at least 2%
d) At least three currently registered businesses will go out of business within two years
e) At least five new businesses in the city limits will open up within two years.
If the bill is not passed:
a) Population will decrease
b) At least three currently registered businesses will go out of business within two years
c) At least five new businesses in the city limits will open up within two years.
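To make “graded only on the branch that actually occurs” concrete, here is a minimal sketch of conditional-forecast scoring using Brier scores. The question labels loosely mirror the list above; all data and names are hypothetical:

```python
def brier(prob, occurred):
    """Squared error between the forecast probability and the 0/1 outcome.
    Lower is better; a permanent 50% forecaster scores 0.25 on everything."""
    return (prob - (1.0 if occurred else 0.0)) ** 2

def grade_forecaster(forecasts, outcomes, bill_passed):
    """Average Brier score over the realized branch only.

    `forecasts` is {"passed": {question: prob}, "failed": {question: prob}};
    `outcomes` maps each question in the realized branch to True/False.
    """
    branch = "passed" if bill_passed else "failed"
    scores = [brier(p, outcomes[q]) for q, p in forecasts[branch].items()]
    return sum(scores) / len(scores)

# Hypothetical example: the bill passed, so only the "passed" branch counts.
forecasts = {
    "passed": {"walmart_built": 0.9, "three_businesses_close": 0.6},
    "failed": {"population_decrease": 0.3, "three_businesses_close": 0.4},
}
outcomes = {"walmart_built": True, "three_businesses_close": False}
print(grade_forecaster(forecasts, outcomes, bill_passed=True))  # 0.185
```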
I imagine that at first, the forecast results would be kind of useless. It would just be blabbing and partisan cheerleading. People might rope their friends into making thoughtless “forecasts” just to up the numbers for their preferred side.
But over time, forecasters would accumulate a track record. The overall community forecast could be weighted by that track record. Perhaps newer people would have something equivalent to “lightweight credit,” and consistently bad forecasters “heavy, poor credit.”
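Continuing the hypothetical, a track-record weighting might look like this: new forecasters start with a small “lightweight” weight, and weight grows or shrinks with demonstrated skill over graded questions. The scheme below, built on average Brier scores, is one illustrative choice among many:

```python
def forecaster_weight(avg_brier, n_graded, prior=0.2, floor=0.05):
    """Weight grows with a record of low Brier scores and shrinks with a
    record of high ones; ungraded newcomers keep the small prior weight."""
    if n_graded == 0:
        return prior                      # "lightweight credit"
    edge = 0.25 - avg_brier               # edge over a permanent 50% guesser
    return max(floor, prior + n_graded * edge)

def community_forecast(entries):
    """Weighted average probability for a single question.

    `entries` is a list of (probability, avg_brier, n_graded) tuples,
    one per forecaster.
    """
    weights = [forecaster_weight(b, n) for _, b, n in entries]
    total = sum(weights)
    return sum(p * w for (p, _, _), w in zip(entries, weights)) / total

# Two strong track records outweigh a bad one and a crowd of newcomers.
entries = [(0.8, 0.10, 40), (0.7, 0.12, 25), (0.2, 0.40, 30)] + [(0.3, 0.0, 0)] * 5
print(round(community_forecast(entries), 2))  # ~0.72
```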
The forecasts wouldn’t displace voting. But they would impose some beneficial structure on the city-wide political conversation, shape it in ways that might be helpful, and influence votes by creating a record of the thoughts and track records of the best forecasters.
I would be absolutely delighted if such a competition existed in my city. It seems like it could be set up amongst a small group of avid forecasters without asking anybody for permission, grow by invitation, and naturally come to be a fixture of local politics.
To facilitate that, I think it would be wonderful if there were a website or app in which forecasting competitions could be created.
I agree with all that.
My question is why isn’t it already being done?
Why don’t the media make pundits and politicians go on record, and challenge or ignore them when they get things wrong? One theory might be that they aren’t interested in informing so much as in getting ratings, and accuracy isn’t a big part of that. Another, more conspiracy-minded theory might be that they are interested in driving a certain narrative.
It’s even more interesting to think about why companies don’t use prediction markets. It seems like a company would be more profitable if it had the best predictors… so why aren’t they doing more to attract them? Robin Hanson has what I think are some great thoughts on this.
Again, I agree with everything you’ve said in this post, but I don’t feel we have a great theory of why people don’t take forecasting more seriously, and until we do, we are unlikely to increase its use.
In reality, many things happen at the same time. It is difficult to predict what X will achieve if you don’t know the outcomes of Y and Z, which will be decided tomorrow. Perhaps the decisions on Y and Z will depend on the outcome of the vote on X.
For example, imagine making a prediction about the future of a startup that breaks some regulations, such as Uber. (Depending on your prediction, citizens will vote whether to invite Uber into your city.) Well, the prediction depends on how strongly the breaking of regulations will be prosecuted and punished. Suppose your political opponents have a strong influence on prosecutors and judges. Now, whatever prediction about the future of Uber you make, they can make the opposite happen. It would cost them some political points, but maybe it is worth doing for them, if it results in hugely discrediting you and eliminating you from the competition.
Now not only can your enemies make your project fail, they can also make you seem like a bad predictor if it does. And if your prediction contains hundreds of “if”s, no one will take it seriously. “My project will increase the GDP of our community by 2% within 5 years, assuming the government will not do something else that decreases the GDP, nor change the way GDP is measured, nor adopt new laws that make parts of my project illegal, nor change the tax code to make my project more expensive, nor … organize a Twitter mob that would attack people participating in my project, nor ….”
This is a really good point. It would be a terrible idea to judge economic policy according to whether GDP went up after it was enacted.
What is the way to judge economic policy?
Sure.
But that’s true of pretty much every prediction market. Unless you’re opposed to prediction markets very broadly, I’m not sure this changes anything I’ve said.
prediction market + future depends on the voters’ actions = moral hazard
I guess this is an argument against prediction markets in general, unless the results are independent of what voters do (or voter activity aimed at changing the outcome is prosecuted as fraud).
We can see this in politics. The Blue Party says that a major Green Party policy will fail. When the Green Party enacts it, the Blues sabotage it, then use its failure as evidence that Blue ideology is correct.
Likewise, partisan pollsters run polls designed to exaggerate support for their preferred candidate, under the theory that predictions of success can cause success.
It also seems to me that any form of prediction market has moral hazard. Prediction markets have been criticized as potential assassination markets, and any question that has anything to do with the behavior of an individual person sets up an incentive for murder.
Hence, the key question is not whether moral hazard exists, but whether the tradeoffs are acceptable.
It’s hard to say if the entire world should be run by prediction markets. But I think we should be experimenting more with forecasting and prediction markets at more local levels.
Can you give an example? What is the difference between “sabotaging” a policy and simply opposing a policy you don’t support?
Sure. Republicans couldn’t afford to repeal the Affordable Care Act (I’m assuming you’re American; let me know if not and I’ll explain further), because it was too popular, even among Republicans.
One of the gears that makes it work is the individual mandate, a tax penalty for not having health insurance. This penalty was meant to incentivize young healthy people to sign up for health insurance, effectively subsidizing the insurance policies of the old and sick. The Republicans repealed this component, which should have the effect of gutting the revenue stream that makes the ACA work, driving up the cost of insurance.
Opposing the policy would look like Republicans repealing the ACA. Sabotaging it looks like keeping it in place, but repealing a tax penalty that effectively funds it, making it untenable in the long run as insurance prices spike. They can blame this on Democrats, arguing that a big government insurance program was bound to fail, while taking credit for the tax cut.
The dynamic I’m describing is meant only to illustrate the difference between political opposition and sabotage, not a serious piece of contemporary political analysis.
To define them more generally, political opposition is about making the consequences of your opposing actions clear.
Political sabotage is hiding the consequences of your opposing actions.
Well, in politics, you’re dealing with an issue where enforcing compliance would be difficult. It’s famously hard to get a politician to give a clear answer to a clear question.
I think that, partly, the idea of forecasting as practiced by Tetlock et al. is just relatively new. It hasn’t seeped into society to the extent that it someday might. Companies probably need to be of a certain size for such tournaments to seem worthwhile.
But I wouldn’t be surprised if some companies do in fact run internal forecasting competitions or prediction markets. The Pentagon tried running a prediction market to identify security threats, but shut it down when it was criticized as a “terrorism futures market.”
My particular interest is not the macro question of “why isn’t this done more,” but the micro question of “what’s the easiest way that an individual or small group could start a forecasting competition?”
One way is to build better prediction markets. If we had better prediction markets that moved a decent chunk of money, those predictions could be used.
Another is to think through the application of forecasting in individual fields. In my post on cancer, for example, I wrote about Prediction-Based Medicine for compassionate use. While it’s likely impossible to convince people to adopt Prediction-Based Medicine for decisions that are currently (allegedly) made via Evidence-Based Medicine, campaigning for its use in compassionate-use cases, where currently no evidence is required at all, might be more promising.
On the one hand, you have people who don’t like that doctors can promise patients whatever they want in cases of compassionate use; on the other hand, you have people who find compassionate use important because it allows patients to get life-saving drugs before they are approved by the FDA. Prediction-Based Medicine looks to me like a great compromise between the two sides.
When it comes to computer programming, one tool I would like to see is a program that asks you, whenever you run your unit tests, to predict whether or not they will fail.
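A minimal sketch of such a tool might be a wrapper script around whatever test command you normally run. Nothing here is an existing tool; the log filename and default command are made up:

```python
import json
import subprocess
import sys
import time

LOG_FILE = "test_predictions.jsonl"  # hypothetical log location

def main():
    # Ask for a prediction before the tests run.
    prob = float(input("Probability that the test suite passes (0-1): "))
    # Run whatever test command was passed on the command line,
    # e.g. `python predict_tests.py pytest -q`; default to pytest.
    result = subprocess.run(sys.argv[1:] or ["pytest"])
    passed = result.returncode == 0
    # Append (prediction, outcome) so calibration can be checked later.
    with open(LOG_FILE, "a") as f:
        f.write(json.dumps({"time": time.time(),
                            "predicted_pass": prob,
                            "passed": passed}) + "\n")
    print(f"You predicted {prob:.0%}; the tests {'passed' if passed else 'failed'}.")

if __name__ == "__main__":
    main()
```

Over time the log doubles as a calibration record: bucket your predictions and compare predicted pass rates to actual ones.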
There are a lot of cases of specific expertise where it’s worth thinking about how to build forecasting-based systems for them.
When it comes to journalism, a new way of doing journalism could be invented that works on the blockchain. It’s possible to raise sizeable amounts of capital in ICOs if you have a well-thought-out idea.
Phil Tetlock tried doing something like this to pundits, unilaterally, in a project called the Alpha Pundit Challenge. I don’t know if it went anywhere, but it’s an exciting and bold idea.
https://www.openphilanthropy.org/giving/grants/university-pennsylvania-philip-tetlock-forecasting#Key_questions_for_follow_up
https://www.openphilanthropy.org/files/Grants/Tetlock/Revolutionizing_the_interviewing_of_alpha-pundits_nov_10_2015.pdf