I had written up this summary of my takeaways (after observing this conversation in realtime, plus some related conversations). This is fairly opinionated, rather than a strict summary. Seems maybe better to just list it entirely here:
Epistemic Status: quite rough, I didn’t take very good notes and was summarizing the salient bits after the fact. Apologies for anything I got wrong here, grateful for Ozzie and Vaniver clarifying some things in the comments.
Just spent a weekend at the Internet Intellectual Infrastructure Retreat. One thing I came away with was a slightly better sense of forecasting and prediction markets, and how they might be expected to unfold as institutions.
I initially had a sense that forecasting, and predictions in particular, was sort of “looking at the easy-to-measure/think-about stuff, which isn’t necessarily the stuff that connects to what matters most.”
Tournaments over Prediction Markets
Prediction markets are often illegal or sketchily legal. But prediction tournaments are not, so this is how most forecasting is done.
The Good Judgment Project
Held an open tournament, the winners of which became “Superforecasters”. Those people now… I think basically work as professional forecasters, who rent out their services to companies, NGOs and governments that have a concrete use for knowing how likely a given country is to go to war, or something. (I think they’d been hired sometimes by Open Phil?)
Vague impression that they mostly focus on geopolitics stuff?
High Volume and Metaforecasting
Ozzie described a vision where lots of forecasters are predicting things all the time, which establishes how calibrated they are. This lets you do things like “have one good forecaster with a good track record make lots of predictions, then have another meta-forecaster evaluate a small sample of their predictions to sanity-check that they are actually making good predictions”, which could get you a lot of predictive power for less work than you’d expect.
This seemed interesting, but I still had some sense of “But how do you get all these people making all these predictions? The prediction markets I’ve seen don’t seem to accomplish very interesting things, for reasons Zvi discussed here.” Plus I’d heard that sites like Metaculus end up being more about gaming the operationalization rules than about actually predicting things accurately.
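To make the spot-checking idea above concrete, here is a minimal sketch (illustrative only; the predictions, sample size, and threshold are all made up, not anything Ozzie specified) of scoring a random sample of a forecaster's resolved predictions with a Brier score:

```python
import random

# Hypothetical resolved predictions for one forecaster: (stated probability, outcome).
# In a real system these would come from a forecasting platform's records.
resolved_predictions = [
    (0.9, True), (0.7, True), (0.2, False), (0.6, True),
    (0.3, False), (0.8, True), (0.4, False), (0.1, False),
]

def brier_score(predictions):
    """Mean squared error between stated probabilities and outcomes (lower is better)."""
    return sum((p - float(outcome)) ** 2 for p, outcome in predictions) / len(predictions)

# The meta-forecaster checks only a small random sample, not every prediction.
sample = random.sample(resolved_predictions, k=4)
score = brier_score(sample)

# 0.25 is the score you get from always saying 50%; a calibrated forecaster should beat it.
print(f"Brier score on sampled predictions: {score:.3f}")
print("passes the spot check" if score < 0.25 else "worth a closer look")
```

The point of the sketch is just that the meta-forecaster's work scales with the sample size, not with the number of predictions the original forecaster makes.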
Automation
One thing I hadn’t considered is that machine learning is already something like high-volume forecasting, in very narrow domains (e.g. lots of bots predicting which video you’ll click on next). One of Ozzie’s expectations is that over time, as ML improves, it’ll expand the range of things that bots can predict. So some of the high volume can come from automated forecasters.
Neural nets and the like might also be able to assist in handling the tricky “operationalization bits”, where you take a vague prediction like “will country X go to war against country Y” and turn that into the concrete observations that would count for such a thing. Currently this takes a fair amount of overhead on Metaculus. But maybe at some point this could get partly automated.
(there wasn’t a clear case for how this would happen AFAICT, just ‘i dunno neural net magic might be able to help.’ I don’t expect neural-net magic to help here in the next 10 years but I could see it helping in the next 20 or 30. I’m not sure if it happens much farther in advance than “actual AGI” though)
I [think] part of the claim was that, for both automated forecasting and automated operationalization, it’s worth laying out tools, infrastructure, and/or experiments now that’ll set up our ability to take advantage of them later.
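For concreteness, here is a rough sketch of what an “operationalized” question record might look like (illustrative only; the fields and resolution criteria are made up, not how Metaculus or any other platform actually stores questions):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class OperationalizedQuestion:
    # The vague question a forecaster actually cares about.
    vague_question: str
    # Concrete, checkable conditions that would count as a "yes" resolution.
    resolution_criteria: list[str]
    # Where a resolver would look to check those conditions.
    resolution_source: str
    # When the question resolves "no" if the criteria haven't been met.
    resolve_by: date

# Hypothetical operationalization of the "war between country X and Y" example above.
question = OperationalizedQuestion(
    vague_question="Will country X go to war against country Y?",
    resolution_criteria=[
        "Either government issues a formal declaration of war, or",
        "sustained armed conflict between the two national militaries is widely reported",
    ],
    resolution_source="Reporting from at least two major international news agencies",
    resolve_by=date(2030, 1, 1),
)

print(f"{question.vague_question} -> {len(question.resolution_criteria)} resolution criteria")
```

The hard, currently-manual part is writing the resolution criteria and picking a resolution source; the speculation above is about whether that step can ever be partly automated.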
Sweeping Visions vs Near-Term Practicality, and Overly Narrow Ontologies
An aesthetic disagreement I had with Ozzie was:
My impression is that Ozzie is starting with lots of excitement for forecasting as a whole, and imagining entire ecosystems built out of it. And… I think there’s something important and good about people being deeply excited for things, exploring them thoroughly, and then bringing the best bits of their exploration back to the “rest of the world.”
But when I look at the current forecasting ecosystem, it looks like the best bits of it aren’t built out of sweeping infrastructural changes; they’re built out of small internal teams making tools that work for them, or consulting firms of professionals who hire themselves out (the Good Judgment Project being one, and the How to Measure Anything guy being another).
The problem with large infrastructural ecosystems is a general problem you also find on debate-mapping sites: humans don’t actually think in clean boxes that are easy to fit into database tables. They think in confused thought patterns that often need to meander and explore special cases, and that don’t necessarily fit whatever tool you built for them to think in.
Relatedly: every large company I’ve worked at has built internal tools of some sort, even for domains that seem like they sure ought to be able to be automated and sold at scale. Whenever I’ve seen someone try to purchase enterprise software for managing a product map, it’s either been a mistake, or the enterprise software has required a lot of customization before it fit the idiosyncratic needs of the company.
Google Sheets is really hard to beat as a coordination tool (but a given Google Sheet is hard to scale).
So for the immediate future I’m more excited by hiring forecasters and building internal forecasting teams than ecosystem-type websites.
Those people now… I think basically work as professional forecasters.
Factual correction: I don’t think any of the superforecasters are full-time on forecasting; instead they do it as a contractor gig, mostly due to lack of demand for the services.
Vague impression that they mostly focus on geopolitics stuff?
Yes, the initial tournaments were sponsored by IARPA, which cared about that, and Tetlock’s earlier work in the ’90s and ’00s also focused on expert political forecasting.
I don’t think any of the superforecasters are full-time on forecasting; instead they do it as a contractor gig, mostly due to lack of demand for the services.
Good to know. I’d still count contractors as professionals though.
(there wasn’t a clear case for how this would happen AFAICT, just ‘i dunno neural net magic might be able to help.’ I don’t expect neural-net magic to help here in the next 10 years but I could see it helping in the next 20 or 30. I’m not sure if it happens much farther in advance than “actual AGI” though)
I thought Ozzie’s plan here was closer to “if you have a knowledge graph, you can durably encode a lot of this in ways that transfer between questions”, and you can have lots of things where you rapidly build out a suite of forecasts with quantifiers and pointers. I thought “maybe NLP will help you pick out bad questions” but I think this is more “recognizing common user errors” than it is “understanding what’s going on.”
Nod, I definitely expect I missed some details, and defer to you or Ozzie on a more precise picture.
Yep. I don’t think much NLP is needed for a lot of the interesting work, if things are organized well with knowledge graphs. I haven’t thought much about operationalizing questions using ML, but have been thinking that by focusing on questions that can be scaled (like GDP/population of every country for every year), we could get a lot of useful information without a huge amount of operationalization work.
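To illustrate the scalable-questions point, here is a minimal sketch (illustrative only; the template, data, and resolution source are made up, not a description of any existing system) of instantiating one operationalized template across a small knowledge-graph-like table:

```python
from itertools import product

# Hypothetical slice of a structured data source (a small knowledge-graph-like table).
countries = ["France", "Japan", "Brazil"]
years = [2024, 2025, 2026]
metrics = ["GDP (current US$)", "population"]

# One operationalization decision ("resolve against the published World Bank figure"),
# reused for every generated question.
TEMPLATE = "What will the {metric} of {country} be for {year}, per the World Bank?"

questions = [
    TEMPLATE.format(metric=metric, country=country, year=year)
    for metric, country, year in product(metrics, countries, years)
]

# 2 metrics x 3 countries x 3 years = 18 questions from a single template.
print(len(questions))
print(questions[0])
```

The operationalization work is done once per template rather than once per question, which is where the scaling comes from.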
I think it would probably take a while to figure out the specific cruxes of our disagreements.
On your “aesthetic disagreement”, I’d point out that there are, say, three types of forecasting work with respect to organizations:
1. Organization-specific, organization-unique questions. These are questions such as, “Will this specific initiative be more successful than this other specific initiative?” Each one needs to be custom-made for that organization.
2. Organization-specific, standard questions. These are questions such as, “What is the likelihood that employee X will leave in 3 months?”, where this question can be asked at many organizations and compared as such. A specific instance is unique to an organization, but the more general question is quite generic.
3. Inter-organization questions. These are questions such as, “Will this common tool that everyone uses get hacked by 2020?” Lots of organizations would be interested.
I think right now organizations are starting to do traditional judgmental forecasting for type (1), but there are already several standard tools for type (2). For instance, there are several startups, like https://www.liquidplanner.com/, that help businesses forecast key variables such as engineering timelines, sales, revenue, and HR issues.
Type (3) is the most exciting to me; that’s where PredictIt and Metaculus currently are. Getting the ontology right is difficult, but possible. Wikipedia and Wikidata are two successful (in my mind) examples of community efforts with careful ontologies that are useful to many organizations; I see many future public forecasting efforts in a similar vein. That said, I have a lot of uncertainty, so I would like to see everything tried more.
I could imagine, in the “worst” case, that the necessary team for this could just be hired. You may be able to do some impressive things with just 5 full-time equivalents, which isn’t that expensive in the scheme of things. The existing forecasting systems don’t seem to have that many full-time equivalents to me (almost all forecasters are very part-time).