Experienced quant trader, based in Dublin. Formerly a volunteer at Rethink Priorities, where I did some forecasting research. Interested in longtermism and meta causes. More active on the EA Forum https://forum.effectivealtruism.org/users/charles_dillon
CharlesD
How does forecast quantity impact forecast quality on Metaculus?
The article convincingly makes the weaker claim that there’s no guarantee of a fire alarm, and provides several cases which support this. I don’t buy the claim (which the article also tries to make) that there is no possible fire alarm, and such a claim seems impossible to prove anyway.
Whether it’s smoke or a fire alarm, that doesn’t really address the specific question I’m asking, in any case.
Great point—I’m not sure whether that incident contained aspects similar enough to AI to resolve such a question. This source doesn’t think it counts as AI (though it doesn’t provide much of an argument for this), and I can’t find any reference to machine learning or AI on the MCAS page. Clearly one could use AI tools to develop an automated control system like this, though, and I don’t feel well positioned to judge whether it should count.
[Question] What could small scale disasters from AI look like?
An examination of Metaculus’ resolved AI predictions and their implications for AI timelines
Data on forecasting accuracy across different time horizons and levels of forecaster experience
Thanks for a great post! I have a concern about your sample sizes, however.
I am looking into similar questions myself, and while reading your post I was surprised to see your Metaculus sample described as 45k predictions. These are not actually individual predictions, but rather the time series of community predictions, which is much less information dense: each point is just the median of the recent community predictions at that time, and a new individual prediction will typically have only a small effect on this value. Claiming a sample size of 45k is therefore somewhat misleading.
It also has the effect of linearly weighting Metaculus questions by community interest, which is not obviously a desirable method (this is mitigated by the cap of 101 on the time series length, which means the effect will ultimately be small: with 557 Metaculus questions, the average question must contribute more than 80 points to the series).
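To illustrate why the snapshot series is less informative than the raw forecasts, here is a minimal sketch. It is not Metaculus's exact aggregation (the real community prediction uses a recency-weighted median over all forecasters); it just takes a rolling median over a hypothetical window of recent predictions, which is enough to show that each new individual forecast barely moves the snapshot:

```python
import statistics

def community_snapshots(individual_preds, window=10):
    """Median of the most recent `window` predictions after each new one.

    Simplified stand-in for Metaculus's community prediction: the point
    is that each snapshot is an aggregate, so consecutive snapshots are
    highly correlated and carry far less information than the raw
    individual predictions.
    """
    snapshots = []
    for i in range(1, len(individual_preds) + 1):
        recent = individual_preds[max(0, i - window):i]
        snapshots.append(statistics.median(recent))
    return snapshots

# Hypothetical forecasts on a binary question, including one outlier.
preds = [0.60, 0.62, 0.61, 0.63, 0.60, 0.95, 0.62, 0.61]
snaps = community_snapshots(preds)
# Even the 0.95 outlier barely moves the median snapshots, so treating
# each snapshot as an independent "prediction" overstates the sample.
```

Counting each of these eight snapshots as a separate prediction would be like counting the same handful of forecasts eight times over.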
That seems like a different question, one partially entangled with AI but not necessarily so: more screen time need not be caused by AI, and the harms are harder to evaluate (even the sign of the value of “more screen time” is probably disputed).