I experimented with exactly the same thing using GPT-4. I only tried the top 20 questions, and the results were more or less equal to the community’s.
But when it came to numeric estimates like “how many people will die due to COVID”, it outperformed humans and gave highly accurate predictions (I was asking not for point values but for ranges with quartiles, and the human results had much higher dispersion than GPT’s).
Also, I only compared against community predictions as of January 2022, since GPT-4’s training data ends in September 2021, and it’s unfair to compare predictions if people had a lot of additional evidence that GPT doesn’t have.
If there’s interest, I can share the methodology, results, and dataset.