Two months ago I said I’d be creating a list of predictions about the future in honor of my baby daughter Artemis. Well, I’ve done it, in spreadsheet form. The prediction questions all have a theme: “Cyberpunk.” I intend to make it a fun New Year’s activity to go through and make my guesses, and then every five years, on her birthday, I’ll dig up the spreadsheet and compare prediction to reality.
I hereby invite anybody who is interested to go in and add their own predictions to the spreadsheet. Also feel free to leave comments asking for clarifications and proposing new questions or resolution conditions.
I’m thinking about making a version in Foretold.io, since that’s where people who are excited about making predictions live. But the spreadsheet I have is fine as far as I’m concerned. Let me know if you have an opinion one way or another.
(Thanks to Kavana Ramaswamy and Ramana Kumar for helping out!)
Sept 22nd 2025 has passed now, which is the date that the first column of probabilities was referring to.
I was curious how they turned out, so I asked a Claude (don’t remember which one) to judge whether the events had happened or not, and then got GPT-5.1-thinking to check whether it agreed with Claude’s judgments. (Disagreements between Claude and GPT-5.1 were lazily adjudicated by me.) Here’s the link to the GPT-5.1 convo if you’re interested. (Results at the bottom.) There might well be major errors in the LLMs’ judgments and my adjudications.
Thank you! I continued your ChatGPT conversation and got the following tidbits of extra info:
For my 2020 predictions, I asked it to summarize my biggest sources of Brier loss:
Top 5 Brier losses from FALSE outcomes (overconfidence)

| Rank | Claim | p (2020) | Loss |
|---|---|---|---|
| 1 | 100+ drone swarms operationally fielded | 0.50 | 0.250 |
| 2 | 10%+ news articles AI-generated | 0.20 | 0.040 |
| 3 | $1 deepfake costs $100+ to detect | 0.35 | 0.123 |
| 4 | Orbital rocket reused 100 times | 0.30 | 0.090 |
| 5 | 90%+ Rio Tinto mines automated | 0.30 | 0.090 |
Top 5 Brier losses from TRUE outcomes (pessimism)

| Rank | Claim | p (2020) | Loss |
|---|---|---|---|
| 1 | AI wins International Math Olympiad | 0.15 | 0.723 |
| 2 | Neural net writes B-average college essays | 0.20 | 0.640 |
| 3 | Crypto restaurants near Berkeley | 0.20 | 0.640 |
| 4 | Combat drones <50 lbs fielded | 0.50 | 0.250 |
| 5 | AI beats top StarCraft players | 0.70 | 0.090 |
The IMO prediction alone is your single largest loss across all 2020 items.
Interestingly, the two biggest hits to Brier score I took (on my Jan 1, 2020 predictions) were basically from underestimating LLMs.
Also interestingly, it seems like maybe I took significantly more Brier loss from saying things wouldn’t happen that did happen than from saying things would happen that didn’t? In other words, I was too cautious overall rather than too credulous?
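A minimal sketch of the arithmetic behind that impression, assuming the usual per-question Brier loss of (p − outcome)², with outcome 1 if the event happened and 0 if it did not. The probabilities are copied from the two top-5 tables above; the names `brier_loss`, `resolved_false`, and `resolved_true` are just illustrative. This covers only the ten listed items, not the whole spreadsheet, and it takes the LLM-judged resolutions at face value.

```python
def brier_loss(p: float, happened: bool) -> float:
    """Squared error between the stated probability and the 0/1 outcome."""
    outcome = 1.0 if happened else 0.0
    return (p - outcome) ** 2


# (claim, probability assigned in 2020), as listed in the tables above.
# Resolutions are the LLM judgments quoted in the thread, which may be wrong.
resolved_false = [
    ("100+ drone swarms operationally fielded", 0.50),
    ("10%+ news articles AI-generated", 0.20),
    ("$1 deepfake costs $100+ to detect", 0.35),
    ("Orbital rocket reused 100 times", 0.30),
    ("90%+ Rio Tinto mines automated", 0.30),
]
resolved_true = [
    ("AI wins International Math Olympiad", 0.15),
    ("Neural net writes B-average college essays", 0.20),
    ("Crypto restaurants near Berkeley", 0.20),
    ("Combat drones <50 lbs fielded", 0.50),
    ("AI beats top StarCraft players", 0.70),
]

# Total loss on each side: "said it might happen, it didn't" vs.
# "said it probably wouldn't, it did".
loss_false = sum(brier_loss(p, False) for _, p in resolved_false)
loss_true = sum(brier_loss(p, True) for _, p in resolved_true)

print(f"loss on listed FALSE outcomes: {loss_false:.4f}")  # 0.5925
print(f"loss on listed TRUE outcomes:  {loss_true:.4f}")   # 2.3425
```

On these ten items alone, the loss from events that did happen is roughly four times the loss from events that did not, which is consistent with the "too cautious" reading, subject to the caveats about the resolutions.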
ChatGPT goes on to say that I outperformed Rick (who also made predictions in the spreadsheet in 2020). However, looking over the data briefly, I’m not sure I agree with some of the scores; e.g., are there really robotaxis in 20+ cities now? And drone delivery?
Yep, resolutions not very reliable.
The drone delivery one was Claude claiming:
“Kiwibot has operated delivery robots in Berkeley since 2017, founded in UC Berkeley’s Skydeck incubator. Delivers food within approximately one mile of campus with over 250,000 total deliveries completed.”
Googling quickly, there are claims that it has since shut down and also that it was remote-controlled rather than fully autonomous. In any case, it’d be pretty niche and clearly only available due to the novelty value.
Robotaxis in 20+ cities was something Claude initially thought false, and then GPT-5.1 thought it was “borderline true” based on a bunch of Baidu deployments. E.g. source. No idea whether that holds up; I don’t know the robotaxi situation in China. (Also, that news is from slightly after September 22.)
I also think the StarCraft one is probably wrong. Looking now, the models seem to be mainly leaning on 2019 cites, which I think weren’t sufficient to show AI consistently beating humans.
Hi Daniel!
We (Foretold) have recently been experimenting with “notebooks”, which help structure tables for things like this.
I think a notebook/table setup for your spreadsheet could be a decent fit. These take a bit of time to set up now (because we need to generate each cell using a separate tool), but we could help with that if this looks interesting to you.
Here are some examples:
https://www.foretold.io/c/0104d8e8-07e4-464b-8b32-74ef22b49f21/n/6532621b-c16b-46f2-993f-f72009d16c6b
https://www.foretold.io/c/47ff5c49-9c20-4f3d-bd57-1897c35cd42d/n/2216ee6e-ea42-4c74-9b11-1bde30c7dd02
https://www.foretold.io/c/1bea107b-6a7f-4f39-a599-0a2d285ae101/n/5ceba5ae-60fc-4bd3-93aa-eeb333a15464
You can click on cells to add predictions to them.
Foretold is more experimental than Metaculus and doesn’t have as large a community. But it could be a decent fit for this (and this should get better in the next 1-3 months, as notebooks get improved).
OK, thanks Ozzie; on your recommendation I’ll try to make this work. I’ll see how it works, see if I can do it myself, and reach out to you if it seems hard.
Sure thing. We don’t have documentation for how to do this yet, but you can get an idea from seeing the “Markdown” of some of those examples.
The steps to do this:
1. Make a bunch of measurables.
2. Get the IDs of all of those measurables (you can see these in the Details tabs on the bottom).
3. Create the right notebook/table, and add all the correct IDs to the right places within it.
OK, some questions:
1. By measurables you mean questions, right? Using the “New question” button? Is there a way for me to have a single question of the form “X is true” and then have four columns, one for each year (2025, 2030, 2035, 2040), where people can put in four credences for whether X will be true at each of those years?
2. I created a notebook/table with what I think are correctly formatted columns. Before I can add a “data” section to it, I need IDs, and for those I need to have made questions, right?
1. Yes, sorry. Yep, you need to use the “New question” button. If you want separate things for 4 different years, you need to make 4 different questions. Note that you can edit the names & descriptions in the notebook view, so you can make them initially with simple names, then later add the true names to be more organized.
2. You are correct. In the “details” sections of questions, you can see their IDs. These are the items to use.
You can of course edit notebooks after making them, so you may want to first make it without the IDs, then once you make the questions, add the IDs in, if you’d prefer.
Well, many of them live on Metaculus.
Right, no offense intended, haha! (I already made a post about this on Metaculus. Don’t worry, I didn’t forget them, except in this post here!)