Another idea: “AI for epistemics”, e.g. having a few FTEs working on making Claude a better forecaster. It would be awesome if you could advertise “SOTA by a significant margin at making real-world predictions; beats all other AIs in prediction markets, forecasting tournaments, etc.”
And it might not be that hard to achieve (a few FTEs, maybe). There are already datasets of resolved forecasting questions, plus you could probably synthetically generate datasets that are OOMs bigger. Then you could modify pretraining so that you train on the data chronologically, and before training on data from year X you have the model forecast events in year X (see the sketch below)....
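A minimal sketch of that ordering, assuming pretraining data can be sharded by year and paired with forecasting questions that resolved in that year. The `YearShard` class and the `model.predict_probability` / `model.train` interface are hypothetical scaffolding, not an actual training pipeline; the point is just the order of operations.

```python
# Sketch only: YearShard, predict_probability, and train are hypothetical names.
# The key idea: score forecasts for year X *before* training on any year-X data.
from dataclasses import dataclass, field

@dataclass
class Question:
    text: str
    outcome: float  # 1.0 if it resolved yes, 0.0 if it resolved no

@dataclass
class YearShard:
    year: int
    documents: list = field(default_factory=list)  # pretraining text from this year
    questions: list = field(default_factory=list)  # questions that resolved this year

def brier_score(probs, outcomes):
    """Mean squared error between stated probabilities and 0/1 outcomes (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / max(len(probs), 1)

def chronological_pretrain(model, shards):
    for shard in sorted(shards, key=lambda s: s.year):
        # 1. Forecast year X's events while the model has only seen years < X.
        probs = [model.predict_probability(q.text) for q in shard.questions]
        score = brier_score(probs, [q.outcome for q in shard.questions])
        print(f"{shard.year}: Brier score on not-yet-seen year = {score:.3f}")
        # 2. Only then do an ordinary pretraining pass over year X's documents.
        model.train(shard.documents)
```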
Or even if you don’t do that fancy stuff, there is probably low-hanging fruit to pick that would make AIs better forecasters.
Ditto for truthful AI more generally. You could train Claude to be well-calibrated, consistent, and obsessive about technical correctness/accuracy (at least when so prompted)...
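For the “well-calibrated” part specifically, one standard check (a plain reliability-diagram computation, not any particular lab’s eval) is to bucket the model’s stated probabilities and compare each bucket’s average confidence against how often those claims actually resolved true:

```python
from collections import defaultdict

def calibration_table(predictions, n_bins=10):
    """predictions: (stated_probability, outcome) pairs, with outcome in {0, 1}.
    A well-calibrated forecaster has avg_confidence ~= actual_frequency in every bin."""
    bins = defaultdict(list)
    for prob, outcome in predictions:
        bins[min(int(prob * n_bins), n_bins - 1)].append((prob, outcome))
    rows = []
    for b in sorted(bins):
        pairs = bins[b]
        avg_conf = sum(p for p, _ in pairs) / len(pairs)
        freq = sum(o for _, o in pairs) / len(pairs)
        rows.append({"avg_confidence": avg_conf, "actual_frequency": freq, "count": len(pairs)})
    return rows
```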
You could also train it to be good at taking people’s offhand remarks and tweets and suggesting bets or forecasts with resolvable conditions.
You could also, e.g., poll all your employees quarterly on AGI timelines and related questions, and publish the results.
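A hedged sketch of what publishing such a poll might look like, assuming each response is something like an employee’s median AGI-arrival year (the question wording, field names, and example numbers below are purely illustrative): report percentiles rather than a single number, so disagreement stays visible.

```python
import statistics

def summarize_poll(responses):
    """responses: one forecast year per employee, e.g. their median AGI-arrival year."""
    q1, median, q3 = statistics.quantiles(responses, n=4)  # quartile cut points
    return {
        "n_respondents": len(responses),
        "25th_percentile_year": q1,
        "median_year": median,
        "75th_percentile_year": q3,
    }

# Example with made-up responses:
print(summarize_poll([2027, 2029, 2030, 2032, 2035, 2040, 2045]))
```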
My current tentative guess is that this is somewhat worse than other alignment science projects I’d recommend at the margin, but somewhat better than the 25th-percentile project currently being done. I’d think it was good at the margin (of my recommendation budget) if the project could be done in a way where we’d learn generalizable scalable oversight / control approaches.