There’s an interesting appendix listing desiderata for good AI forecasting, which includes the catchy phrase “epistemically temporally fractal” (which I feel compelled to find a place to use in my life). The first three points are reminiscent of Zvi’s recent post.
Appendix A: Forecasting Desiderata
1. We want our forecasting targets to be indicators for relevant achievements. This includes targets that serve as (leading) indicators for important economic capabilities, such as a capability that would pose a substantial employment threat to a large group of people. It includes indicators for important security capabilities, such as in potent AI cyber-offense, surveillance and imagery intelligence, or lie detection. It includes indicators for important technical achievements, such as those that are thought to be crucial steps on the path to more transformative capabilities (e.g. AGI), those that are central to many problem areas, or those that would otherwise substantially accelerate AI progress. [115]
2. We want them to be accurate indicators, as opposed to noisy indicators that are not highly correlated with the important events. Specifically, where E is the occurrence or near occurrence of some important event, and Y is whether the target has been reached, we want P(not Y | not E) ≈ 1 and P(Y | E) ≈ 1. An indicator may fail to be informative if it can be “gamed” in that there are ways of achieving the indicator without the important event being near. It may be a noisy indicator if it depends on otherwise irrelevant factors, such as whether a target happens to take on symbolic importance as the focus of research.
3. We want them to be well specified: they are unambiguous and publicly observable, so that it will not be controversial to evaluate whether E has taken place. These could be either targets based on a commonly agreed objective metric such as an authoritative measure of performance, or a subjective target likely to involve agreement across judges. Judges will not ask later: “what did you mean”? [116]
4. We want them to be somewhat near-term probable: we should not be too confident in the near-term about whether they will occur. If they all have tiny probabilities (<1%) then we will not learn much after not seeing any of them resolve. [117] The closer the probability of a forecasting event and a set of predictions is to 50%, over a given time frame, the more we will learn about forecasting ability, and the world, over that time frame.
5. We ideally want them to be epistemically temporally fractal: we want them to be such that good forecasting performance on near-term forecasts is informative of good forecasting performance on long-term predictions. Near-term forecasting targets are more likely to have this property as they depend on causal processes that are likely to continue to be relevant over the long-term.
6. We want them to be jointly maximally informative. This means that we ideally want a set of targets that score well on the above criteria. A way in which this could not be so is if some targets are highly statistically dependent on others, such as if some are logically entailed by others. Another heuristic here is to aim for forecasting targets that exhaustively cover the different causal pathways to relevant achievements.
---
[115] The Good Judgment Project sometimes refers to an indicator that is relevant as one that is diagnostic of a bigger issue that we care about.
[116] Tetlock has called this the “Clairvoyance Test”: if you asked a clairvoyant about your forecasting question, would they be able to answer you or would they require clarification on what you meant. See Tetlock, Philip E., and Dan Gardner. Superforecasting: The art and science of prediction. Random House, 2016, and https://www.edge.org/conversation/philip_tetlock-how-to-win-at-forecasting
[117] Though we should learn a lot from seeing one such unexpected event occur. Thus, such a “long-shot” target would be a worthwhile forecasting target to a person who assigns intermediate subjective probability of it occurring, even if everyone else in the community is confident it will (not) occur.
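To make desiderata 2 and 4 a bit more concrete, here is a minimal Python sketch. It is not from the appendix: the function names and the toy records are hypothetical, and using binary entropy as a stand-in for "how much we learn" from a yes/no question is an assumption about how to read point 4, not the authors' stated method.

```python
import math


def binary_entropy(p: float) -> float:
    """Expected information, in bits, from resolving a yes/no question
    whose probability of resolving 'yes' is p (illustrative reading of point 4)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome teaches us nothing
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def indicator_accuracy(records: list[tuple[bool, bool]]) -> tuple[float, float]:
    """Estimate P(Y | E) and P(not Y | not E) from (E, Y) pairs, where
    E = the important event occurred and Y = the forecasting target was reached.
    The records are hypothetical; point 2 asks for both values to be near 1."""
    e_total = sum(1 for e, _ in records if e)
    not_e_total = sum(1 for e, _ in records if not e)
    y_and_e = sum(1 for e, y in records if e and y)
    not_y_and_not_e = sum(1 for e, y in records if not e and not y)
    p_y_given_e = y_and_e / e_total if e_total else float("nan")
    p_not_y_given_not_e = (
        not_y_and_not_e / not_e_total if not_e_total else float("nan")
    )
    return p_y_given_e, p_not_y_given_not_e


if __name__ == "__main__":
    # A 50% question carries a full bit; a 1% long shot carries about 0.08 bits.
    for p in (0.5, 0.1, 0.01):
        print(f"p={p}: {binary_entropy(p):.3f} bits")

    # Toy data: the target was reached in only one of the two cases where E occurred.
    toy = [(True, True), (True, False), (False, False), (False, False)]
    print(indicator_accuracy(toy))
```

The entropy curve peaks at 50% and falls off quickly toward 0% or 100%, which matches the appendix's claim that a set of sub-1% targets teaches little when none of them resolve. Footnote 117's caveat still applies: a long shot that does resolve is very informative, even though the expected information is small.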