Agreed. I’m curious how to best do this.
One thing that I’m excited about is using future AIs to judge current ones. So we could have a system that does:
1. An AI today (or a human) outputs a recommended strategy.
2. In 10 years, we agree to have the most highly trusted AI evaluator score how strong this strategy was, on some numeric scale. Alternatively, we could wait until we have a “sufficient” AI, meaning there might be some set point at which we’d trust AIs to do this evaluation. (I discussed this more here)
3. Going back to ~today, we have forecasting systems predict how well the strategy from (1) will score on the evaluation in (2).
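As a rough illustration, the three steps above could be sketched as the following workflow. This is a minimal toy sketch, not a real system: every name here (`Strategy`, `propose_strategy`, `future_evaluation`, `forecast_score`) is a hypothetical placeholder, and the numeric values are stand-ins for outputs the real components would produce.

```python
from dataclasses import dataclass

@dataclass
class Strategy:
    description: str

def propose_strategy() -> Strategy:
    # Step 1: an AI today (or a human) outputs a recommended strategy.
    return Strategy(description="example strategy")

def future_evaluation(strategy: Strategy) -> float:
    # Step 2: in ~10 years, the most trusted AI evaluator scores the
    # strategy on a numeric scale (stubbed here with a fixed value).
    return 7.5  # e.g. 7.5 on a 0-10 scale

def forecast_score(strategy: Strategy) -> float:
    # Step 3: today, a forecasting system predicts the score that the
    # future evaluator (step 2) will assign to the strategy (step 1).
    return 7.0  # the forecaster's current best estimate

strategy = propose_strategy()
predicted = forecast_score(strategy)

# Once the future evaluation actually arrives, the forecaster can be
# graded on its error, e.g. with a simple squared-error loss:
actual = future_evaluation(strategy)
squared_error = (predicted - actual) ** 2
```

The key design point is that the forecasting system in step 3 is ultimately scored against the future evaluator's judgment, which gives today's forecasters an incentive to approximate what a more trusted future evaluator would conclude.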