These are all fair points. I originally thought this discussion was about the likelihood of poor near-term RL generalization when varying horizon length (i.e., affecting timelines) rather than what type of human-level RL agent will FOOM (i.e., takeoff speeds). Rereading the original post, I see I was mistaken, and I see how my phrasing left that ambiguous. If we’re at the point where the agent is capable of using forecasting techniques to synthesize historical events described in internet text into probabilities, then we’re well past the point where I think “horizon-length” might really matter for RL scaling laws. In general, you can find and replace my mentions of “tail risk” with “events with too low a frequency in the training distribution to make the agent well-calibrated.”
I think it’s worth noting that some important agenty decisions are like this! Military history is full of people who studied everything that came before them but faced such a high-dimensional and adversarial context that generals still get it wrong in new ways every time.
To address your actual comment, I definitely don’t think humans are good at tail risks. (E.g., there are very few people with successful track records across multiple paradigm shifts.) I would expect a reasonably good AGI to do better, for the reasons you describe. That said, I do think that FOOM involves taking on more weird unknown unknowns than average. (There aren’t great reference classes for inner-aligning your successor given that humans failed to align you.) Maybe not that many! Maybe there is a robust, characterizable path to FOOM where everything you need to encounter has a well-documented reference class. I’m not sure.