Most recent large safety projects seem to be focused on language models. So if the evidence pointed towards problem complexity not mattering that much, I would expect the resulting shift in prioritization towards more RL-safety research to outweigh the effect on capability improvements (especially for the small version of the project, which larger actors might not care that much about). I am also sceptical that the capabilities of the safety community are in fact increasing exponentially.
I am also confused by the resources/reputation framing. To me this is much more about making better predictions about when we will get to transformative AI, and how that AI might work, so that we can use the available resources as efficiently as possible by prioritizing the right kind of work and hedging for different scenarios to an appropriate degree. This is particularly true in the scenario where complexity matters a lot (which I find overwhelmingly likely), in which too much focus on very short timelines might be somewhat costly (obviously none of these experiments can remotely rule out short timelines, but I do expect that they could attenuate how much people update on the XLand results).
Still, I do agree that it might make sense to publish any results on this somewhat cautiously.
I’m extremely reluctant to trade off any acceleration of AI for an increase in forecasting ability because:
a) I’m skeptical of how accurately AI can be forecasted
b) I’m skeptical of how much people’s actions will change based on forecasts, especially since people are skeptical of how accurately it can be forecast
The biggest and most reliable updates from technical research will probably come from “We tried X and it was a lot easier than we thought it’d be” (e.g. GPT3, AlphaGo), which involves accelerating capabilities. On the other hand, “we tried X and didn’t make a lot of progress” is less persuasive, as maybe it would have worked with a minor change.
“I am also confused about the resources/reputation framing. To me this is a lot more about making better predictions”
I know you’re focused on improving predictions; I’m just explaining my personal opinion that resources/reputation gains are more likely to be worth the risk of accelerating timelines.
Your point b) seems like it should also make you somewhat sceptical that any of this would accelerate AI capabilities, unless you believe that capabilities-focused actors would change their actions based on forecasts while safety-focused actors wouldn’t. Obviously, this is a matter of degree, and it could be the case that the same amount of action-changing by both kinds of actors still leads to worse outcomes.
I think that if OpenAI unveiled GPT4 and it did not perform noticeably better than GPT3 despite a lot more parameters, that would be a somewhat important update. And it seems like a similar kind of update could be produced by well-conducted research on scaling laws for complexity.
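To make concrete what kind of analysis I have in mind, here is a minimal sketch (the numbers are purely hypothetical placeholders, not measurements from any real experiment): fit a power law to how some performance measure changes with problem complexity, and look at how the extrapolation behaves. The analogue of the GPT4 scenario would be the observed curve flattening out well before the fitted trend predicts.

```python
import numpy as np

# Hypothetical (problem complexity, performance shortfall) pairs -- placeholder
# values for illustration only, not results from any actual experiment.
complexity = np.array([1e2, 1e3, 1e4, 1e5, 1e6])
shortfall = np.array([0.50, 0.31, 0.20, 0.13, 0.08])  # gap to some performance target

# Fit shortfall ~ c * complexity**slope by linear regression in log-log space.
slope, log_c = np.polyfit(np.log(complexity), np.log(shortfall), 1)
print(f"fitted exponent: {slope:.3f}")  # negative slope = shortfall shrinks with complexity

# Extrapolate: if the power law kept holding, what shortfall would we expect
# at a much higher complexity level?
predicted = np.exp(log_c) * 1e9**slope
print(f"predicted shortfall at complexity 1e9: {predicted:.3f}")
```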
“I think that if OpenAI unveiled GPT4 and it did not perform noticeably better than GPT3 despite a lot more parameters, that would be a somewhat important update”
Yeah, that would be an important update, but not as big a one as if it had worked, since there might be a slight tweak that would make it work.