Understanding the scaling laws would improve our ability to make estimates, but it would also provide information useful for increasing capabilities by making it clearer how many resources should be invested in scaling versus how much effort should be invested elsewhere. Obviously, someone is going to try this soon, but given that the capabilities of the safety community seem to be increasing exponentially, small bits of extra time seem to actually matter.
(If OpenAI turns out to be net-positive, I suspect that it’ll largely be because it managed to capture the resources/reputation that were always going to accrue to whoever was the first party to explore scaling and use them to hire a safety team/increase the status of AI safety. If this project is worth pursuing, I suspect it will have been largely because it would also allow you to capture a significant amount of resources/reputation).
Most recent large safety projects seem to be focused on language models. So if the evidence pointed towards problem complexity not mattering that much, I would expect the shift in prioritization towards more RL-safety research to outweigh the effect on capability improvements (especially for the small version of the project, which larger actors might not care that much about). I am also sceptical that the capabilities of the safety community are in fact increasing exponentially.
I am also confused about the resources/reputation framing. To me this is a lot more about making better predictions about when we will get to transformative AI, and how this AI might work, such that we can use the available resources as efficiently as possible by prioritizing the right kind of work and hedging for different scenarios to an appropriate degree. This is particularly true for the scenario where complexity matters a lot (which I find overwhelmingly likely), in which too much focus on very short timelines might be somewhat costly (obviously none of these experiments can remotely rule out short timelines, but I do expect that they could attenuate how much people update on the XLand results).
Still, I do agree that it might make sense to publish any results on this somewhat cautiously.
I’m extremely reluctant to trade off any acceleration of AI for an increase in forecasting ability because:
a) I’m skeptical of how accurately AI can be forecasted
b) I’m skeptical of how much people’s actions will change based on forecasts, especially since people are skeptical of how accurately it can be forecast
The biggest and most reliable updates from technical research will probably come from “We tried X and it was a lot easier than we thought it’d be” (e.g. GPT3, AlphaGo), which involves accelerating capabilities. On the other hand, “We tried X and didn’t make a lot of progress” is less persuasive, as maybe it would have worked with a minor change.
“I am also confused about the resources/reputation framing. To me this is a lot more about making better predictions”
I know you’re focused on improving predictions. I’m just explaining my personal opinion that resources/reputation are more likely to be worth the risk of accelerating timelines.
Your point b) seems like it should also make you somewhat sceptical that any of this would accelerate AI capabilities, unless you believe that capabilities-focused actors would change their actions based on forecasts while safety-focused actors wouldn’t. Obviously, this is a matter of degree, and it could be the case that the same amount of action-changing by both actors still leads to worse outcomes.
I think that if OpenAI unveiled GPT4 and it did not perform noticeably better than GPT3 despite a lot more parameters, that would be a somewhat important update. And it seems like a similar kind of update could be produced by well-conducted research on scaling laws for complexity.
“I think that if OpenAI unveiled GPT4 and it did not perform noticeably better than GPT3 despite a lot more parameters, that would be a somewhat important update”
Yeah, that would be an important update. But not as much of an update as if it had worked, since there might be a slight tweak that would make it work.
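To make the kind of scaling-law check discussed above a bit more concrete, here is a minimal sketch of what such an analysis might look like. This is purely my own illustration, not anything from the actual project: the parameter counts and losses are made up, and the power-law form is just the standard assumption from the scaling-laws literature.

```python
# Rough illustration with hypothetical numbers, not real measurements:
# fit loss(N) = a * N**(-b) + c to observed losses at several model sizes,
# then extrapolate to a larger size and compare against what is actually measured.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n_params, a, b, c):
    return a * n_params ** (-b) + c

# Hypothetical parameter counts and measured losses.
n = np.array([1e8, 3e8, 1e9, 3e9, 1e10])
loss = np.array([3.2, 2.9, 2.6, 2.4, 2.2])

popt, _ = curve_fit(power_law, n, loss, p0=[30.0, 0.1, 1.0], maxfev=20000)
extrapolated = power_law(1e11, *popt)
print(f"Extrapolated loss at 1e11 parameters: {extrapolated:.2f}")

# If the loss actually measured at 1e11 parameters sat well above this
# extrapolation, that would be the "didn't perform noticeably better"
# update discussed above; a point near the curve would suggest scaling
# is continuing roughly as predicted.
```

The point of the sketch is just that a scaling-law experiment produces an explicit extrapolated curve, so a larger run either lands on it or visibly deviates from it, which is what makes the resulting update relatively clean either way.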