I agree I haven’t filled in all the details needed to argue for continuous progress (mostly because I don’t know the exact numbers). But when you get better results by investing more resources to push along a predicted scaling law, any discontinuity comes from a discontinuity in resource investment, which feels quite different from a technological discontinuity (e.g. we can model investment directly and see that a discontinuity is unlikely). This was the case with AlphaGo, for example.
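To make the distinction concrete, here is a toy model (a minimal sketch with made-up power-law constants, not numbers fit to any real system): if performance follows a smooth scaling law in compute, then even a sudden jump in spending produces a performance jump that the law itself would have predicted in advance.

```python
# Toy model: loss follows a smooth power law in compute, L(C) = a * C**(-b).
# The constants below are made up for illustration only.
a, b = 10.0, 0.05

def loss(compute: float) -> float:
    """Predicted loss under the (assumed) scaling law."""
    return a * compute ** (-b)

# A 100x jump in resource investment (e.g. a crash program)...
before, after = 1e22, 1e24  # training compute in FLOPs

# ...produces a jump in performance, but one the scaling law predicted:
print(loss(before))  # ~0.79
print(loss(after))   # ~0.63
# The discontinuity is in the input (compute), not in the technology:
# anyone watching the investment trend could have forecast the output.
```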
Separately, I also predict GPT-3 was not an example of a discontinuity on perplexity, because it did not constitute a discontinuity in resource investment. (There may have been a discontinuity from resource investment in language models earlier, in 2018–19, though I would guess even that wasn’t the case.)
Several of the discontinuities in the AI Impacts investigation were the result of discontinuities in resource investment, IIRC.
I think Ajeya’s report mostly assumes, rather than argues, that there won’t be a discontinuity in resource investment. Maybe I’m forgetting something, but I don’t remember her analyzing the different major actors to see if any of them has shown signs of secretly running a Manhattan project or being open to doing so in the future.
Also, discontinuous progress is systematically easier to come by than both of you in this conversation make it sound. The process is not “choose a particular advancement (GPT-3), identify the unique task or dimension on which it is making progress, and then see whether or not it was a discontinuity on the historical trend for that task/dimension.” There is no one task or dimension that matters; rather, any “strategically significant” dimension matters. Maybe GPT-3 isn’t a discontinuity in perplexity, but is still a discontinuity in reasoning ability or common-sense understanding or wordsmithing or code-writing.
(To be clear, I agree with you that GPT-3 probably isn’t a discontinuity in any strategically significant dimension, for exactly the reasons you give: GPT-3 seems to be just continuing a trend set by the earlier GPTs, including the resource-investment trend.)
Maybe GPT-3 isn’t a discontinuity in perplexity, but is still a discontinuity in reasoning ability or common-sense understanding or wordsmithing or code-writing.
I was disagreeing with this statement in the OP:
GPT-3 was maybe a discontinuity for language models.
I agree that it “could have been” a discontinuity on those other metrics, and my argument doesn’t apply there. I wasn’t claiming it would.
I think Ajeya’s report mostly assumes, rather than argues, that there won’t be a discontinuity in resource investment. Maybe I’m forgetting something, but I don’t remember her analyzing the different major actors to see if any of them has shown signs of secretly running a Manhattan project or being open to doing so in the future.
It doesn’t argue for it explicitly, but if you look at the section and the corresponding appendix, it just seems pretty infeasible for there to be a large discontinuity: a Manhattan project in the US that had been running for the last 5 years and finished tomorrow would cost ~$1T, while current projects cost ~$100M, and covering 4 orders of magnitude at the pace of the AI and Compute trend would be a discontinuity of slightly under 4 years. That wouldn’t count as a large / robust discontinuity according to the AI Impacts methodology, and I think the methodology wouldn’t even pick it up as a “small” discontinuity.
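Sanity-checking that arithmetic (a minimal sketch; the doubling time is the ~3.4-month figure from OpenAI’s AI and Compute post, and the dollar figures are the rough ones above):

```python
import math

# Rough figures from the comment above (order-of-magnitude only).
current_project_cost = 1e8   # ~$100M for today's largest projects
manhattan_cost = 1e12        # ~$1T for a 5-year US Manhattan-style project

# Gap between the two, in orders of magnitude.
gap_oom = math.log10(manhattan_cost / current_project_cost)  # = 4.0

# AI and Compute trend: compute doubling every ~3.4 months,
# i.e. slightly more than one order of magnitude per year.
doublings_per_year = 12 / 3.4
oom_per_year = doublings_per_year * math.log10(2)  # ~1.06 OOM/year

years_of_discontinuity = gap_oom / oom_per_year
print(f"{gap_oom:.1f} OOM gap = {years_of_discontinuity:.1f} years of trend")
# -> 4.0 OOM gap = 3.8 years of trend, i.e. "slightly under 4 years"
```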
Several of the discontinuities in the AI Impacts investigation were the result of discontinuities in resource investment, IIRC.
I didn’t claim otherwise? I’m just claiming you should distinguish between them.
If anything, this would make me update toward discontinuities in AI being less likely, given that I can be relatively confident there won’t be discontinuities in AI investment (at least in the near-ish future).
OK, sure. I think I misread you.