Interesting, but what probability do you assign to this chain of events? And likewise, what probability would you assign to the advent of transformative AI (AGI) being prosaic, as in it's achieved by scaling existing architectures with more compute and better hardware?
I'm not sure at all about a specific probability for this exact chain of events. I think the secrecy part is quite likely (90%) to happen once a lab actually gets something human-level, no matter their commitment to openness; I think seeing their model become truly human-level would scare the shit out of them. Patching obvious security holes also seems 90% likely to me; even Yann LeCun would do that. The real uncertainties are whether the lab would try to use the model to solve AI safety, or whether they would think their security patches are enough and push for monetizing the model directly. I'm pretty sure DeepMind and OpenAI would do something like that; I'm unsure about the others.
Regarding the probability of transformative AI being prosaic, I'm thinking 80%. GPT-3 has basically guaranteed that we will explore that particular approach as far as it can go. When I look at all the ways I can think of to make GPT better (training it faster, merging image and video understanding into it, giving it access to true metadata for each example, longer context lengths, etc.), I see just how easy it is to improve.
I am completely unsure about timelines. I have a small project going on where I'll try to get a timeline probability estimate from estimates of the following factors (a toy sketch of how a few of them might combine follows the list):
Cheapness of compute (including next-generation computing possibilities)
Data growth: text, video, images, games, VR interaction
Investment rate (application vs. leading research)
Response of investment rate to increased progress
Response of compute availability to investment
Researcher numbers as a function of increased progress
Different approaches that could lead to AGI (Minecraft-style simulation, joint text comprehension with image and video understanding, generative stuff?)
Level of compute required for AGI
Effect of compute availability on speed of algorithm discovery (architecture search)
Discovery of new model architectures
Discovery of new training algorithms
Discovery of new approaches (like GANs, AlphaZero, etc.)
Switch to secrecy and its impact on speed of progress
Impact of safety concerns on speed
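To give a feel for the kind of calculation I have in mind, here is a minimal Monte Carlo sketch that combines just a few of these factors (compute cheapness, compute required for AGI, and algorithmic progress) into a timeline probability. Every distribution and number in it is a placeholder I made up for illustration, not an output of the actual project.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000  # Monte Carlo samples

# All distributions below are placeholder assumptions, purely for illustration.
# Each sample is one "world": draw the factors, then derive how long until
# AGI-level compute becomes reachable under that world's assumptions.

# Compute-cost halving time in years (hardware progress, next-gen computing).
halving_time = rng.lognormal(mean=np.log(2.5), sigma=0.3, size=N)

# Compute (in FLOP) required for AGI, as a log10 value.
log10_flop_needed = rng.normal(loc=29, scale=2, size=N)

# log10 FLOP affordable by a leading lab today (placeholder).
log10_flop_now = rng.normal(loc=24, scale=0.5, size=N)

# Annual algorithmic progress as an effective-compute multiplier
# (stands in for new architectures, training algorithms, approaches).
algo_gain = rng.lognormal(mean=np.log(1.6), sigma=0.2, size=N)

# Years until the compute gap closes:
# gap (in doublings) / (doublings per year from hardware + algorithms).
gap_doublings = (log10_flop_needed - log10_flop_now) / np.log10(2)
doublings_per_year = 1.0 / halving_time + np.log2(algo_gain)
years_to_agi = gap_doublings / doublings_per_year

for horizon in (10, 20, 30):
    p = np.mean(years_to_agi <= horizon)
    print(f"P(AGI within {horizon} years) ≈ {p:.2f}")
```

The real project would replace these placeholders with separate estimates for each factor above (investment response, researcher numbers, secrecy effects, safety slowdowns, etc.), but the basic shape stays the same: sample the factors, propagate them through, and read off a probability per time horizon.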