More or less.
Is this good news? Yes.
Is this strong evidence that we don’t need to work hard on AI safety? No.
Are elements of the simple generative-model-finetuning paradigm going to be reused to ensure the safety of superintelligent AI (conditional on things going well)? Maybe, maybe not. I’d say the probability is around 30%. That’s pretty likely in the grand scheme of things! But it’s even more likely that we’ll use new approaches entirely, and the safety guardrails on GPT-4 will be about as technologically relevant to superintelligent AI as the safety guardrails on industrial robots.