Let me go one further, here: suppose a GPT-N, having been trained on the sum total of humanity’s academic output, could in principle output a correct proof of the Riemann Hypothesis. You give it the prompt: “You’re a genius mathematician, more capable than any alive today. Please give me a complete and detailed proof of the Riemann Hypothesis.” What would happen?
Wouldn’t it be far simpler for GPT-N to output something that looks like a proof of the Riemann Hypothesis rather than the bona fide thing? Maybe it would write it in a style that mimics current academic work far better than GPT-4 does now. Maybe it would seem coherent at first glance. But there are a lot of ways to be wrong and only a few ways to be right. And it’s rewarded the same whether it produces a correct result or one that merely seems correct unless probed extensively. Wouldn’t it be most parsimonious if it just … didn’t spend its precious weights on actually creating new mathematical insights, and instead learned to ape humans (with all their faults and logical dead-ends) even better? After all, that’s the only thing it was trained to do.
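To make that concrete (a minimal sketch, assuming the standard next-token cross-entropy pre-training objective; the notation is mine, not anything from above): the loss only measures how well the model predicts the training corpus, with no term for whether a claim is actually true:

$$\mathcal{L}(\theta) = -\sum_{t} \log p_\theta(x_t \mid x_{<t})$$

Under this objective, a fake-but-plausible proof and a genuine one that are equally likely continuations of the prompt incur the same loss, so nothing in the training signal itself pushes the model toward the genuine article.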
So yes, I expect that with the current training regime GPT won’t discover novel mathematical proofs or correct ways to build nano-tech if those things would take super-human ability. It could roleplay those tasks for you, but it doesn’t really have a reason to make those things true, only to make them sound true. Which isn’t to say it couldn’t cause widespread damage doing things humans already can and do (like cyber-terrorism or financial manipulation). It just probably won’t be pie-in-the-sky things like novel nano-tech.