Many research tasks have very long delays until they can be verified. The history of technology is littered with apparently good ideas that turned out to be losers after huge development efforts were poured into them. Supersonic transport, zeppelins, silicon-on-sapphire integrated circuits, pigeon-guided bombs, object-oriented operating systems, hydrogenated vegetable oil, oxidative decoupling for weight loss…
Finding out that these were bad required making them, releasing them to the market, and watching unrecognized problems torpedo them. Sometimes it took decades.
But if the core difficulty in solving alignment is developing some difficult mathematical formalism and figuring out relevant proofs, then I think we won’t suffer from the problems with the technologies above. In other words, I would feel comfortable delegating and overseeing a team of AIs that have been tasked with solving the Riemann hypothesis, and I think this is what a large part of solving alignment might look like.
“May it go from your lips to God’s ears,” as the old Jewish saying goes. Meaning, I hope you’re right. Maybe aligning superintelligence will largely be a matter of human-checkable mathematical proof.
I have 45 years of experience as a software and hardware engineer, which makes me cynical. When one of my designs encounters the real world, it hardly ever goes the way I expect. It usually either needs some rapid finagling to make it work (acceptable) or it needs to be completely abandoned (bad). This is no good for the first decisive try at superalignment; that has to work first time. I hope our proof technology is up to it.