I think if you could demonstrably “solve alignment” for any architecture, you’d have a decent chance of convincing people to build it as fast as possible, in lieu of other avenues they had been pursuing.
Some people. But it would depend on what the prospects were for that type of AGI, because I don’t think you could convince everyone else to stop working on other types of AGI. So it would be a race between the new “more alignable” type and the currently leading types. If the “more alignable” type seemed guaranteed to lose that race, I’m not sure many people would even try building it.