Generally, a WBE-first future seems difficult to pull off, because (I claim) as soon as we understand the brain well enough for WBE, we already understand it well enough to build non-WBE AGI, and someone will probably do that first. But if we could pull it off, it would potentially be very useful for a safe transition to AGI.
One of the dangers in the transition to AGI, besides the first AGIs being catastrophically misaligned, is the first (aligned) AGIs inventing/deploying novel catastrophically misaligned AGIs, absent sufficiently high intelligence to spontaneously set up effective security measures that prevent that. A significant jump in capabilities that doesn't originate from the AGIs themselves doing the work is safer in this respect: things like scaling of models/training that doesn't involve generating novel agent designs or mesa-optimizers. WBEs don't have such a route to capability gains by default, even if they look much better on alignment.