That makes sense under certain assumptions, but I find those assumptions so foreign that I wasn't thinking in those terms. I find this move strange if you worry about either alignment or misuse. If you hand AGI to a bunch of people, one of them is likely to either screw up and release a misaligned AGI, or deliberately use their AGI to self-improve and either take over or cause mayhem.
Both of these problems seem highly likely to me. That's why responding to concern over AGI by making more AGIs makes no sense to me. I think a singleton in responsible hands is our best chance at survival.
If you think alignment is so easy that nobody will screw it up, or you're confident the offense-defense balance will hold so that many good AGIs safely counter a few misaligned or misused ones, then sure. I just don't think either of those views is very plausible once you've thought the scenarios through.
Cruxes of disagreement on alignment difficulty explains why I think anybody who's sure alignment is super easy is overconfident (as is anyone who's sure it's really, really hard); we just haven't done enough analysis or experimentation yet.
If we solve alignment, do we die anyway? addresses why I think the offense-defense balance is almost guaranteed to shift toward offense with self-improving AGI, so a massively multipolar scenario leaves us doomed to misuse.
My best guess is that people who think open-sourcing AGI is a good idea are either thinking only of weak "AGI" and not the next step to autonomously self-improving AGI, or they've taken an optimistic guess at the offense-defense balance among many human-controlled real AGIs.