That makes sense under certain assumptions, though I find those assumptions foreign enough that I wasn't thinking in those terms. This move seems strange to me if you worry about either alignment or misuse. If you hand AGI to a bunch of people, sooner or later one of them is likely to either screw up and release a misaligned AGI, or deliberately use their AGI to self-improve and then take over or cause mayhem.
Both of these failure modes seem highly likely to me, which is why responding to concern over AGI by making more AGIs makes no sense to me. I think a singleton in responsible hands is our best chance at survival.
If you think alignment is so easy that nobody will screw it up, or you're confident the offense-defense balance will hold so that many good AGIs can safely counter a few misaligned or misused ones, then sure. I just don't think either of those views is very plausible once you've thought the scenarios through.
Cruxes of disagreement on alignment difficulty explains why I think anybody who's sure alignment is easy is overconfident (as is anyone who's sure it's really, really hard): we just haven't done enough analysis or experimentation yet.
If we solve alignment, do we die anyway? addresses why I think the offense-defense balance is almost guaranteed to shift toward offense with self-improving AGI, so a massively multipolar scenario would doom us to misuse.
My best guess is that people who think open-sourcing AGI is a good idea are either thinking only of weak "AGI", not the next step to autonomously self-improving AGI, or have made an optimistic guess about the offense-defense balance among many human-controlled real AGIs.
There may also be a perceived difference between "open" and "open-source". Letting anyone query an HHH (helpful, honest, and harmless) AGI is quite different from letting anyone modify and re-deploy it.
Not that I see it that way myself. In my view, the risk that AGI is uncontrollable is too high, and we should pursue an "aligned from boot" strategy like the one I describe in How I'd like alignment to get done.