Building a stably aligned agent also doesn’t prevent new misaligned agents from getting built and winning, which is even worse than not having the alignment stability problem solved.
Agreed, if you’re in a world where people will stop making AGI until they have the alignment problem solved. But we don’t seem to be in that world, although we should see if we can get there. I agree that having stably aligned AGI agents doesn’t prevent other misaligned AGIs — unless those aligned agents are powerful enough, and specifically used, to prevent misaligned agents from being created or from being effective in causing mayhem. That seems to be the scenario we’re heading for.