The biggest extinction risk from AI comes from instrumental convergence toward resource acquisition, in which an AI not aligned with human values uses the atoms in our bodies for whatever goals it has. One grim upside of this failure mode is that such an AI would have no reason to bother imposing suffering on us.
Unfortunately, this means that making progress on the instrumental convergence problem increases S-risks (risks of astronomical suffering). We could get hell if we solve instrumental convergence but not, say, mesa-optimization: we would end up with a powerful AGI that cares about our fate, yet does something to us we consider worse than death.