I think so. By symmetry, imperfect anti-alignment will destroy almost all the disvalue the same way imperfect alignment will destroy almost all the value. Thus, the overwhelming majority of alignment problems are solved by default with regard to hyperexistential risks.
More intuitively, problems become much easier when there isn’t a powerful optimization process to push against. For example, computer security is hard because there are intelligent agents out there trying to break your system, not because cosmic rays will randomly flip some bits in your memory.